Bug #12015
vioif with MSI-X not working on Google Compute Engine
Status: Closed
% Done: 100%
Description
In an OpenIndiana guest running under Google Compute Engine (GCE), it would appear that no packets are received on the vioif interface in the guest. Adding a periodic function that forces a poll of the receive queue every second allows packets to begin flowing, but with the obvious challenge of up to a second of latency on each. Merely getting the flow of packets started by polling does not appear to then lead to RX interrupts showing up later.

After forcing the vioif driver to use fixed interrupts instead of MSI-X, the guest then works as expected. In the same guest, a prototype vioscsi driver which uses MSI-X interrupts does function correctly. It's not yet clear exactly what's going on, but without visibility into the bespoke hypervisor it's hard to know what to investigate next.

In the short term, we can use the SMBIOS data to detect that we're running under GCE and force fixed interrupts for vioif.
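A minimal sketch of how such a check might be structured, written as ordinary userland C so that it can be compiled and exercised outside the kernel. The function name select_interrupt_types, the INTR_* macros, and the SMBIOS manufacturer and product strings are assumptions made for illustration and are not taken from the committed change. The only behaviour drawn from this issue is that an operator-set vioif_allowed_int_types takes precedence, that -1 is the untuned default, and that an untuned guest on GCE ends up on fixed interrupts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same numeric values as DDI_INTR_TYPE_* in illumos <sys/ddi_intr.h>. */
#define	INTR_TYPE_FIXED	0x1
#define	INTR_TYPE_MSI	0x2
#define	INTR_TYPE_MSIX	0x4

/* Assumption: -1 marks a tunable the operator has not touched. */
#define	INTR_TUNABLE_UNSET	(-1)

/*
 * Hypothetical userland model of the interrupt type selection.
 * Returns a mask of permitted interrupt types; 0 means "no restriction".
 * manufacturer and product stand in for the SMBIOS system strings and
 * may be NULL when no SMBIOS data is available.
 */
int
select_interrupt_types(int tunable, const char *manufacturer,
    const char *product)
{
	/* Only -1 or a non-negative mask is meaningful in this sketch. */
	assert(tunable >= INTR_TUNABLE_UNSET);

	/* An explicit operator setting always wins, even on GCE. */
	if (tunable != INTR_TUNABLE_UNSET)
		return (tunable);

	/* Without SMBIOS data we cannot identify the platform. */
	if (manufacturer == NULL || product == NULL)
		return (0);

	/* Assumed identification strings; check them on a real guest. */
	if (strcmp(manufacturer, "Google") == 0 &&
	    strcmp(product, "Google Compute Engine") == 0)
		return (INTR_TYPE_FIXED);

	return (0);
}
```

With this model, select_interrupt_types(-1, "Google", "Google Compute Engine") yields the fixed-interrupt mask, while passing a tunable of 0 on the same platform yields 0, leaving the driver free to choose MSI-X again.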
Related issues
Updated by Joshua M. Clulow about 4 years ago
Testing Notes
Regression Checks
In an OI guest running under SmartOS, I updated to bits with this change and checked for MSI-X interrupts:
root@oi0:~# nm /system/object/vioif/object | grep vioif_select_interrupt_types
[43] |18446744073574316928| 153|FUNC |LOCL |0 |ABS |vioif_select_interrupt_types
root@oi0:~# mdb -ke ::interrupts | grep vioif
25 0x60 6 PCI Edg MSI-X 0 1 - vioif_rx_handler
26 0x61 6 PCI Edg MSI-X 1 1 - vioif_tx_handler
root@oi0:~# mdb -ke vioif_allowed_int_types/D
vioif_allowed_int_types:
vioif_allowed_int_types: -1
I then added this entry to /etc/system and rebooted:
set vioif:vioif_allowed_int_types = 0x1
Confirmed that the handlers are now called through the shared fixed interrupt:
root@oi0:~# mdb -ke ::interrupts | grep vioif
root@oi0:~# dtrace -q -n 'vioif_tx_handler:entry,vioif_rx_handler:entry { @[probefunc,stack()] = count(); }' -c 'sleep 5'

  vioif_rx_handler
              virtio`virtio_shared_isr+0xa8
              unix`av_dispatch_autovect+0x83
              unix`dispatch_hardint+0x36
              unix`switch_sp_and_call+0x15
                2
  vioif_tx_handler
              virtio`virtio_shared_isr+0xa8
              unix`av_dispatch_autovect+0x83
              unix`dispatch_hardint+0x36
              unix`switch_sp_and_call+0x15
                3
And that this does not affect vioblk:
root@oi0:~# mdb -ke ::interrupts | grep viob
24 0x40 5 PCI Edg MSI-X 2 1 - vioblk_int_handler
25 0x43 5 PCI Edg MSI-X 1 1 - vioblk_int_handler
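The values given to vioif_allowed_int_types in these notes read as a bitmask of interrupt types: 0x1 matches DDI_INTR_TYPE_FIXED in illumos <sys/ddi_intr.h>, with 0x2 and 0x4 being MSI and MSI-X. The helper below is a hypothetical userland sketch for decoding such a value. The name describe_allowed_int_types and the labels "default" and "any" are assumptions for illustration, based on the observation in this issue that -1 is the untuned value and that 0x0 lets the driver pick MSI-X again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Same numeric values as DDI_INTR_TYPE_* in illumos <sys/ddi_intr.h>. */
#define	INTR_TYPE_FIXED	0x1
#define	INTR_TYPE_MSI	0x2
#define	INTR_TYPE_MSIX	0x4

/*
 * Hypothetical helper: render an allowed-interrupt-types value as text.
 * Writes into buf (always NUL-terminated) and returns buf.
 */
const char *
describe_allowed_int_types(int mask, char *buf, size_t len)
{
	static const struct {
		int bit;
		const char *name;
	} types[] = {
		{ INTR_TYPE_FIXED, "fixed" },
		{ INTR_TYPE_MSI, "msi" },
		{ INTR_TYPE_MSIX, "msix" },
	};
	size_t i, off = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	if (mask == -1) {
		/* Untuned: the driver applies its own platform policy. */
		(void) snprintf(buf, len, "default");
		return (buf);
	}
	if (mask == 0) {
		/* Assumed meaning: no restriction on interrupt type. */
		(void) snprintf(buf, len, "any");
		return (buf);
	}

	/* Append the name of each known bit, separated by '|'. */
	for (i = 0; i < sizeof (types) / sizeof (types[0]); i++) {
		if ((mask & types[i].bit) == 0)
			continue;
		(void) snprintf(buf + off, len - off, "%s%s",
		    off > 0 ? "|" : "", types[i].name);
		off = strlen(buf);
	}
	if (off == 0)
		(void) snprintf(buf, len, "unknown");
	return (buf);
}
```

Called with 0x1 and a 64-byte buffer, the helper produces "fixed", which is the setting used in the regression check above.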
Checks under GCE
In an OpenIndiana guest running under GCE, I updated to these bits and did the following checks:
root@gce:~# mdb -ke ::interrupts | grep vioif
root@gce:~# mdb -ke ::interrupts | grep virt
0/0x27 11 6 PCI Lvl Fixed 1 0x0/0xb virtio_shared_isr
root@gce:~# mdb -ke vioif_allowed_int_types/D
vioif_allowed_int_types:
vioif_allowed_int_types: -1
root@gce:~# curl -H 'Metadata-Flavor:Google' http://metadata.google.internal/computeMetadata/v1/instance/network-interfaces/0/access-configs/0/external-ip ; echo
35.212.181.92

# while pinging from outside:
root@gce:~# dtrace -q -n 'vioif_tx_handler:entry,vioif_rx_handler:entry { @[probefunc,stack()] = count(); }' -c 'sleep 5'

  vioif_rx_handler
              virtio`virtio_shared_isr+0xa8
              apix`apix_dispatch_pending_autovect+0xef
              apix`apix_dispatch_pending_hardint+0x34
              unix`switch_sp_and_call+0x15
                5
  vioif_tx_handler
              virtio`virtio_shared_isr+0xa8
              apix`apix_dispatch_pending_autovect+0xef
              apix`apix_dispatch_pending_hardint+0x34
              unix`switch_sp_and_call+0x15
                5
Ping latency from outside demonstrates that interrupts are prompt:
64 bytes from 92.181.212.35.bc.googleusercontent.com (35.212.181.92): icmp_seq=107. time=38.939 ms
64 bytes from 92.181.212.35.bc.googleusercontent.com (35.212.181.92): icmp_seq=108. time=38.602 ms
64 bytes from 92.181.212.35.bc.googleusercontent.com (35.212.181.92): icmp_seq=109. time=40.289 ms
64 bytes from 92.181.212.35.bc.googleusercontent.com (35.212.181.92): icmp_seq=110. time=36.968 ms
64 bytes from 92.181.212.35.bc.googleusercontent.com (35.212.181.92): icmp_seq=111. time=38.634 ms
64 bytes from 92.181.212.35.bc.googleusercontent.com (35.212.181.92): icmp_seq=112. time=39.316 ms
Checking to make sure we can override the GCE workaround:
root@gce:~# ed /etc/system
3161
$

$-10,$n
100     * Examples:
101     *
102     * To set variables in 'unix':
103     *
104     *       set nautopush=32
105     *       set maxusers=40
106     *
107     * To set a variable named 'debug' in the module named 'test_module'
108     *
109     *       set test_module:debug = 0x13
110
110a
set vioif:vioif_allowed_int_types = 0x0
.
$-4,$n
107     * To set a variable named 'debug' in the module named 'test_module'
108     *
109     *       set test_module:debug = 0x13
110
111     set vioif:vioif_allowed_int_types = 0x0
w
3201
q
root@gce:~# reboot
...
After reboot, the interrupt handlers were once again MSI-X and networking no longer worked:
root@gce:~# mdb -ke ::interrupts | grep vioif
0/0x27 - 6 PCI Edg MSI-X 1 - vioif_rx_handler
0/0x28 - 6 PCI Edg MSI-X 1 - vioif_tx_handler
Updated by Electric Monk about 4 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit aefa9c84b0a900bedba3a7ed885f0ea75f3fe1ea
commit aefa9c84b0a900bedba3a7ed885f0ea75f3fe1ea
Author: Joshua M. Clulow <josh@sysmgr.org>
Date: 2019-11-24T17:14:54.000Z

    12015 vioif with MSI-X not working on Google Compute Engine
    Reviewed by: Gordon Ross <Gordon.W.Ross@gmail.com>
    Reviewed by: Toomas Soome <tsoome@me.com>
    Reviewed by: Rick McNeal <rick.mcneal@nexenta.com>
    Approved by: Dan McDonald <danmcd@joyent.com>
Updated by Yuka Poppe about 4 years ago
I do not currently have any data to back this up, nor have I been in the code for the last 8 months. However, while I was working on new vioscsi code with trisk's code as a reference, I noticed that interrupts would fire for one virtq, but the actual data would end up in one of the other virtqs.
Unfortunately I've not had the time to investigate whether these suspicions were correct, nor do I recall more exact details. I'm unfortunately lousy at documenting problems.
I believe I was in the process of tracking it to the interrupt handler registration code. I also believe I was unsure whether the problem was with the illumos code or with the Linux side on the hypervisor. The problem showed itself, at the very least, when one had both config handlers and multiple receive virtqs.
Update: now that I think some more on it, I had strong suspicions it had to do with handler registration ordering, or the number of descriptor tables, or even/odd numbering. It might be that GCP uses a different number of descriptor tables/queues for the network driver (where other clouds run with the defaults), which hits the same bug as the vioscsi implementation (the specification requires at least three virtqs).
Kind regards,
Yuka Poppe
Updated by Joshua M. Clulow over 2 years ago
- Related to Bug #14012: vioif simply cannot without SMBIOS added