Bug #14603
closedbhyve passthrough mishandles combined CMD/STATUS reads
100%
Description
An OmniOS user reported that booting a bhyve guest with an XHCI controller passed-through to it failed to boot with the updated uefi-edk2-stable-202111 bootrom, printing errors and dropping into a loop:
XhcReadCapReg: Pci Io read error - Invalid Parameter at 0 XhcReadCapReg: Pci Io read error - Invalid Parameter at 4 XhcReadCapReg: Pci Io read error - Invalid Parameter at 8 XhcReadCapReg: Pci Io read error - Invalid Parameter at 16 XhcReadCapReg: Pci Io read error - Invalid Parameter at 20 XhcReadCapReg: Pci Io read error - Invalid Parameter at 24 XhcReadOpReg: Pci Io Read error - Unsupported at 8 XhcReadExtCapReg: Pci Io Read error - Unsupported at 0 XhcReadExtCapReg: Pci Io Read error - Unsupported at 1020 XhcReadExtCapReg: Pci Io Read error - Unsupported at 2040 XhcReadExtCapReg: Pci Io Read error - Unsupported at 3060 XhcReadExtCapReg: Pci Io Read error - Unsupported at 4080 ...
The user bisected the bootrom between the working stable-202102 version and this one and found that the problem was introduced with:
commit 1e0c441c922d92dae736f1b47df05373490045ae Date: Mon Jul 5 13:08:41 2021 +0200 OvmfPkg/Bhyve: add USB support
So this is not necessarily a bug in the bootrom, just something exposed by adding the USB drivers to the UEFI environment.
Adding some debug lines and watching the ROM's DEBUG output shows this:
Bhyve's pci_emul says: Assigned BAR[1] Mem32 c0000000 + 2000 Assigned BAR[0] IO 2000 + 20 Assigned BAR[0] Mem64 c0010000 + 10000 <-- XHCI device pci.0.31.7: IO=2000 - 2040 MEM32=c0000000 - c0200000 MEM64=800000000 - 820000000 (bhyve deliberately adds slop to the end to give the guest flexibility) Bootrom says: RootBridge: PciRoot(0x0) Support/Attr: 10003 / 10003 DmaAbove4G: No NoExtConfSpace: Yes AllocAttr: 0 () Bus: 0 - 0 Translation=0 Io: 2000 - 201F Translation=0 Mem: C0000000 - C0001FFF Translation=0 MemAbove4G: FFFFFFFFFFFFFFFF - 0 Translation=0 PMem: FFFFFFFFFFFFFFFF - 0 Translation=0 PMemAbove4G: FFFFFFFFFFFFFFFF - 0 Translation=0 PciBus: Discovered PCI @ [00|00|00] PciBus: Discovered PCI @ [00|01|00] PciBus: Discovered PCI @ [00|07|00] BAR[0]: Type = Io16; Alignment = 0x1F; Length = 0x20; Offset = 0x10 Baseaddress = 0x2000 BAR[1]: Type = Mem32; Alignment = 0x1FFF; Length = 0x2000; Offset = 0x14 Baseaddress = 0xC0000000 PciBus: Discovered PCI @ [00|09|00] BAR[0]: Type = Mem64; Alignment = 0xFFFF; Length = 0x10000; Offset = 0x10; Baseaddress = 0xC0010000
The fundamental problem here is that the memory range assigned to the PCI Root Bridge does not encompass the memory assigned to the XHCI device
causing accesses to the XHCI BAR to fail boundary checks.
The memory range is constructed, in the absence of generated and loaded ACPI tables, by scanning the BARs of the enumerated devices.
Adding some more debug statements there:
ParseBars(0/0/0) - command 4 ParseBars(0/1/0) - command 4 ParseBars(0/7/0) - command 7 IO -- 2000 - 201F MEM -- C0010000 - C0011FFF ParseBars(0/9/0) - command 290
That's an odd command register value...
# /usr/lib/pci/pcieadm show-cfgspace -d ppt1 header0.command header0.status Device ppt1 -- Type 0 Header Command: 0x6 Status: 0x290
it just happens to match the status register.
When a guest does a combined read on the command and status register for a passthrough device, the returned value is in the wrong order with the command and status registers swapped.
This was fixed upstream in FreeBSD as part of https://github.com/freebsd/freebsd-src/commit/2136849868a0875eade90a4ce8bfd5df054a74bb
but the specific fix was missed during the May 2020 bhyve sync. I've checked and the rest of the commit was pulled into illumos.
Updated by Andy Fiddaman 3 months ago
This has been in OmniOS bloody since March and tested with igb network cards and xhci devices passed through to guests and booted with a UEFI bootrom that has xhci drivers. The original bug reporter has also tested this and confirmed that the problem is resolved.
Updated by Electric Monk 3 months ago
- Status changed from In Progress to Closed
- % Done changed from 0 to 100
git commit b518543be8042a5a0dda9b983f71c4a99d74ad99
commit b518543be8042a5a0dda9b983f71c4a99d74ad99 Author: Andy Fiddaman <omnios@citrus-it.co.uk> Date: 2022-05-12T21:22:00.000Z 14603 bhyve passthrough mishandles combined CMD/STATUS reads Reviewed by: Jorge Schrauwen <registration@blackdot.be> Reviewed by: Jason King <jason.brian.king+illumos@gmail.com> Reviewed by: Toomas Soome <tsoome@me.com> Approved by: Dan McDonald <danmcd@mnx.io>