Bug #11861

Updated by Joshua M. Clulow 8 months ago

Robert Mustacchi reached out to me about an issue seen on DilOS, where the hostbridge topo enumerator module dumped core.

The underlying issue was that the system was enumerating empty PCI buses, which broke assumptions in the hostbridge topo module.

Below is Paul Winder's explanation for why we're enumerating empty buses:

This looks like a problem we came across when testing on different platforms.

FWIR, when enumerating PCI buses, the system tries to match PCI root buses to the ACPI device using the bus number from the _BBN method. After it has done that, it scans all the _BBN values, and any which do not match an already discovered bus are declared as empty buses. On some BIOSes the _BBN values do *not* match the real PCI bus, and you can end up with ghost/empty PCI buses, which the topology libraries don’t like.

We solved this by using the ACPI _CRS (Current Resource Settings) method to get the correct PCI bus number.

E.g., a system with ghost buses:

$ ls /devices/
agpgart pci@31,0 pci@5c,0 pci@91,0
agpgart:agpgart pci@31,0:devctl pci@5c,0:devctl pci@91,0:devctl
fw pci@31,0:intr pci@5c,0:intr pci@91,0:intr
options pci@31,0:reg pci@5c,0:reg pci@91,0:reg
pci@0,0 pci@3a,0 pci@61,0 pci@98,0
pci@0,0:devctl pci@3a,0:devctl pci@61,0:devctl pci@98,0:devctl
pci@0,0:intr pci@3a,0:intr pci@61,0:intr pci@98,0:intr
pci@0,0:reg pci@3a,0:reg pci@61,0:reg pci@98,0:reg
pci@14,0 pci@3c,0 pci@70,0 pseudo
pci@14,0:devctl pci@3c,0:devctl pci@70,0:devctl pseudo:devctl
pci@14,0:intr pci@3c,0:intr pci@70,0:intr scsi_vhci
pci@14,0:reg pci@3c,0:reg pci@70,0:reg scsi_vhci:devctl

And the same system when using the _CRS method:

# ls /devices/
agpgart pci@14,0:intr pci@3c,0 pci@61,0:intr
agpgart:agpgart pci@14,0:reg pci@3c,0:devctl pci@61,0:reg
fw pci@31,0 pci@3c,0:intr pci@70,0
options pci@31,0:devctl pci@3c,0:reg pci@70,0:devctl
pci@0,0 pci@31,0:intr pci@5c,0 pci@70,0:intr
pci@0,0:devctl pci@31,0:reg pci@5c,0:devctl pci@70,0:reg
pci@0,0:intr pci@3a,0 pci@5c,0:intr pseudo
pci@0,0:reg pci@3a,0:devctl pci@5c,0:reg pseudo:devctl
pci@14,0 pci@3a,0:intr pci@61,0 scsi_vhci
pci@14,0:devctl pci@3a,0:reg pci@61,0:devctl scsi_vhci:devctl

The code has been tested on a few platforms which *don’t* present the problem, and on the platform we found the problem.
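
To make the mechanism in Paul's explanation concrete, here is a minimal sketch of how ghost buses can arise. It is not the illumos PCI enumeration code: the function names are hypothetical, and it assumes that any bus number reported by _BBN but not found by probing is created as an empty root bus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if `bus` is among the buses found by probing. */
bool
bus_was_probed(const unsigned *probed, size_t nprobed, unsigned bus)
{
	for (size_t i = 0; i < nprobed; i++) {
		if (probed[i] == bus)
			return (true);
	}
	return (false);
}

/*
 * Collect every ACPI-reported bus number that probing did not discover.
 * Under the assumption above, these become empty ("ghost") root buses.
 * Returns the number of entries written to `ghosts` (at most `nacpi`).
 */
size_t
find_ghost_buses(const unsigned *probed, size_t nprobed,
    const unsigned *acpi_bbn, size_t nacpi, unsigned *ghosts)
{
	size_t nghosts = 0;

	assert(probed != NULL && acpi_bbn != NULL && ghosts != NULL);
	for (size_t i = 0; i < nacpi; i++) {
		if (!bus_was_probed(probed, nprobed, acpi_bbn[i]))
			ghosts[nghosts++] = acpi_bbn[i];
	}
	return (nghosts);
}
```

If the probed buses are the eight that survive in the second listing (0x0, 0x14, 0x31, 0x3a, 0x3c, 0x5c, 0x61, 0x70) and a BIOS additionally reported 0x91 and 0x98 via _BBN (illustrative values chosen to mirror the first listing, not read from the affected firmware), the sketch would flag 0x91 and 0x98 as ghosts.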

Integrating Paul's fix for the above is tracked by #11860.

Regardless of the underlying issue described above, the hostbridge topo module should be hardened such that it doesn't fall over in this situation. This is what we are tracking with this issue.
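
As a rough illustration of the kind of hardening meant here, the sketch below skips a root bus that has no devices instead of assuming every bus has at least one child. It is not the hostbridge module's code: `root_bus_t` and `enum_hostbridges()` are hypothetical stand-ins, and the exact assumption that failed in the real module is not stated in this issue.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a PCI root bus node. */
typedef struct root_bus {
	unsigned rb_bus;		/* bus number */
	const unsigned *rb_devs;	/* device numbers; NULL if none */
	size_t rb_ndevs;		/* number of entries in rb_devs */
} root_bus_t;

/*
 * Count the root buses worth enumerating, skipping empty ones rather than
 * assuming each bus has a child. Returns the number of non-empty buses and
 * stores the number of skipped (empty) buses in *skipped.
 */
size_t
enum_hostbridges(const root_bus_t *buses, size_t nbuses, size_t *skipped)
{
	size_t nenum = 0;

	assert(skipped != NULL);
	*skipped = 0;
	for (size_t i = 0; i < nbuses; i++) {
		if (buses[i].rb_devs == NULL || buses[i].rb_ndevs == 0) {
			(*skipped)++;	/* empty bus: nothing to enumerate */
			continue;
		}
		nenum++;	/* a real module would create topo nodes here */
	}
	return (nenum);
}
```

The point of the design is that an empty bus is treated as a normal, ignorable condition rather than an error, so enumeration of the remaining buses continues.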