Bug #1160
fmd assertion fails on boot on Dell C6145
Description
On fresh boot on a Dell C6145 (in this case, a 4x12-core Opteron box), fmd dies on boot with "Assertion failed: comma != NULL, file ../../common/pcibus/did_props.c, line 401, function dev_for_hostbridge"
I suspect that http://opensolaris.org/jive/thread.jspa?threadID=117229 is the same bug - since the C6145 has an external PCIe port for connecting to e.g. a Dell C410x external PCIe chassis.
This is on oi_148. I will attach any dumps residing on the machine, as well as prtconf and similar output, as soon as I bring up networking on the machine.
Updated by Rich Lowe over 9 years ago
In the linked thread, Mike says his laptop went from working to not working somewhere between onnv_123 and onnv_126. Within that range falls:
changeset: 3986:8f772ca46fa3
user: myers
date: Fri Apr 06 15:57:19 2007 -0700
description: 6472670 Internal pci bus numbering should be independent of bios enumeration order (for non-x8400 systems)
This changed PCI enumeration and made it persist. Some of the code explicitly states that it will now discover childless root bridges which would previously have been missed. Seems suspiciously fitting.
I don't know much about this area, but I would suspect that the right thing to be doing is making fmd more robust.
Updated by Rich Ercolani over 9 years ago
- File ls_devices.gz added
- File prtconf.gz added
- File prtconf_p.gz added
Core dumps, as well as copies of the attached outputs, can be found at:
http://skysrv.pha.jhu.edu/~rercola/bitbucket2/
Updated by Rich Lowe over 9 years ago
- Project changed from OpenIndiana Distribution to illumos gate
Updated by Rich Lowe over 9 years ago
As the fm-discuss thread suggests, it looks like we assume any node which pci_hostbridges_find discovers also has a driver attached (by way of DEVprop_set, which pulls its devfs path and lets dev_for_hostbridge attempt to adjust it).
It looks like the path we get back from di_devfs_path is "/pci", because we have no parent nodes and di_bus_addr() is NULL, which does, I think, point toward no driver being attached as the linked thread states.
I think this means that fmd needs to be more resilient, as above, rather than enumeration being incorrect.
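A minimal sketch of what "more resilient" could look like, assuming the failing assertion comes from looking for a comma in the last component of the devfs path. This is not the actual did_props.c code: hostbridge_path_prefix is a hypothetical name, and it uses plain malloc rather than whatever allocator fmd uses. The point is only that a path like "/pci" (no unit address, so no comma) is passed through instead of aborting the daemon.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not the real fmd function: return a newly
 * allocated copy of `path` truncated at the first comma found in its
 * last component. A path with no comma there (e.g. "/pci") is copied
 * unchanged instead of tripping an assertion. Returns NULL if `path`
 * is NULL, contains no '/', or if allocation fails.
 */
static char *
hostbridge_path_prefix(const char *path)
{
	const char *lastslash, *comma;
	size_t len;
	char *out;

	if (path == NULL || (lastslash = strrchr(path, '/')) == NULL)
		return (NULL);

	/* Only look for the comma after the last '/'. */
	comma = strchr(lastslash, ',');
	len = (comma != NULL) ? (size_t)(comma - path) : strlen(path);

	if ((out = malloc(len + 1)) == NULL)
		return (NULL);
	memcpy(out, path, len);
	out[len] = '\0';
	return (out);
}
```

With this shape, the caller can decide what to do with a hostbridge that has no unit address (skip it, or log and continue) rather than having fmd die on boot.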
Updated by Albert Lee almost 9 years ago
Seeing this on my ThinkPad T410 now with the wireless card removed.
Updated by Rick Sayre over 5 years ago
I've been seeing this on one specific hardware configuration since bringup.
Currently on omnios-d08e0e5 [OmniOS r151014], running on an HP Compaq 6005 Pro.
Any progress on this bug?