Bug #1160

fmd assertion fails on boot on Dell C6145

Added by Rich Ercolani about 9 years ago. Updated about 5 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Start date:
2011-06-27
Due date:
% Done:
0%

Estimated time:
Difficulty:
Medium
Tags:
needs-triage
Gerrit CR:

Description

On a fresh boot of a Dell C6145 (in this case, a 4x12-core Opteron box), fmd dies with "Assertion failed: comma != NULL, file ../../common/pcibus/did_props.c, line 401, function dev_for_hostbridge".

I suspect that http://opensolaris.org/jive/thread.jspa?threadID=117229 is the same bug, since the C6145 has an external PCIe port for connecting to e.g. a Dell C410x external PCIe chassis.

This is on oi_148. I will attach any dumps residing on the machine, as well as prtconf and similar output, as soon as I bring up networking on it.


Files

ls_devices.gz (4.06 KB): ls -lR /devices output. Rich Ercolani, 2011-06-27 09:45 PM
prtconf.gz (538 Bytes): prtconf output. Rich Ercolani, 2011-06-27 09:45 PM
prtconf_p.gz (351 Bytes): prtconf -p output. Rich Ercolani, 2011-06-27 09:45 PM

History

#1

Updated by Rich Lowe about 9 years ago

In the thread mentioned above, Mike says his laptop went from working to not working somewhere between onnv_123 and onnv_126. Within that range falls:

changeset:   3986:8f772ca46fa3
user:        myers
date:        Fri Apr 06 15:57:19 2007 -0700
description:
        6472670 Internal pci bus numbering should be independent of bios enumeration order (for non-x8400 systems)

This changed PCI enumeration and made it persist. Some of the code explicitly states that it will now discover childless root bridges which would previously have been missed. Seems suspiciously fitting.

I don't know much about this area, but I would suspect that the right thing to be doing is making fmd more robust.

#2

Updated by Rich Ercolani about 9 years ago

Core dumps, as well as copies of the attached outputs, can be found at:
http://skysrv.pha.jhu.edu/~rercola/bitbucket2/

#3

Updated by Rich Lowe about 9 years ago

  • Project changed from OpenIndiana Distribution to illumos gate

#4

Updated by Rich Lowe about 9 years ago

As the fm-discuss thread suggests, it looks like we assume any node which
pci_hostbridges_find discovers also has a driver attached
(by way of DEVprop_set, which pulls its devfs path and lets dev_for_hostbridge attempt to adjust it).

It looks like the path we get back from di_devfs_path is "/pci", because we have no parent nodes and di_bus_addr() is NULL, which does, I think, point toward no driver being attached as the linked thread states.

I think this means that fmd needs to be more resilient, as above, rather than enumeration being incorrect.
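
To illustrate what "more resilient" could look like, here is a minimal sketch, not the actual code in did_props.c. It assumes the failing step is a search for a comma in the unit address of the last devfs path component, and it returns an error instead of asserting when that comma is absent. The helper name parse_unit_address and its interface are hypothetical, made up for this example.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from did_props.c): pull the
 * "device,function" unit address out of the last component of a
 * devfs path such as "/pci@0,0".
 *
 * Returns 0 on success and fills in *dev and *fn.  Returns -1 when
 * the path carries no usable unit address, e.g. the bare "/pci"
 * reported for a node with no driver attached, so the caller can
 * skip the node rather than abort the whole daemon.
 */
static int
parse_unit_address(const char *path, int *dev, int *fn)
{
	const char *last, *at, *comma;
	char *end;
	long d, f;

	if (path == NULL || dev == NULL || fn == NULL)
		return (-1);

	/* Only the final path component is of interest. */
	last = strrchr(path, '/');
	last = (last == NULL) ? path : last + 1;

	at = strchr(last, '@');
	if (at == NULL)
		return (-1);		/* no unit address at all */

	comma = strchr(at + 1, ',');
	if (comma == NULL)
		return (-1);		/* the case that trips the assertion */

	/* Unit addresses are hexadecimal. */
	d = strtol(at + 1, &end, 16);
	if (end == at + 1 || end != comma || d < 0)
		return (-1);

	f = strtol(comma + 1, &end, 16);
	if (end == comma + 1 || *end != '\0' || f < 0)
		return (-1);

	*dev = (int)d;
	*fn = (int)f;
	return (0);
}
```

With this shape, a caller given "/pci" gets -1 back and can log and skip that hostbridge, while a path like "/pci@1f,2" yields dev 0x1f and fn 2. Whether skipping the node is the right policy for the topology enumerator is a separate question that this sketch does not answer.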

#5

Updated by Albert Lee over 8 years ago

Seeing this on my ThinkPad T410 now with the wireless card removed.

#6

Updated by Rick Sayre about 5 years ago

I've been seeing this on one specific hardware configuration since bringup.
Currently on omnios-d08e0e5 [OmniOS r151014]

This is running on an HP Compaq 6005 Pro

Any progress on this bug?
