Bug #8977
ipmi enumerator doesn't always enumerate nested entities
Status: Closed, 100% done
Description
The ipmi topo enumerator module assumes that if a PSU or FAN entity is nested, it will be nested under either a POWER_DOMAIN or COOLING_DOMAIN entity, which does seem logical.
Unfortunately, we've seen at least one case (Dell R730) where the PSU entities are nested under a MOTHERBOARD entity (go figure). I also recall seeing something similar on an Oracle server platform years ago (in that case the PSU entities were nested under the SYSTEM_CHASSIS entity).
So clearly, the ipmi enumerator is making assumptions that don't always hold in practice. As a result, it can fail to enumerate the FAN and/or PSU topologies in the topo snapshot. This CR makes the logic more flexible so that it always looks for nested PSU/FAN entities, regardless of the entity type of the parent entity.
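To illustrate the difference, here is a minimal, self-contained sketch. It is not the actual enumerator code and does not use libipmi: the entity_t type, the ET_* names and both functions are hypothetical, and only the entity ID values are taken from the IPMI entity ID table. count_strict() models the old assumption (a nested PSU is only found under a POWER_DOMAIN parent), while count_flexible() descends into every parent regardless of its type.

```c
#include <assert.h>
#include <stddef.h>

/* Entity ID values per the IPMI spec; the names here are hypothetical. */
enum {
	ET_MOTHERBOARD = 0x07,
	ET_POWER_SUPPLY = 0x0a,
	ET_POWER_DOMAIN = 0x13,
	ET_SYSTEM_CHASSIS = 0x17
};

/* Hypothetical, simplified stand-in for an IPMI entity and its children. */
typedef struct entity {
	int e_type;
	const struct entity *e_children;
	size_t e_nchildren;
} entity_t;

/*
 * Model of the old assumption: count entities of type `want' that are
 * either top-level or nested directly under a parent of type `domain'.
 */
static size_t
count_strict(const entity_t *ents, size_t n, int want, int domain)
{
	size_t i, j, found = 0;

	for (i = 0; i < n; i++) {
		if (ents[i].e_type == want) {
			found++;
		} else if (ents[i].e_type == domain) {
			for (j = 0; j < ents[i].e_nchildren; j++) {
				if (ents[i].e_children[j].e_type == want)
					found++;
			}
		}
	}
	return (found);
}

/*
 * Model of the more flexible logic: count entities of type `want' at
 * any nesting level, whatever the type of the parent entity is.
 */
static size_t
count_flexible(const entity_t *ents, size_t n, int want)
{
	size_t i, found = 0;

	assert(ents != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (ents[i].e_type == want)
			found++;
		found += count_flexible(ents[i].e_children,
		    ents[i].e_nchildren, want);
	}
	return (found);
}
```

With two PSU entities nested under a MOTHERBOARD entity (the R730 case described above), the strict version finds none of them, while the flexible version finds both.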
Updated by Rob Johnston over 4 years ago
Testing
I installed a platform image that included the changes for this CR on a Dell R730 (aka J3302) in the SF lab and verified that with the changes, the power supplies (PSUs) were now being enumerated in the topo snapshot:
[root@volcano ~]# /usr/lib/fm/fmd/fmtopo "*psu=*"
TIME                 UUID
Dec 12 19:43:37 d74800e5-6b32-445c-e150-e6e356ef4a76

hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VDG:part=05RHVVA00/chassis=0/psu=0
hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VDG:part=05RHVVA00/chassis=0/psu=0?sensor=Presence
hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VDG:part=05RHVVA00/chassis=0/psu=0?sensor=Current 1
hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VDG:part=05RHVVA00/chassis=0/psu=0?sensor=Voltage 1
hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VDG:part=05RHVVA00/chassis=0/psu=0?sensor=Status
hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VCL:part=05RHVVA00/chassis=0/psu=1
hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VCL:part=05RHVVA00/chassis=0/psu=1?sensor=Presence
hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VCL:part=05RHVVA00/chassis=0/psu=1?sensor=Current 2
hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VCL:part=05RHVVA00/chassis=0/psu=1?sensor=Voltage 2
hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VCL:part=05RHVVA00/chassis=0/psu=1?sensor=Status
[root@volcano ~]# /usr/lib/fm/fmd/fmtopo -V "*psu=0*"
TIME                 UUID
Dec 12 19:44:35 3a171cdd-a3c6-edd1-d3c2-c16ceb515838

hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VDG:part=05RHVVA00/chassis=0/psu=0
  group: protocol                       version: 1   stability: Private/Private
    resource          fmri      hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VDG:part=05RHVVA00/chassis=0/psu=0
    label             string    PSU 0
    FRU               fmri      hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VDG:part=05RHVVA00/chassis=0/psu=0
  group: authority                      version: 1   stability: Private/Private
    product-id        string    Joyent-Compute-Platform-3302
    chassis-id        string    FHL5TD2
    server-id         string    volcano
  group: ipmi                           version: 1   stability: Private/Private
    entity-id         uint32    0xa
    entity-instance   uint32    0x1

hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VDG:part=05RHVVA00/chassis=0/psu=0?sensor=Presence
  group: protocol                       version: 1   stability: Private/Private
    resource          fmri      hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VDG:part=05RHVVA00/chassis=0/psu=0?sensor=Presence
  group: authority                      version: 1   stability: Private/Private
    product-id        string    Joyent-Compute-Platform-3302
    chassis-id        string    FHL5TD2
    server-id         string    volcano
  group: facility                       version: 1   stability: Private/Private
    entity_ref        string[]  [ "Presence" ]
    sensor-class      string    discrete
    type              uint32    0x25 (PRESENCE)
    state             uint32    0x8002 (ABSENT)

hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VDG:part=05RHVVA00/chassis=0/psu=0?sensor=Current 1
  group: protocol                       version: 1   stability: Private/Private
    resource          fmri      hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VDG:part=05RHVVA00/chassis=0/psu=0?sensor=Current 1
  group: authority                      version: 1   stability: Private/Private
    product-id        string    Joyent-Compute-Platform-3302
    chassis-id        string    FHL5TD2
    server-id         string    volcano
  group: facility                       version: 1   stability: Private/Private
    entity_ref        string[]  [ "Current 1" ]
    sensor-class      string    threshold
    type              uint32    0x101 (THRESHOLD_STATE)
    state             uint32    0xc0 (0xc0)
    reading           double    2.000000
    units             uint32    0x5 (AMPS)

hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VDG:part=05RHVVA00/chassis=0/psu=0?sensor=Voltage 1
  group: protocol                       version: 1   stability: Private/Private
    resource          fmri      hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VDG:part=05RHVVA00/chassis=0/psu=0?sensor=Voltage 1
  group: authority                      version: 1   stability: Private/Private
    product-id        string    Joyent-Compute-Platform-3302
    chassis-id        string    FHL5TD2
    server-id         string    volcano
  group: facility                       version: 1   stability: Private/Private
    entity_ref        string[]  [ "Voltage 1" ]
    sensor-class      string    threshold
    type              uint32    0x101 (THRESHOLD_STATE)
    state             uint32    0xc0 (0xc0)
    reading           double    112.000000
    units             uint32    0x4 (VOLTS)

hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VDG:part=05RHVVA00/chassis=0/psu=0?sensor=Status
  group: protocol                       version: 1   stability: Private/Private
    resource          fmri      hc://:product-id=Joyent-Compute-Platform-3302:server-id=volcano:chassis-id=FHL5TD2:serial=CN1797268M9VDG:part=05RHVVA00/chassis=0/psu=0?sensor=Status
  group: authority                      version: 1   stability: Private/Private
    product-id        string    Joyent-Compute-Platform-3302
    chassis-id        string    FHL5TD2
    server-id         string    volcano
  group: facility                       version: 1   stability: Private/Private
    entity_ref        string[]  [ "Status" ]
    sensor-class      string    discrete
    type              uint32    0x8 (POWER_SUPPLY)
    state             uint32    0x8080 (0x80)
I also ran fmtopo in a mode where it drops a core at the end, and checked that there were no memory leaks introduced by these changes.
Updated by Electric Monk over 4 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit 8f022dd6c1ebe3edc269726bf537617e665df32f
commit 8f022dd6c1ebe3edc269726bf537617e665df32f
Author: Rob Johnston <rob.johnston@joyent.com>
Date: 2018-01-23T21:33:23.000Z

    8967 libipmi: add support for GET_CHASSIS_STATUS command
    8974 fac_prov_ipmi should support binding by entity id and instance
    8975 ipmi topo plugin should automatically enumerate sensors on nodes it enumerates
    8976 ipmi enumerator should include FRU identity information in FMRI authority
    8977 ipmi enumerator doesn't always enumerate nested entities
    8978 Add topo facility method for controlling chassis ident indicator
    Reviewed by: Yuri Pankov <yuripv@icloud.com>
    Reviewed by: Ben Sims <bensims@gmail.com>
    Approved by: Dan McDonald <danmcd@joyent.com>