Bug #11818

IPMI topo plugin shouldn't return data from unavailable sensors

Added by Matthias Scheler 5 months ago. Updated 4 months ago.

Status:
Closed
Priority:
Normal
Category:
lib - userland libraries
Start date:
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:

Description

On a Supermicro X9 2U, fmtopo reports six chassis fan sensors:

> sudo /usr/lib/fm/fmd/fmtopo -e | grep /chassis0/fan
/chassis0/fan0
/chassis0/fan0
/chassis0/fan1
/chassis0/fan1
/chassis0/fan2
/chassis0/fan2
/chassis0/fan3
/chassis0/fan3
/chassis0/fan4
/chassis0/fan4
/chassis0/fan5
/chassis0/fan5

However, ipmitool shows that there are really only three chassis fans:

> sudo ipmitool sensor | grep Fan
Fan1             | 6460.000   | RPM        | ok    | 340.000   | 408.000   | 476.000   | 17204.000 | 17272.000 | 17340.000
Fan2             | na         | RPM        | na    | 340.000   | 408.000   | 476.000   | 17204.000 | 17272.000 | 17340.000
Fan3             | na         | RPM        | na    | 340.000   | 408.000   | 476.000   | 17204.000 | 17272.000 | 17340.000
Fan4             | 6596.000   | RPM        | ok    | 340.000   | 408.000   | 476.000   | 17204.000 | 17272.000 | 17340.000
Fan5             | 6392.000   | RPM        | ok    | 340.000   | 408.000   | 476.000   | 17204.000 | 17272.000 | 17340.000
Fan6             | na         | RPM        | na    | 340.000   | 408.000   | 476.000   | 17204.000 | 17272.000 | 17340.000

The sensors Fan2, Fan3 and Fan6 are only placeholders for the 3U version of the Supermicro X9.

To avoid false sensor information or even false alerts, the IPMI topo plugin should ignore such unavailable sensors.
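
The rule can be illustrated outside the plugin. The following Python sketch is not the plugin's code; it only parses the "ipmitool sensor" table shown above, assuming the pipe-separated column layout of that output, and lists the sensors whose reading is "na". The function name unavailable_sensors is made up for this example.

```python
def unavailable_sensors(ipmitool_output):
    """Return the names of sensors whose reading column is 'na'.

    Assumes the pipe-separated layout printed by "ipmitool sensor":
    name | reading | unit | status | thresholds...
    """
    names = []
    for line in ipmitool_output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 4:
            continue  # not a sensor row
        name, reading = fields[0], fields[1]
        if reading == "na":
            names.append(name)
    return names


# Rows taken from the report above; threshold columns shortened for brevity.
SAMPLE = """\
Fan1             | 6460.000   | RPM        | ok    | 340.000   | 408.000
Fan2             | na         | RPM        | na    | 340.000   | 408.000
Fan3             | na         | RPM        | na    | 340.000   | 408.000
Fan4             | 6596.000   | RPM        | ok    | 340.000   | 408.000
Fan5             | 6392.000   | RPM        | ok    | 340.000   | 408.000
Fan6             | na         | RPM        | na    | 340.000   | 408.000
"""

print(unavailable_sensors(SAMPLE))  # ['Fan2', 'Fan3', 'Fan6']
```

On the 2U chassis this yields exactly the three placeholder sensors, which are the ones the plugin should not report data for.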


Related issues

Related to illumos gate - Feature #8975: ipmi topo plugin should automatically enumerate sensors on nodes it enumerates (Closed, 2018-01-19)

History

#1

Updated by Matthias Scheler 5 months ago

  • Related to Feature #8975: ipmi topo plugin should automatically enumerate sensors on nodes it enumerates added
#2

Updated by Matthias Scheler 5 months ago

  • Status changed from New to In Progress
#3

Updated by Matthias Scheler 5 months ago

Review can be found here: https://illumos.org/rb/r/2397/

#4

Updated by Matthias Scheler 4 months ago

  • Subject changed from IPMI topo plugin should skip unavailable sensors to IPMI topo plugin should read unavailable sensors
#5

Updated by Matthias Scheler 4 months ago

  • Subject changed from IPMI topo plugin should read unavailable sensors to IPMI topo plugin shouldn't return data from unavailable sensors
#6

Updated by Matthias Scheler 4 months ago

Testing Done:
1.) Recorded the output of "/usr/lib/fm/fmd/fmtopo -S" on a Supermicro X9 2U.
2.) Ran a full nightly build, used "onu" to create a new BE and booted into it.
3.) Recorded the new output of "/usr/lib/fm/fmd/fmtopo -S" on the same machine. The differences are as expected:

--- before.txt 2019-11-05 10:40:24.143900715 +0000
+++ after.txt 2019-11-05 11:24:05.281979308 +0000
@@ -1,5 +1,5 @@
TIME UUID
-Nov 05 10:40:23 591d6fa7-b2a0-ee13-ab35-a08b376cd8b2
+Nov 05 11:24:05 48db20e3-81b7-431c-d266-c3c20ecbf8fa

hc://:product-id=T3300-B1:server-id=zebi-46:chassis-id=TS1502-0076/chassis=0
Present: true
@@ -31,7 +31,7 @@

hc://:product-id=T3300-B1:server-id=zebi-46:chassis-id=TS1502-0076/chassis=0/fan=1
Present: true
- Unusable: false
+ Unusable: true

hc://:product-id=T3300-B1:server-id=zebi-46:chassis-id=TS1502-0076/chassis=0/fan=1?sensor=Fan2
Present: true
@@ -39,7 +39,7 @@

hc://:product-id=T3300-B1:server-id=zebi-46:chassis-id=TS1502-0076/chassis=0/fan=2
Present: true
- Unusable: false
+ Unusable: true

hc://:product-id=T3300-B1:server-id=zebi-46:chassis-id=TS1502-0076/chassis=0/fan=2?sensor=Fan3
Present: true
@@ -63,7 +63,7 @@

hc://:product-id=T3300-B1:server-id=zebi-46:chassis-id=TS1502-0076/chassis=0/fan=5
Present: true
- Unusable: false
+ Unusable: true

hc://:product-id=T3300-B1:server-id=zebi-46:chassis-id=TS1502-0076/chassis=0/fan=5?sensor=Fan6
Present: true

"ipmitool sensor" confirms that the correct sensors were marked as unusable:

Fan1 | 6256.000 | RPM | ok | 340.000 | 408.000 | 476.000 | 17204.000 | 17272.000 | 17340.000
Fan2 | na | RPM | na | 340.000 | 408.000 | 476.000 | 17204.000 | 17272.000 | 17340.000
Fan3 | na | RPM | na | 340.000 | 408.000 | 476.000 | 17204.000 | 17272.000 | 17340.000
Fan4 | 6324.000 | RPM | ok | 340.000 | 408.000 | 476.000 | 17204.000 | 17272.000 | 17340.000
Fan5 | 6188.000 | RPM | ok | 340.000 | 408.000 | 476.000 | 17204.000 | 17272.000 | 17340.000
Fan6 | na | RPM | na | 340.000 | 408.000 | 476.000 | 17204.000 | 17272.000 | 17340.000
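
The manual comparison above can also be sketched as a small cross-check. This Python example is an illustration only: it assumes that each hc:// line in the "fmtopo -S" output is followed by its own status lines (as the diff context suggests) and that the sensor FMRI has the "fan=N?sensor=NAME" form seen above. The helper names are hypothetical.

```python
import re

# Forms seen in the diff: ".../fan=1" (fan node) and ".../fan=1?sensor=Fan2".
FAN_NODE_RE = re.compile(r"/fan=(\d+)$")
FAN_SENSOR_RE = re.compile(r"/fan=(\d+)\?sensor=(\S+)$")


def unusable_fan_sensors(fmtopo_output):
    """Return the sensor names attached to fan nodes marked 'Unusable: true'."""
    sensor_of = {}   # fan instance number -> sensor name
    unusable = set() # fan instance numbers marked unusable
    current = None   # fan node whose status lines are being read
    for raw in fmtopo_output.splitlines():
        line = raw.strip()
        if line.startswith("hc://"):
            sensor = FAN_SENSOR_RE.search(line)
            if sensor:
                sensor_of[int(sensor.group(1))] = sensor.group(2)
            node = FAN_NODE_RE.search(line)
            current = int(node.group(1)) if node else None
        elif current is not None and line == "Unusable: true":
            unusable.add(current)
    return {sensor_of[fan] for fan in unusable if fan in sensor_of}


PREFIX = "hc://:product-id=T3300-B1:server-id=zebi-46:chassis-id=TS1502-0076/chassis=0"
# Entries taken from the "after" side of the diff above.
AFTER = "\n".join(
    f"{PREFIX}/fan={n}\nPresent: true\nUnusable: true\n\n"
    f"{PREFIX}/fan={n}?sensor={name}\nPresent: true\n"
    for n, name in [(1, "Fan2"), (2, "Fan3"), (5, "Fan6")]
)

# Sensors that "ipmitool sensor" reported as "na" in the listing above.
NA_SENSORS = {"Fan2", "Fan3", "Fan6"}

print(unusable_fan_sensors(AFTER) == NA_SENSORS)  # True
```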

#7

Updated by Matthias Scheler 4 months ago

The code review can be found here: https://illumos.org/rb/r/2397/

#8

Updated by Matthias Scheler 4 months ago

I also checked the code for memory leaks by executing the following commands ...

export UMEM_DEBUG=default
export TOPONODLCLOSE=1
/usr/lib/fm/fmd/fmtopo -C -S

... and checking the resulting crash dump (forced by the "-C" option):

# mdb core.fmtopo.10146
Loading modules: [ libc.so.1 libtopo.so.1 libumem.so.1 libnvpair.so.1 libuutil.so.1 libavl.so.1 libsysevent.so.1 ld.so.1 ]
> ::findleaks
findleaks: no memory leaks detected
>
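
For repeated runs, the check above could be wrapped in a small helper. The following Python sketch is an assumption about how one might script it, not part of the change: it only prepares the two environment variables used above and interprets the "::findleaks" message shown in the session. The function names are hypothetical.

```python
import os


def leak_check_env(base_env=None):
    """Return a copy of the environment with the debug settings used above."""
    env = dict(os.environ if base_env is None else base_env)
    env["UMEM_DEBUG"] = "default"
    env["TOPONODLCLOSE"] = "1"
    return env


def no_leaks(findleaks_output):
    """True if the ::findleaks output contains the 'no leaks' message."""
    return "findleaks: no memory leaks detected" in findleaks_output


# Possible usage (not executed here; requires an illumos system):
#   subprocess.run(["/usr/lib/fm/fmd/fmtopo", "-C", "-S"], env=leak_check_env())
# followed by running "::findleaks" in mdb on the resulting core file and
# passing the captured text to no_leaks().
```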
#9

Updated by Electric Monk 4 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

git commit 985366be3b9cf10a0fd786cf0d9c1a6558c2b596

commit  985366be3b9cf10a0fd786cf0d9c1a6558c2b596
Author: Matthias Scheler <matthias.scheler@wdc.com>
Date:   2019-11-09T23:56:18.000Z

    11818 IPMI topo plugin shouldn't return data from unavailable sensors
    Reviewed by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org>
    Reviewed by: Paul Winder <paul@winders.demon.co.uk>
    Approved by: Dan McDonald <danmcd@joyent.com>
