Bug #9586

closed

need to handle SP's that present multiple sensors with the same entity name

Added by Rob Johnston over 3 years ago. Updated over 3 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: -
Start date: 2018-06-07
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

Most of the property methods in the fac_prov_ipmi module assume that IPMI entity names are unique, i.e. that no two records in the sensor data repository will have the same entity name. Up to now this has proved to be a safe assumption. However, the IPMI specification makes no such guarantee, and it turns out that the Dell R730 (aka JCP 330x) uses the same entity name for many sensors. For example, there are several physical presence sensors that are all named "Presence".

To handle this case, the code which looks up sensors by name needs to be augmented to also verify that the entity ID and instance match the sensor that we're looking for. I think we can achieve this via the following changes:

1) Add the entity ID and instance as topo properties to the hardware nodes that have child sensor facility nodes. For fans and PSUs we can modify the ipmi enumerator to add this. For other components, we'll probably need to add it statically in the topo maps on an as-needed basis. We currently need this only for the JCP 330x.

2) Add a new lookup interface to libipmi that allows you to look up a record based on a combination of the entity name, ID and instance.

3) Change the implementation of ipmi_sensor_state() and ipmi_sensor_reading() to grab the entity ID and instance from the parent node and then look up the sensor using the new libipmi lookup interface.
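
To illustrate why matching on name alone is ambiguous, here is a minimal, self-contained sketch of the lookup idea from step 2. It is not the libipmi implementation: the sdr_rec_t type, both function names, and the sample table are hypothetical and exist only for this example. The entity ID values used (0x1d for a fan/cooling device, 0x0a for a power supply) follow the IPMI entity ID codes as I understand them; the names and instance numbers are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical, simplified stand-in for a sensor data record. This is
 * not the libipmi record type; the field names are invented here.
 */
typedef struct sdr_rec {
	const char *sr_name;	/* entity name; not guaranteed unique */
	uint8_t sr_ent_id;	/* IPMI entity ID */
	uint8_t sr_ent_inst;	/* IPMI entity instance */
} sdr_rec_t;

/* Name-only lookup: returns the first record whose name matches. */
static const sdr_rec_t *
sdr_lookup_by_name(const sdr_rec_t *recs, size_t nrecs, const char *name)
{
	assert(recs != NULL && name != NULL);
	for (size_t i = 0; i < nrecs; i++) {
		if (strcmp(recs[i].sr_name, name) == 0)
			return (&recs[i]);
	}
	return (NULL);
}

/*
 * Precise lookup: the name, entity ID and entity instance must all
 * match, so records that share a name can still be told apart.
 */
static const sdr_rec_t *
sdr_lookup_precise(const sdr_rec_t *recs, size_t nrecs, const char *name,
    uint8_t ent_id, uint8_t ent_inst)
{
	assert(recs != NULL && name != NULL);
	for (size_t i = 0; i < nrecs; i++) {
		if (strcmp(recs[i].sr_name, name) == 0 &&
		    recs[i].sr_ent_id == ent_id &&
		    recs[i].sr_ent_inst == ent_inst)
			return (&recs[i]);
	}
	return (NULL);
}

/* Made-up sample data: two fans and two PSUs, all named "Presence". */
static const sdr_rec_t sample_sdr[] = {
	{ "Presence", 0x1d, 1 },	/* fan, instance 1 */
	{ "Presence", 0x1d, 2 },	/* fan, instance 2 */
	{ "Presence", 0x0a, 1 },	/* PSU, instance 1 */
	{ "Presence", 0x0a, 2 },	/* PSU, instance 2 */
};
#define	SAMPLE_NRECS	(sizeof (sample_sdr) / sizeof (sample_sdr[0]))
```

With this sample table, sdr_lookup_by_name(sample_sdr, SAMPLE_NRECS, "Presence") always returns the first fan record regardless of which component is being enumerated, whereas sdr_lookup_precise(sample_sdr, SAMPLE_NRECS, "Presence", 0x0a, 2) returns the record for the second PSU.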

Note that this change has already been integrated into illumos-joyent via the following commit:

commit 3843bb9b187919e79faf125f8ef4d7979a130486
Author: Rob Johnston <rob.johnston@joyent.com>
Date:   Thu Mar 1 00:49:05 2018 +0000

    OS-6513 Add platform-specific topo maps for the Joyent J330x Compute Platform
    OS-6657 Add test mechanism to sensor-transport module for spoofing sensor states
    OS-6710 need to handle SP's that present multiple sensors with the same entity name
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Approved by: Joshua M. Clulow <jmc@joyent.com>
#1

Updated by Rob Johnston over 3 years ago

Testing

I created a platform image that included the fix for this issue, booted it on the system where the bug was seen, and verified that sensor entities sharing the same name were associated with the correct FRU. For example, on the system where the bug was originally observed, there are seven sensors called "Presence". Five of them are fan presence sensors and two of them are PSU presence sensors. I verified that with the change each "Presence" sensor was enumerated under the correct fan or PSU.

Additionally, I wrote the following small utility that exercises ipmi_sdr_lookup_precise(), which is the new libipmi interface that implements the bulk of this fix:

https://github.com/joyent/smartos-dev-tools/tree/master/ipmi/read-sensor

#2

Updated by Electric Monk over 3 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

commit ea30102ce458697473b0435bcdc7647dce2551f4
Author: Rob Johnston <rob.johnston@joyent.com>
Date:   2018-06-28T22:44:13.000Z

    9586 need to handle SP's that present multiple sensors with the same entity name
    9587 Add test mechanism to sensor-transport module for spoofing sensor states
    Reviewed by: Toomas Soome <tsoome@me.com>
    Reviewed by: Igor Kozhukhov <igor@dilos.org>
    Approved by: Richard Lowe <richlowe@richlowe.net>
