Feature #9587
Add test mechanism to sensor-transport module for spoofing sensor states
Status: Closed
% Done: 100%
Description
To make it easier to test and verify the code for detecting and handling sensor failures, there should be a way to spoof the state for any sensor in topo. The most practical way to implement this would be to add support for an optional configuration parameter in sensor-transport.conf where one could specify one or more sensors and the state value(s) that sensor-transport should pretend to have read from them.
Note that this change has already been integrated into illumos-joyent via the commit below:
commit 3843bb9b187919e79faf125f8ef4d7979a130486
Author: Rob Johnston <rob.johnston@joyent.com>
Date:   Thu Mar 1 00:49:05 2018 +0000

    OS-6513 Add platform-specific topo maps for the Joyent J330x Compute Platform
    OS-6657 Add test mechanism to sensor-transport module for spoofing sensor states
    OS-6710 need to handle SP's that present multiple sensors with the same entity name
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Approved by: Joshua M. Clulow <jmc@joyent.com>
Updated by Rob Johnston almost 4 years ago
Testing
I created a platform image with this change and exercised it by loopback mounting various versions of sensor-transport.conf, verifying that the new "spoof_sensor_state" property was parsed correctly and that the intended effect was achieved.
scenario 1 - spoof the state for a single sensor:
# mount -F lofs /var/tmp/sensor-transport.conf /usr/lib/fm/fmd/plugins/sensor-transport.conf
# cat /usr/lib/fm/fmd/plugins/sensor-transport.conf
setprop spoof_sensor_state "*psu=1:PS2 Status:0x2"
# fmadm reset sensor-transport
fmadm: sensor-transport module has been reset
[wait at least 60 seconds...]
# fmadm faulty
--------------- ------------------------------------ -------------- ---------
TIME            EVENT-ID                             MSG-ID         SEVERITY
--------------- ------------------------------------ -------------- ---------
Apr 03 22:40:26 ada74c04-053e-6739-f113-fc14bfaee142 SENSOR-8000-6G Major

Host        : magma
Platform    : Joyent-Storage-Platform-7001    Chassis_id  : S247158X6A07720
Product_sn  :

Fault class : fault.psu.failed-int
Problem in  : "PSU 1" (hc://:product-id=Joyent-Storage-Platform-7001:server-id=magma:chassis-id=S247158X6A07720/chassis=0/psu=1)
                  faulted but still in service
FRU         : "PSU 1" (hc://:product-id=Joyent-Storage-Platform-7001:server-id=magma:chassis-id=S247158X6A07720/chassis=0/psu=1)
                  faulty

Description : A sensor indicates that this power supply has failed.
              Refer to http://illumos.org/msg/SENSOR-8000-6G for more information.

Response    : None.

Impact      : The enclosure may be getting inadequate power. Subsequent loss of
              power supplies may force the enclosure to shutdown.

Action      : Replace the indicated power supply
scenario 2 - spoof the state for multiple sensors:
# mount -F lofs /var/tmp/sensor-transport.conf /usr/lib/fm/fmd/plugins/sensor-transport.conf
# cat /usr/lib/fm/fmd/plugins/sensor-transport.conf
setprop spoof_sensor_state "*psu=1:PS2 Status:0x2;*psu=0:PS1 Status:0x2"
# fmadm reset sensor-transport
fmadm: sensor-transport module has been reset
[wait at least 60 seconds...]
# fmadm faulty
--------------- ------------------------------------ -------------- ---------
TIME            EVENT-ID                             MSG-ID         SEVERITY
--------------- ------------------------------------ -------------- ---------
Apr 03 23:37:59 004e77a1-1d04-6e24-aa1c-a8ed6323c9a9 SENSOR-8000-6G Major

Host        : magma
Platform    : Joyent-Storage-Platform-7001    Chassis_id  : S247158X6A07720
Product_sn  :

Fault class : fault.psu.failed-int
Problem in  : "PSU 0" (hc://:product-id=Joyent-Storage-Platform-7001:server-id=magma:chassis-id=S247158X6A07720/chassis=0/psu=0)
                  faulted but still in service
FRU         : "PSU 0" (hc://:product-id=Joyent-Storage-Platform-7001:server-id=magma:chassis-id=S247158X6A07720/chassis=0/psu=0)
                  faulty

Description : A sensor indicates that this power supply has failed.
              Refer to http://illumos.org/msg/SENSOR-8000-6G for more information.

Response    : None.

Impact      : The enclosure may be getting inadequate power. Subsequent loss of
              power supplies may force the enclosure to shutdown.

Action      : Replace the indicated power supply

--------------- ------------------------------------ -------------- ---------
TIME            EVENT-ID                             MSG-ID         SEVERITY
--------------- ------------------------------------ -------------- ---------
Apr 03 23:37:59 81068e9e-9bca-6d82-ac86-c0dfc8c4a0cd SENSOR-8000-6G Major

Host        : magma
Platform    : Joyent-Storage-Platform-7001    Chassis_id  : S247158X6A07720
Product_sn  :

Fault class : fault.psu.failed-int
Problem in  : "PSU 1" (hc://:product-id=Joyent-Storage-Platform-7001:server-id=magma:chassis-id=S247158X6A07720/chassis=0/psu=1)
                  faulted but still in service
FRU         : "PSU 1" (hc://:product-id=Joyent-Storage-Platform-7001:server-id=magma:chassis-id=S247158X6A07720/chassis=0/psu=1)
                  faulty

Description : A sensor indicates that this power supply has failed.
              Refer to http://illumos.org/msg/SENSOR-8000-6G for more information.

Response    : None.

Impact      : The enclosure may be getting inadequate power. Subsequent loss of
              power supplies may force the enclosure to shutdown.

Action      : Replace the indicated power supply
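Judging only from the two scenarios above, each spoof entry seems to pair a glob over the entity's FMRI (for example "*psu=1") with a sensor name and a state value. The Python sketch below illustrates that reading. It is not the module's code (sensor-transport is an fmd plugin, not Python), and the names read_state and SPOOF_ENTRIES, the use of shell-style glob matching, and the "real" reading of 0x1 are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory form of the scenario 2 setting:
# (FMRI glob, sensor name, spoofed state value)
SPOOF_ENTRIES = [
    ("*psu=1", "PS2 Status", 0x2),
    ("*psu=0", "PS1 Status", 0x2),
]


def read_state(fmri, sensor, real_state, entries=SPOOF_ENTRIES):
    """Return the spoofed state if an entry matches, else the real reading."""
    for pattern, name, state in entries:
        # Assumption: the first field is matched as a shell-style glob
        # against the full FMRI string, and the sensor name must be equal.
        if fnmatchcase(fmri, pattern) and sensor == name:
            return state
    return real_state


PSU1 = ("hc://:product-id=Joyent-Storage-Platform-7001:server-id=magma"
        ":chassis-id=S247158X6A07720/chassis=0/psu=1")

# 0x1 stands in for whatever the hardware actually reported (hypothetical).
print(hex(read_state(PSU1, "PS2 Status", 0x1)))  # entry matches, spoofed value
print(hex(read_state(PSU1, "PS1 Status", 0x1)))  # no matching entry, real value
```

Under this reading, an entry only takes effect when both the FMRI glob and the sensor name match, which is consistent with each scenario producing a fault against exactly the PSUs named in the configuration.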
I also verified that if the value of spoof_sensor_state is malformed, then an ereport is generated and a defect is diagnosed against the fmd module.
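To make "malformed" concrete, here is a minimal Python sketch of a validating parser for the property value. The grammar it enforces (entries separated by ";", each with three ":"-separated fields, the last a numeric state) is inferred from the scenario strings in this ticket and may not match what the module actually accepts. The function name parse_spoof_spec is hypothetical, and raising ValueError merely stands in for the ereport the real module generates.

```python
def parse_spoof_spec(spec):
    """Split a spoof_sensor_state string into (glob, sensor, state) tuples.

    Raises ValueError on input that does not fit the assumed grammar.
    """
    entries = []
    for raw in spec.split(";"):
        fields = [f.strip() for f in raw.split(":")]
        # Assumed grammar: exactly three non-empty fields per entry.
        if len(fields) != 3 or not all(fields):
            raise ValueError(f"malformed spoof entry: {raw!r}")
        pattern, sensor, state = fields
        try:
            # Base 0 accepts both "0x2" and plain decimal such as "2".
            value = int(state, 0)
        except ValueError:
            raise ValueError(f"bad state value in entry: {raw!r}") from None
        entries.append((pattern, sensor, value))
    return entries


print(parse_spoof_spec("*psu=1:PS2 Status:0x2;*psu=0:PS1 Status:0x2"))

try:
    parse_spoof_spec("*psu=1:PS2 Status")  # state field missing
except ValueError as err:
    print("rejected:", err)
```

One limitation of this simple split: a glob or sensor name that itself contains ":" or ";" would be rejected. Whether the real parser has the same restriction is not stated in this ticket.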
Updated by Electric Monk almost 4 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit ea30102ce458697473b0435bcdc7647dce2551f4
commit ea30102ce458697473b0435bcdc7647dce2551f4
Author: Rob Johnston <rob.johnston@joyent.com>
Date:   2018-06-28T22:44:13.000Z

    9586 need to handle SP's that present multiple sensors with the same entity name
    9587 Add test mechanism to sensor-transport module for spoofing sensor states
    Reviewed by: Toomas Soome <tsoome@me.com>
    Reviewed by: Igor Kozhukhov <igor@dilos.org>
    Approved by: Richard Lowe <richlowe@richlowe.net>