fmadm faulty NULL pointer dereference
We ran "fmadm faulty" on a customer system, and it died with a segmentation fault after printing a single fault header. The crash was in print_sup_record(), which dereferenced status_record_t`host while it was NULL. The "suspect" nvlists recovered from the fmadm core show that the host id is parsed from the FMRI "authority", which is expected to be in the "de" sub-list of the suspect list. When everything is as expected, the list looks like this:
> 0x81df248::nvlist
version=00
class='list.suspect'
uuid='aef06f82-fe7e-4bc3-cac5-ec1e9562d9ee'
code='PCIEX-8000-0A'
diag-time=000000005c4f38dd.0000000000084e48
de
    version=00
    scheme='fmd'
    authority
        version=00
        product-id='PowerEdge-R740xd'
        chassis-id='FL991T2'
        server-id='GRRSDCDN001'
    mod-name='eft'
    mod-version='1.16'
fault-list-sz=00000001
fault-list ...
In the case where no host id was generated, the "de" sub-list was empty:
version=00
class='list.suspect'
uuid='af168faf-46a0-cdb0-e82b-ed38f470f7f8'
code='PCIEX-8000-0A'
diag-time=000000005deaa1c5.000000000004139f
de
fault-list-sz=00000005
fault-list ...
It would be interesting to know why "de" is empty, but in any case we have to conclude that fmadm cannot blindly rely on a host id being present.
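To illustrate the kind of defensive handling this implies, here is a minimal sketch in C. It is not the committed change: host_id_t, record_server() and format_host_line() are hypothetical names, and the structures are simplified stand-ins for fmadm's real ones. The only assumption taken from the report is that the host pointer in the status record can be NULL when "de" carries no "authority".

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for fmadm's internal records. */
typedef struct host_id {
	const char *server;	/* e.g. the "server-id" from "authority" */
} host_id_t;

typedef struct status_record {
	const char *uuid;
	host_id_t *host;	/* NULL when "de" has no "authority" */
} status_record_t;

/*
 * Return a printable server name, falling back to a placeholder
 * whenever any link in the chain is missing.
 */
static const char *
record_server(const status_record_t *srp)
{
	if (srp == NULL || srp->host == NULL || srp->host->server == NULL)
		return ("-");
	return (srp->host->server);
}

/*
 * Format the host line of a fault header without ever dereferencing
 * a NULL host pointer. Returns the snprintf() result.
 */
static int
format_host_line(const status_record_t *srp, char *buf, size_t len)
{
	return (snprintf(buf, len, "Host        : %s", record_server(srp)));
}
```

With a record whose host is NULL, format_host_line() produces a line with the "-" placeholder instead of crashing. A record with a populated host prints its server name as before.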
Updated by Hans Rosenfeld about 2 years ago
Testing: none besides building the code and running "fmadm faulty" to avoid regressions. I'm unable to reproduce the circumstances that caused the issue, but then I think this change is pretty obvious.
Updated by Electric Monk about 2 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
commit 7adb68a6af9135eabca6203d488597cb40c4675c
Author: Hans Rosenfeld <firstname.lastname@example.org>
Date: 2020-03-25T16:04:43.000Z

    12196 fmadm faulty NULL pointer dereference
    Reviewed by: Toomas Soome <email@example.com>
    Reviewed by: Andrew Stormont <firstname.lastname@example.org>
    Approved by: Dan McDonald <email@example.com>