Feature #9459
Implement topo module to enumerate dimms from smbios
100%
Description
On many newer Intel platforms and all recent AMD platforms, the dimms are not represented at all in the hc-scheme topo snapshot.
In an ideal world we would implement memory controller drivers for all the recent Intel and AMD chip generations so that we could fully enumerate the memory topology (controllers, channels, ranks, dimms). This would allow us to associate what's in topology with MCA events, and we could then write Eversholt rules to diagnose the error telemetry.
We may still do this for new Intel platforms, but trying to provide such support for all Intel and AMD platforms is likely not practical.
As a first step, it would be nice to simply get the DIMMs into topology so that we could generate a more complete physical inventory of the system from the topo snapshot. It would also provide a place to hang the DIMM-related sensors.
Information on the DIMM slots and installed DIMMs is available in SMBIOS. This CR is to develop a topo module that will enumerate slot and dimm nodes, as children of the motherboard node, from the SMB_TYPE_MEMDEVICE records. The module will first check for the existence of a functional memory controller driver (e.g. intel_snb). If one is found, the module will bow out gracefully. The module will also step aside on SPARC platforms. Otherwise, it will perform DIMM enumeration from SMBIOS.
Note that this change has already been integrated into illumos-joyent, as part of the large commit below:
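The bow-out policy described above can be sketched roughly as follows. This is an illustration only, not the module's code (the real module is part of libtopo and written in C). The names MemDevice, should_enumerate and enumerate_dimms are hypothetical, the slot/dimm path shape is taken from the fmtopo output quoted later in this report, and treating a size of 0 as an empty slot that gets a slot node but no dimm node is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MemDevice:
    """Hypothetical stand-in for one SMB_TYPE_MEMDEVICE record."""
    location: str
    size_bytes: int  # assumption: 0 means nothing is installed in the slot


def should_enumerate(is_sparc: bool, mc_driver: Optional[str]) -> bool:
    """Return False when the module should step aside."""
    if is_sparc:
        return False          # step aside on SPARC platforms
    return mc_driver is None  # bow out if a memory controller driver exists


def enumerate_dimms(records: List[MemDevice], is_sparc: bool = False,
                    mc_driver: Optional[str] = None) -> List[str]:
    """Build slot and dimm node paths under motherboard=0."""
    if not should_enumerate(is_sparc, mc_driver):
        return []
    paths = []
    for slot, rec in enumerate(records):
        slot_path = f"motherboard=0/slot={slot}"
        paths.append(slot_path)
        if rec.size_bytes > 0:  # assumption: only populated slots get a dimm
            paths.append(f"{slot_path}/dimm=0")
    return paths


if __name__ == "__main__":
    recs = [MemDevice("ChannelA-DIMM1", 8589934592),
            MemDevice("ChannelA-DIMM2", 0)]
    for path in enumerate_dimms(recs):
        print(path)
```

Passing mc_driver="intel_snb" or is_sparc=True makes enumerate_dimms return an empty list, which mirrors the "step aside" behaviour in the description.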
https://github.com/joyent/illumos-joyent/commit/4a99ae161887bed6eed6dcb1699f188f023921a2
Updated by Rob Johnston over 2 years ago
Testing notes are in the original illumos-joyent bug report:
https://smartos.org/bugview/OS-6490
Additionally, I onu'd an openindiana workstation with these changes and verified that the dimms were correctly enumerated in the topo snapshot - see output below:
root@openindiana:~# uname -a
SunOS openindiana 5.11 master-0-g51b0315b1b i86pc i386 i86pc
root@openindiana:~# /usr/lib/fm/fmd/fmtopo -V "*dimm*"
TIME                 UUID
Sep 04 18:45:02 daa1cc58-37dd-6c4e-f152-859c2185c829

hc://:product-id=System-Product-Name:server-id=openindiana:chassis-id=System-Serial-Number:serial=00000000:part=CMU16GX4M2A2400C16/motherboard=0/slot=0/dimm=0
  group: protocol                       version: 1   stability: Private/Private
    resource          fmri      hc://:product-id=System-Product-Name:server-id=openindiana:chassis-id=System-Serial-Number:serial=00000000:part=CMU16GX4M2A2400C16/motherboard=0/slot=0/dimm=0
    FRU               fmri      hc://:product-id=System-Product-Name:server-id=openindiana:chassis-id=System-Serial-Number:serial=00000000:part=CMU16GX4M2A2400C16/motherboard=0/slot=0/dimm=0
    label             string    ChannelA-DIMM1
  group: authority                      version: 1   stability: Private/Private
    product-id        string    System-Product-Name
    chassis-id        string    System-Serial-Number
    server-id         string    openindiana
  group: dimm-properties                version: 1   stability: Private/Private
    size              uint64    0x200000000
    type              string    DDR4
    rank              uint32    0x1
    configured-speed  uint32    0x855
    maximum-speed     uint32    0x855
    configured-voltage double   1.000000
    manufacturer      string    Corsair
    asset-tag         string    9876543210
    location          string    ChannelA-DIMM1

hc://:product-id=System-Product-Name:server-id=openindiana:chassis-id=System-Serial-Number:serial=00000000:part=3400-C16-Series/motherboard=0/slot=1/dimm=0
  group: protocol                       version: 1   stability: Private/Private
    resource          fmri      hc://:product-id=System-Product-Name:server-id=openindiana:chassis-id=System-Serial-Number:serial=00000000:part=3400-C16-Series/motherboard=0/slot=1/dimm=0
    FRU               fmri      hc://:product-id=System-Product-Name:server-id=openindiana:chassis-id=System-Serial-Number:serial=00000000:part=3400-C16-Series/motherboard=0/slot=1/dimm=0
    label             string    ChannelA-DIMM2
  group: authority                      version: 1   stability: Private/Private
    product-id        string    System-Product-Name
    chassis-id        string    System-Serial-Number
    server-id         string    openindiana
  group: dimm-properties                version: 1   stability: Private/Private
    size              uint64    0x200000000
    type              string    DDR4
    rank              uint32    0x1
    configured-speed  uint32    0x855
    maximum-speed     uint32    0x855
    configured-voltage double   1.000000
    manufacturer      string    Patriot
    asset-tag         string    9876543210
    location          string    ChannelA-DIMM2

hc://:product-id=System-Product-Name:server-id=openindiana:chassis-id=System-Serial-Number:serial=00000000:part=CMU16GX4M2A2400C16/motherboard=0/slot=2/dimm=0
  group: protocol                       version: 1   stability: Private/Private
    resource          fmri      hc://:product-id=System-Product-Name:server-id=openindiana:chassis-id=System-Serial-Number:serial=00000000:part=CMU16GX4M2A2400C16/motherboard=0/slot=2/dimm=0
    FRU               fmri      hc://:product-id=System-Product-Name:server-id=openindiana:chassis-id=System-Serial-Number:serial=00000000:part=CMU16GX4M2A2400C16/motherboard=0/slot=2/dimm=0
    label             string    ChannelB-DIMM1
  group: authority                      version: 1   stability: Private/Private
    product-id        string    System-Product-Name
    chassis-id        string    System-Serial-Number
    server-id         string    openindiana
  group: dimm-properties                version: 1   stability: Private/Private
    size              uint64    0x200000000
    type              string    DDR4
    rank              uint32    0x1
    configured-speed  uint32    0x855
    maximum-speed     uint32    0x855
    configured-voltage double   1.000000
    manufacturer      string    Corsair
    asset-tag         string    9876543210
    location          string    ChannelB-DIMM1

hc://:product-id=System-Product-Name:server-id=openindiana:chassis-id=System-Serial-Number:serial=00000000:part=3400-C16-Series/motherboard=0/slot=3/dimm=0
  group: protocol                       version: 1   stability: Private/Private
    resource          fmri      hc://:product-id=System-Product-Name:server-id=openindiana:chassis-id=System-Serial-Number:serial=00000000:part=3400-C16-Series/motherboard=0/slot=3/dimm=0
    FRU               fmri      hc://:product-id=System-Product-Name:server-id=openindiana:chassis-id=System-Serial-Number:serial=00000000:part=3400-C16-Series/motherboard=0/slot=3/dimm=0
    label             string    ChannelB-DIMM2
  group: authority                      version: 1   stability: Private/Private
    product-id        string    System-Product-Name
    chassis-id        string    System-Serial-Number
    server-id         string    openindiana
  group: dimm-properties                version: 1   stability: Private/Private
    size              uint64    0x200000000
    type              string    DDR4
    rank              uint32    0x1
    configured-speed  uint32    0x855
    maximum-speed     uint32    0x855
    configured-voltage double   1.000000
    manufacturer      string    Patriot
    asset-tag         string    9876543210
    location          string    ChannelB-DIMM2

root@openindiana:~# smbios -t SMB_TYPE_MEMDEVICE
ID    SIZE TYPE
67    110  SMB_TYPE_MEMDEVICE (type 17) (memory device)

  Manufacturer: Corsair
  Serial Number: 00000000
  Asset Tag: 9876543210
  Location Tag: ChannelA-DIMM1
  Part Number: CMU16GX4M2A2400C16

  Physical Memory Array: 66
  Memory Error Data: Not Supported
  Total Width: 64 bits
  Data Width: 64 bits
  Size: 8589934592 bytes
  Form Factor: 9 (DIMM)
  Set: None
  Rank: 1 (single)
  Memory Type: 26 (DDR4)
  Flags: 0x4080
        SMB_MDF_SYNC (synchronous)
        SMB_MDF_UNREG (Unregistered (Unbuffered))
  Speed: 2133 MT/s
  Configured Speed: 2133 MT/s
  Device Locator: ChannelA-DIMM1
  Bank Locator: BANK 0
  Minimum Voltage: Unknown
  Maximum Voltage: Unknown
  Configured Voltage: 1.20V

ID    SIZE TYPE
68    110  SMB_TYPE_MEMDEVICE (type 17) (memory device)

  Manufacturer: Patriot
  Serial Number: 00000000
  Asset Tag: 9876543210
  Location Tag: ChannelA-DIMM2
  Part Number: 3400 C16 Series

  Physical Memory Array: 66
  Memory Error Data: Not Supported
  Total Width: 64 bits
  Data Width: 64 bits
  Size: 8589934592 bytes
  Form Factor: 9 (DIMM)
  Set: None
  Rank: 1 (single)
  Memory Type: 26 (DDR4)
  Flags: 0x4080
        SMB_MDF_SYNC (synchronous)
        SMB_MDF_UNREG (Unregistered (Unbuffered))
  Speed: 2133 MT/s
  Configured Speed: 2133 MT/s
  Device Locator: ChannelA-DIMM2
  Bank Locator: BANK 1
  Minimum Voltage: Unknown
  Maximum Voltage: Unknown
  Configured Voltage: 1.20V

ID    SIZE TYPE
69    110  SMB_TYPE_MEMDEVICE (type 17) (memory device)

  Manufacturer: Corsair
  Serial Number: 00000000
  Asset Tag: 9876543210
  Location Tag: ChannelB-DIMM1
  Part Number: CMU16GX4M2A2400C16

  Physical Memory Array: 66
  Memory Error Data: Not Supported
  Total Width: 64 bits
  Data Width: 64 bits
  Size: 8589934592 bytes
  Form Factor: 9 (DIMM)
  Set: None
  Rank: 1 (single)
  Memory Type: 26 (DDR4)
  Flags: 0x4080
        SMB_MDF_SYNC (synchronous)
        SMB_MDF_UNREG (Unregistered (Unbuffered))
  Speed: 2133 MT/s
  Configured Speed: 2133 MT/s
  Device Locator: ChannelB-DIMM1
  Bank Locator: BANK 2
  Minimum Voltage: Unknown
  Maximum Voltage: Unknown
  Configured Voltage: 1.20V

ID    SIZE TYPE
70    110  SMB_TYPE_MEMDEVICE (type 17) (memory device)

  Manufacturer: Patriot
  Serial Number: 00000000
  Asset Tag: 9876543210
  Location Tag: ChannelB-DIMM2
  Part Number: 3400 C16 Series

  Physical Memory Array: 66
  Memory Error Data: Not Supported
  Total Width: 64 bits
  Data Width: 64 bits
  Size: 8589934592 bytes
  Form Factor: 9 (DIMM)
  Set: None
  Rank: 1 (single)
  Memory Type: 26 (DDR4)
  Flags: 0x4080
        SMB_MDF_SYNC (synchronous)
        SMB_MDF_UNREG (Unregistered (Unbuffered))
  Speed: 2133 MT/s
  Configured Speed: 2133 MT/s
  Device Locator: ChannelB-DIMM2
  Bank Locator: BANK 3
  Minimum Voltage: Unknown
  Maximum Voltage: Unknown
  Configured Voltage: 1.20V
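For anyone cross-checking the two outputs above: fmtopo prints the integer dimm-properties in hex while smbios prints decimal. A small sketch of the conversion, using only values copied from the output (the decode helper is a hypothetical name introduced here):

```python
def decode(hex_text: str) -> int:
    """Convert a hex property value as printed by fmtopo to an integer."""
    return int(hex_text, 16)


size_bytes = decode("0x200000000")  # dimm-properties "size"
speed = decode("0x855")             # "configured-speed" and "maximum-speed"

# Compare against the smbios fields "Size" and "Configured Speed".
print(size_bytes, "bytes =", size_bytes // 2**30, "GiB")
print(speed)
```

The size decodes to 8589934592 bytes (8 GiB) and the speed to 2133, matching the Size and Configured Speed fields that smbios reports for each DIMM.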
Updated by Rob Johnston over 2 years ago
Note that this port from illumos-joyent to illumos-gate also incorporates the following follow-up push to illumos-joyent, which corrects a build failure that Igor saw on dilos:
https://github.com/joyent/illumos-joyent/commit/899e8e86192afce363b05b30f79590eef12b9724
Updated by Electric Monk over 2 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit 6d65bee7bcc62b2d9bdfde6610561ce76c92a908
commit 6d65bee7bcc62b2d9bdfde6610561ce76c92a908
Author: Rob Johnston <rob.johnston@joyent.com>
Date: 2018-09-06T16:52:36.000Z

    9459 Implement topo module to enumerate dimms from smbios
    Reviewed by: Yuri Pankov <yuripv@yuripv.net>
    Reviewed by: Igor Kozhukhov <igor@dilos.org>
    Approved by: Richard Lowe <richlowe@richlowe.net>