Feature #11609

Want modern Intel IMC driver

Added by Robert Mustacchi about 4 years ago. Updated over 3 years ago.

Status: Closed
Priority: Normal
Category: driver - device drivers
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:
External Bug:

Description

With subsequent generations of hardware, Intel has new PCI IDs and different aspects to the memory controller. It'd be good if we could support those for the following hardware generations that are in the fleet:

  • Sandy Bridge
  • Haswell
  • Skylake

Note that Ivy Bridge and Broadwell differ only slightly from the generations they follow (Sandy Bridge and Haswell, respectively), so supporting them should take rather small changes.

We should probably have a generic Intel IMC driver that can change based on the generation rather than having to write one from scratch per generation. But whether or not that will make sense will ultimately depend on the details of each generation and how much duplication there is.
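One way to picture the "generic driver that changes based on the generation" idea is a lookup table keyed on PCI device ID, so that generation-specific logic hangs off a single tag instead of living in separate drivers. The following is only a minimal sketch under stated assumptions: the type names, the function name, and especially the device IDs are hypothetical placeholders and are not taken from the actual imc driver or from Intel documentation. Only the vendor ID 0x8086 is real.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	IMC_PCI_VID_INTEL	0x8086	/* Intel's PCI vendor ID */

/* Hypothetical generation tags; the names are illustrative only. */
typedef enum {
	IMC_GEN_UNKNOWN = 0,
	IMC_GEN_SANDY,
	IMC_GEN_HASWELL,
	IMC_GEN_SKYLAKE
} imc_gen_t;

/* One row per PCI function the driver would care about. */
typedef struct {
	uint16_t	igm_did;	/* PCI device ID */
	imc_gen_t	igm_gen;	/* generation it belongs to */
} imc_gen_map_t;

/*
 * PLACEHOLDER device IDs: these values are made up for the example and
 * must not be mistaken for real Intel IDs.
 */
static const imc_gen_map_t imc_gen_map[] = {
	{ 0xf001, IMC_GEN_SANDY },
	{ 0xf002, IMC_GEN_HASWELL },
	{ 0xf003, IMC_GEN_SKYLAKE }
};

/*
 * Map a (vendor, device) pair to a generation tag. Anything that is not
 * an Intel device, or is not in the table, is reported as unknown.
 */
static imc_gen_t
imc_gen_lookup(const imc_gen_map_t *map, size_t nents, uint16_t vid,
    uint16_t did)
{
	size_t i;

	assert(map != NULL || nents == 0);
	if (vid != IMC_PCI_VID_INTEL)
		return (IMC_GEN_UNKNOWN);

	for (i = 0; i < nents; i++) {
		if (map[i].igm_did == did)
			return (map[i].igm_gen);
	}
	return (IMC_GEN_UNKNOWN);
}
```

With a table like this, adding a generation that differs only slightly from an existing one would mostly mean adding rows, while a generation with a genuinely different register layout would need its own code path selected by the tag. Whether that split pays off depends, as noted above, on how much the generations actually share.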


To test this, I've booted this on a variety of platforms, making sure to cover one of each of the different generations of hardware. Specifically this means:

  • Sandy Bridge EP
  • Ivy Bridge EP
  • Haswell EP
  • Broadwell EP
  • Skylake Gold
  • Cascade Lake Gold/Platinum

For each of the above platforms, I made sure of the following:

  • That we successfully attached the imc driver and found all of the stubs
  • That we successfully had DIMM topology information in fmtopo that mirrored what physically existed
  • That we could manually use the mcdecode utility on supported configurations
  • That we could dump the memory controller information to a file and then use mcdecode to decode it on another platform

Unfortunately, it is very hard to exercise the MCE path for these changes so the decoding ioctl is the best that I could do.

I further tested this by running the full suite of decoding unit tests. These were run both as part of the os_tests test suite, where they all passed, and on their own. They provide pretty good coverage of the actual decoding logic (assuming one is reading the data correctly from hardware).

Finally, I also tested these changes on a few platforms where these wouldn't come up at all. This includes the following:

  • Haswell 1s
  • AMD EPYC 2s
  • Intel Kaby Lake NUC
  • Intel Coffee Lake platform

On all of them, the older SMBIOS-based method for DIMMs still came through, and everything else seemed to boot and at least pass a sanity check. The imcstub driver was not attached anywhere.


Files

prtconf-dD-before.txt (4.04 KB), Denis Kozadaev, 2019-09-09 06:21 PM
prtconf-dD-after.txt (4.8 KB), Denis Kozadaev, 2019-09-09 06:21 PM
prtconf-v.txt.bz2 (11.5 KB), Denis Kozadaev, 2019-09-10 09:15 AM

Related issues

Related to illumos gate - Feature #12444: Intel v1 chip topo needs rank information (Closed, Robert Mustacchi)

Actions #1

Updated by Denis Kozadaev about 4 years ago

libtopo does not work on my Lenovo laptop after these changes, with both debug and non-debug builds.

prtdiag, diskinfo, and fmtopo all crash with the same error:

root@lenovo:~# svcs fmd
STATE          STIME    FMRI
maintenance    16:17:12 svc:/system/fmd:default

root@lenovo:~# prtdiag 
System Configuration: LENOVO INVALID
BIOS Configuration: LENOVO 39CN16WW        07/29/2010
Assertion failed: comma != NULL, file ../../common/pcibus/did_props.c, line 421, function dev_for_hostbridge
Abort

If I build the system without these changes, these tools work:

root@lenovo:~# svcs fmd
STATE          STIME    FMRI
online         16:28:53 svc:/system/fmd:default

root@lenovo:~# prtdiag 
System Configuration: LENOVO INVALID
BIOS Configuration: LENOVO 39CN16WW        07/29/2010

==== Processor Sockets ====================================

Version                          Location Tag
-------------------------------- --------------------------
Genuine Intel(R) CPU             CPU 1

==== Memory Device Sockets ================================

Type        Status Set Device Locator      Bank Locator
----------- ------ --- ------------------- ----------------
DDR3        in use 1   M1                  Bank 0
DDR3        in use 1   M2                  Bank 1
DDR3        empty  1   M3                  Bank 2
DDR3        empty  1   M4                  Bank 3

==== On-Board Devices =====================================
IGD
GigaLAN

==== Upgradeable Slots ====================================

ID  Status    Type             Description
--- --------- ---------------- ----------------------------
6   available PCI Express x16  PEG Slot J5C1 for *Field processor
6   available PCI Express x16  PEG Slot J5C1 for *dale processor
7   in use    PCI Express x1   PCI Express Slot J6C2, Qualcomm Atheros AR8131 Gigabit Ethernet (atge)
8   in use    PCI Express x1   PCI Express Slot J6D2, Broadcom Limited BCM4313 802.11bgn Wireless Network Adapter (<unknown>)
9   available PCI Express x1   PCI Express Slot J7C1
9   available PCI Express x1   PCI Express Slot J7D2
11  available PCI Express x1   PCI Express Slot J6C1
12  available PCI Express x16  PCI Express Slot J8C2

Actions #2

Updated by Denis Kozadaev about 4 years ago

The hardware is an Intel Core i3:
Manufacturer: LENOVO
Version: Lenovo B560
Family: IDEAPAD

Actions #3

Updated by Robert Mustacchi about 4 years ago

I believe the issue is related to 11612, not this. What'll be useful is to understand what PCI devices weren't being correctly enumerated before this and now are.

Actions #4

Updated by Denis Kozadaev about 4 years ago

before:

[1]> pci_bios_maxbus/D
pci_bios_maxbus:
pci_bios_maxbus:15
[3]> 0::rdpcicfg 0xff 0t19 0
ffffffff

after:

[1]> pci_bios_maxbus/D
pci_bios_maxbus:
pci_bios_maxbus:255
[3]> 0::rdpcicfg 0xff 0t19 0
ffffffff
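A note on reading the output above: assuming `0::rdpcicfg 0xff 0t19 0` reads the 32-bit register at offset 0 of bus 0xff, device 19, function 0 (the argument order is an assumption here), the value `ffffffff` in both runs is what PCI configuration reads conventionally return when no function responds at that address. The register at offset 0 holds the vendor ID in its low 16 bits and the device ID in its high 16 bits, and a vendor ID of 0xffff is never assigned. A minimal sketch of that check, with hypothetical names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define	PCI_VID_INVALID	0xffff	/* vendor ID read back from an empty slot */

/*
 * Split the 32-bit configuration register at offset 0 into its vendor
 * and device IDs. Returns true if a function appears to be present.
 * The function name is illustrative, not an existing kernel routine.
 */
static bool
pci_id_decode(uint32_t reg, uint16_t *vidp, uint16_t *didp)
{
	assert(vidp != NULL && didp != NULL);

	*vidp = (uint16_t)(reg & 0xffff);	/* bits 15:0  */
	*didp = (uint16_t)(reg >> 16);		/* bits 31:16 */

	return (*vidp != PCI_VID_INVALID);
}
```

Under that reading, nothing answers at that particular address either before or after the change; the visible difference between the two runs is the value of `pci_bios_maxbus` (15 versus 255).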

Actions #5

Updated by Denis Kozadaev about 4 years ago

The core files are too big to attach here (even the compressed versions are larger than 4 MB).
I placed prtdiag.core and diskinfo.core here:
http://witch.tambov.ru/~denis/lenovo/diskinfo.core
http://witch.tambov.ru/~denis/lenovo/prtdiag.core

and the compressed versions:
http://witch.tambov.ru/~denis/lenovo/diskinfo.core.bz2
http://witch.tambov.ru/~denis/lenovo/prtdiag.core.bz2

root@lenovo:~# mdb prtdiag.core
Loading modules: [ libc.so.1 libtopo.so.1 libumem.so.1 libnvpair.so.1 libuutil.so.1 libavl.so.1 ld.so.1 ]
> $C
fffffc7fffdfe720 libc.so.1`_lwp_kill+0xa()
fffffc7fffdfe750 libc.so.1`raise+0x1e(6)
fffffc7fffdfe7a0 libc.so.1`abort+0x88()
fffffc7fffdfea10 0xfffffc7fef21c249()
fffffc7fffdfea60 hostbridge.so`dev_for_hostbridge+0xef(5d0280, 581a88)
fffffc7fffdfeaf0 hostbridge.so`DEVprop_set+0x13b(65ddd0, 65de70, 0, fffffc7feb91c9da, fffffc7feb91c813)
fffffc7fffdfeb70 hostbridge.so`did_props_set+0x84(65ddd0, 65de70, fffffc7feb92dc60, 7)
fffffc7fffdfebd0 hostbridge.so`pcihostbridge_declare+0x87(5d0280, 57e3e0, 5a1a08, 2)
fffffc7fffdfec40 hostbridge.so`hb_process+0x69(5d0280, 57e3e0, 2, 5a1a08)
fffffc7fffdfeca0 hostbridge.so`pci_hostbridges_find+0x6c(5d0280, 57e3e0)
fffffc7fffdfece0 hostbridge.so`platform_hb_enum+0x13(5d0280, 57e3e0, 5cb7e0, 0, fe)
fffffc7fffdfed60 hostbridge.so`hb_enum+0x90(5d0280, 57e3e0, 5cb7e0, 0, fe, 0)
fffffc7fffdfedf0 libtopo.so.1`topo_mod_enumerate+0xc8(5d0280, 57e3e0, 466260, 5cb7e0, 0, fe)
fffffc7fffdfee40 libtopo.so.1`enum_run+0x94(57e480, 656f60)
fffffc7fffdfeeb0 libtopo.so.1`topo_xml_range_process+0x103(57e480, 468340, 656f60)
fffffc7fffdfef30 libtopo.so.1`tf_rdata_new+0x118(57e480, 586e60, 468340, 57e3e0)
fffffc7fffdfefc0 libtopo.so.1`topo_xml_walk+0x10d(57e480, 586e60, 466f60, 57e3e0)
fffffc7fffdff050 libtopo.so.1`dependent_create+0x15c(57e480, 586e60, 586bc0, 466f60, 57e3e0)
fffffc7fffdff0d0 libtopo.so.1`dependents_create+0x86(57e480, 586e60, 586bc0, 4611d0, 57e3e0)
fffffc7fffdff1f0 libtopo.so.1`pad_process+0x9a(57e480, 589f60, 4611d0, 57e3e0, 589fa8)
fffffc7fffdff260 libtopo.so.1`topo_xml_range_process+0x25b(57e480, 4611d0, 589f60)
fffffc7fffdff2e0 libtopo.so.1`tf_rdata_new+0x118(57e480, 586e60, 4611d0, 57e520)
fffffc7fffdff370 libtopo.so.1`topo_xml_walk+0x10d(57e480, 586e60, 45ba60, 57e520)
fffffc7fffdff3c0 libtopo.so.1`topo_xml_enum+0x5f(57e480, 586e60, 57e520)
fffffc7fffdff540 libtopo.so.1`topo_file_load+0xe6(57e480, 57e520, fffffc7fec571de7, fffffc7fec571dc1, 0)
fffffc7fffdff570 libtopo.so.1`topo_mod_enummap+0x10(57e480, 57e520, fffffc7fec571de7, fffffc7fec571dc1)
fffffc7fffdff5b0 x86pi.so`x86pi_enum_start+0x1f2(57e480, fffffc7fffdff5c0)
fffffc7fffdff630 x86pi.so`x86pi_enum+0x6f(57e480, 57e520, 581e90, 0, 0, 0)
fffffc7fffdff6c0 libtopo.so.1`topo_mod_enumerate+0xc8(57e480, 57e520, 417850, 581e90, 0, 0)
fffffc7fffdff710 libtopo.so.1`enum_run+0x94(57e5c0, 5780f0)
fffffc7fffdff780 libtopo.so.1`topo_xml_range_process+0x103(57e5c0, 58eeb0, 5780f0)
fffffc7fffdff800 libtopo.so.1`tf_rdata_new+0x118(57e5c0, 586ee0, 58eeb0, 57e520)
fffffc7fffdff890 libtopo.so.1`topo_xml_walk+0x10d(57e5c0, 586ee0, 58eae0, 57e520)
fffffc7fffdff8e0 libtopo.so.1`topo_xml_enum+0x5f(57e5c0, 586ee0, 57e520)
fffffc7fffdffa60 libtopo.so.1`topo_file_load+0xe6(57e5c0, 57e520, 44afb8, 581ea0, 0)
fffffc7fffdffa90 libtopo.so.1`topo_tree_enum+0x84(446f00, 56d300)
fffffc7fffdffad0 libtopo.so.1`topo_tree_enum_all+0x33(446f00)
fffffc7fffdffb30 libtopo.so.1`topo_snap_create+0xb2(446f00, fffffc7fffdffb9c, 0)
fffffc7fffdffb80 libtopo.so.1`topo_snap_hold+0xba(446f00, 0, fffffc7fffdffb9c)
fffffc7fffdffcb0 do_prominfo+0x18d(0, fffffc7fffdfff18, 0, 1)
fffffc7fffdffd00 main+0xde(1, fffffc7fffdffd58)
fffffc7fffdffd30 _start_crt+0x83()
fffffc7fffdffd40 _start+0x18()
root@lenovo:~# mdb diskinfo.core 
Loading modules: [ libc.so.1 libsysevent.so.1 libnvpair.so.1 libtopo.so.1 libumem.so.1 libuutil.so.1 libavl.so.1 ld.so.1 ]
> $C
fffffc7fffdfe300 libc.so.1`_lwp_kill+0xa()
fffffc7fffdfe330 libc.so.1`raise+0x1e(6)
fffffc7fffdfe380 libc.so.1`abort+0x88()
fffffc7fffdfe5f0 0xfffffc7fef21c249()
fffffc7fffdfe640 hostbridge.so`dev_for_hostbridge+0xef(5d7280, 588a88)
fffffc7fffdfe6d0 hostbridge.so`DEVprop_set+0x13b(664dd0, 664e70, 0, fffffc7feb91c9da, fffffc7feb91c813)
fffffc7fffdfe750 hostbridge.so`did_props_set+0x84(664dd0, 664e70, fffffc7feb92dc60, 7)
fffffc7fffdfe7b0 hostbridge.so`pcihostbridge_declare+0x87(5d7280, 5853e0, 5a8a08, 2)
fffffc7fffdfe820 hostbridge.so`hb_process+0x69(5d7280, 5853e0, 2, 5a8a08)
fffffc7fffdfe880 hostbridge.so`pci_hostbridges_find+0x6c(5d7280, 5853e0)
fffffc7fffdfe8c0 hostbridge.so`platform_hb_enum+0x13(5d7280, 5853e0, 5d27e0, 0, fe)
fffffc7fffdfe940 hostbridge.so`hb_enum+0x90(5d7280, 5853e0, 5d27e0, 0, fe, 0)
fffffc7fffdfe9d0 libtopo.so.1`topo_mod_enumerate+0xc8(5d7280, 5853e0, 450ff0, 5d27e0, 0, fe)
fffffc7fffdfea20 libtopo.so.1`enum_run+0x94(585480, 65df60)
fffffc7fffdfea90 libtopo.so.1`topo_xml_range_process+0x103(585480, 42fe50, 65df60)
fffffc7fffdfeb10 libtopo.so.1`tf_rdata_new+0x118(585480, 58de60, 42fe50, 5853e0)
fffffc7fffdfeba0 libtopo.so.1`topo_xml_walk+0x10d(585480, 58de60, 42ea70, 5853e0)
fffffc7fffdfec30 libtopo.so.1`dependent_create+0x15c(585480, 58de60, 58dbc0, 42ea70, 5853e0)
fffffc7fffdfecb0 libtopo.so.1`dependents_create+0x86(585480, 58de60, 58dbc0, 429cf0, 5853e0)
fffffc7fffdfedd0 libtopo.so.1`pad_process+0x9a(585480, 590f60, 429cf0, 5853e0, 590fa8)
fffffc7fffdfee40 libtopo.so.1`topo_xml_range_process+0x25b(585480, 429cf0, 590f60)
fffffc7fffdfeec0 libtopo.so.1`tf_rdata_new+0x118(585480, 58de60, 429cf0, 585520)
fffffc7fffdfef50 libtopo.so.1`topo_xml_walk+0x10d(585480, 58de60, 4241b0, 585520)
fffffc7fffdfefa0 libtopo.so.1`topo_xml_enum+0x5f(585480, 58de60, 585520)
fffffc7fffdff120 libtopo.so.1`topo_file_load+0xe6(585480, 585520, fffffc7fec571de7, fffffc7fec571dc1, 0)
fffffc7fffdff150 libtopo.so.1`topo_mod_enummap+0x10(585480, 585520, fffffc7fec571de7, fffffc7fec571dc1)
fffffc7fffdff190 x86pi.so`x86pi_enum_start+0x1f2(585480, fffffc7fffdff1a0)
fffffc7fffdff210 x86pi.so`x86pi_enum+0x6f(585480, 585520, 588e90, 0, 0, 0)
fffffc7fffdff2a0 libtopo.so.1`topo_mod_enumerate+0xc8(585480, 585520, 450e50, 588e90, 0, 0)
fffffc7fffdff2f0 libtopo.so.1`enum_run+0x94(5855c0, 57f0f0)
fffffc7fffdff360 libtopo.so.1`topo_xml_range_process+0x103(5855c0, 595ec0, 57f0f0)
fffffc7fffdff3e0 libtopo.so.1`tf_rdata_new+0x118(5855c0, 58dee0, 595ec0, 585520)
fffffc7fffdff470 libtopo.so.1`topo_xml_walk+0x10d(5855c0, 58dee0, 4254d0, 585520)
fffffc7fffdff4c0 libtopo.so.1`topo_xml_enum+0x5f(5855c0, 58dee0, 585520)
fffffc7fffdff640 libtopo.so.1`topo_file_load+0xe6(5855c0, 585520, 485fb8, 588ea0, 0)
fffffc7fffdff670 libtopo.so.1`topo_tree_enum+0x84(481f00, 574300)
fffffc7fffdff6b0 libtopo.so.1`topo_tree_enum_all+0x33(481f00)
fffffc7fffdff710 libtopo.so.1`topo_snap_create+0xb2(481f00, fffffc7fffdff7d0, 0)
fffffc7fffdff760 libtopo.so.1`topo_snap_hold+0xba(481f00, 0, fffffc7fffdff7d0)
fffffc7fffdffcc0 enumerate_disks+0xb9(fffffc7fffdffcd0)
fffffc7fffdffd00 main+0xf7(1, fffffc7fffdffd58)
fffffc7fffdffd30 _start_crt+0x83()
fffffc7fffdffd40 _start+0x18()
Actions #6

Updated by Electric Monk over 3 years ago

  • Status changed from New to Closed

git commit eb00b1c8a31c2253a353644606388dff5b0e0275

commit  eb00b1c8a31c2253a353644606388dff5b0e0275
Author: Robert Mustacchi <rm@joyent.com>
Date:   2020-03-24T22:27:39.000Z

    11609 Want modern Intel IMC driver
    11612 x86 PCI enumeration should not rely on bios max bus
    Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
    Reviewed by: Rob Johnston <rob.johnston@joyent.com>
    Approved by: Gordon Ross <gordon.w.ross@gmail.com>

Actions #7

Updated by Robert Mustacchi over 3 years ago

  • Related to Feature #12444: Intel v1 chip topo needs rank information added