Feature #11959

extend disk topo plugin to enumerate nvme devices

Added by Rob Johnston 5 months ago. Updated 29 days ago.

Status: Closed
Priority: Normal
Assignee:
Category: -
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:

Description

The disk topo plugin can be invoked under a "bay" topo node in order to enumerate the disk occupant, if one is present. The bay nodes contain a set of properties which provide hints to the disk plugin for how to discover if a disk is present. Today this mechanism works for the following HW configurations:

  • direct-attached SATA drives
  • direct-attached SAS drives
  • SAS drives behind a SAS expander that supports SCSI Enclosure Services (SES)

For the direct-attached cases, a topo map file must be provided, which statically defines the bay nodes with the hint properties required by the disk topo plugin.
For the "SAS drives behind a SAS expander that supports SES" case, the ses topo plugin dynamically enumerates the bay nodes.

This ticket is to cover the work to extend the disk topo plugin to enumerate direct-attached NVMe devices.

The approach is that each NVMe device will be represented by a single child "nvme" node. The "nvme" node will in turn have N child "disk" nodes, where N is the number of namespaces configured on the NVMe device.
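The node layout described above can be sketched roughly as follows. This is only an illustration in Python, not the plugin's C implementation: `TopoNode`, `build_nvme_node`, and `paths` are hypothetical names, and numbering the "disk" instances from 0 up to N-1 is an assumption.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class TopoNode:
    """Hypothetical stand-in for a topo node: a name, an instance, children."""
    name: str
    instance: int
    children: List["TopoNode"] = field(default_factory=list)

    def paths(self, prefix: str = "") -> Iterator[str]:
        """Yield the hc-style path of this node, then of each descendant."""
        me = f"{prefix}/{self.name}={self.instance}"
        yield me
        for child in self.children:
            yield from child.paths(me)


def build_nvme_node(instance: int, namespace_count: int) -> TopoNode:
    """One "nvme" node with one child "disk" node per configured namespace.

    Assumption: disk instances are numbered 0..namespace_count-1.
    """
    nvme = TopoNode("nvme", instance)
    for i in range(namespace_count):
        nvme.children.append(TopoNode("disk", i))
    return nvme


# A bay holding one NVMe device with a single namespace.
bay = TopoNode("bay", 20, [build_nvme_node(0, 1)])
for path in bay.paths("/ses-enclosure=0"):
    print(path)
```

For a device with a single namespace, the last path printed is `/ses-enclosure=0/bay=20/nvme=0/disk=0`; a device with more namespaces would simply gain more "disk" children under the same "nvme" node.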

History

#1

Updated by Rob Johnston 5 months ago

  • Assignee set to Rob Johnston

#2

Updated by Rob Johnston about 1 month ago

Testing

I onu'd these changes onto a Supermicro rig in which I had installed 4 U.2 NVMe devices (all different drive models) and an Intel PCIe add-in NVMe card. Then I compared the details in the topology enumerated by libtopo with the details reported by nvmeadm(1m). I physically verified that the FRU location reported by libtopo for the NVMe devices was correct.
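As a rough illustration of that comparison, the capacity check can be sketched in Python. The figures are copied from the fmtopo and nvmeadm output captured in this comment; the helper names are hypothetical, and treating nvmeadm's "MB" as 2^20 bytes, truncated to a whole number, is an assumption inferred from those figures rather than from the nvmeadm source.

```python
MIB = 1024 * 1024  # assumption: nvmeadm's "MB" means 2**20 bytes

# logical-disk -> capacity-in-bytes, from the libtopo "storage" property group
TOPO_CAPACITY_BYTES = {
    "c12t1d0": 400088457216,
    "c5t1d0": 1600321314816,
    "c6t1d0": 1600321314816,
    "c7t1d0": 2000398934016,
    "c8t1d0": 960197124096,
}

# logical-disk -> "Size" in MB, from `nvmeadm list`
NVMEADM_SIZE_MB = {
    "c12t1d0": 381554,
    "c5t1d0": 1526185,
    "c6t1d0": 1526185,
    "c7t1d0": 1907729,
    "c8t1d0": 915715,
}


def bytes_to_whole_mib(nbytes: int) -> int:
    """Convert a byte count to whole MiB, discarding any remainder."""
    return nbytes // MIB


def mismatches(topo: dict, nvmeadm: dict) -> list:
    """Return the disks whose libtopo capacity disagrees with nvmeadm."""
    return sorted(
        disk for disk, nbytes in topo.items()
        if bytes_to_whole_mib(nbytes) != nvmeadm.get(disk)
    )


print(mismatches(TOPO_CAPACITY_BYTES, NVMEADM_SIZE_MB))  # prints []
```

An empty list means every namespace's capacity-in-bytes agrees with the size nvmeadm reports for the same logical disk.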

Finally I ran fmtopo with UMEM_DEBUG enabled and verified that there were no aborts and no memory leaks introduced by these changes.

Output is captured below:

# /usr/lib/fm/fmd/fmtopo -V "*nvme=*"
TIME                 UUID
Mar 06 22:20:44 5028bdf6-bc30-c44a-b5b7-e3562cb16982

hc://:product-id=SYS-2028U-E1CNRT+:server-id=nvme:chassis-id=S180455X6A38661:serial=CVMD53950044400AGN:part=INTEL-SSDPEDME400G4:revision=8DV10171/motherboard=0/hostbridge=13/pciexrc=13/pciexbus=133/pciexdev=0/pciexfn=0/nvme=0
  group: protocol                       version: 1   stability: Private/Private
    resource          fmri      hc://:product-id=SYS-2028U-E1CNRT+:server-id=nvme:chassis-id=S180455X6A38661:serial=CVMD53950044400AGN:part=INTEL-SSDPEDME400G4:revision=8DV10171/motherboard=0/hostbridge=13/pciexrc=13/pciexbus=133/pciexdev=0/pciexfn=0/nvme=0
    FRU               fmri      hc://:product-id=SYS-2028U-E1CNRT+:server-id=nvme:chassis-id=S180455X6A38661/motherboard=0/hostbridge=13/pciexrc=13/pciexbus=133/pciexdev=0
    label             string    RSC-R2UW-4E8 SLOT3 PCI-E X8
  group: authority                      version: 1   stability: Private/Private
    product-id        string    SYS-2028U-E1CNRT+
    chassis-id        string    S180455X6A38661
    server-id         string    nvme
  group: nvme-properties                version: 1   stability: Private/Private
    nvme-version      string    1.0
  group: io                             version: 1   stability: Private/Private
    devfs-path        string    /pci@76,0/pci8086,6f08@3/pci8086,3709@0:devctl

hc://:product-id=SYS-2028U-E1CNRT+:server-id=nvme:chassis-id=S180455X6A38661:serial=CVMD53950044400AGN:part=SSDPEDME400G4:revision=8DV10171/motherboard=0/hostbridge=13/pciexrc=13/pciexbus=133/pciexdev=0/pciexfn=0/nvme=0/disk=0
  group: protocol                       version: 1   stability: Private/Private
    resource          fmri      hc://:product-id=SYS-2028U-E1CNRT+:server-id=nvme:chassis-id=S180455X6A38661:serial=CVMD53950044400AGN:part=SSDPEDME400G4:revision=8DV10171/motherboard=0/hostbridge=13/pciexrc=13/pciexbus=133/pciexdev=0/pciexfn=0/nvme=0/disk=0
    FRU               fmri      hc://:product-id=SYS-2028U-E1CNRT+:server-id=nvme:chassis-id=S180455X6A38661/motherboard=0/hostbridge=13/pciexrc=13/pciexbus=133/pciexdev=0
  group: authority                      version: 1   stability: Private/Private
    product-id        string    SYS-2028U-E1CNRT+
    chassis-id        string    S180455X6A38661
    server-id         string    nvme
  group: system                         version: 1   stability: Private/Private
    isa               string    i386
    machine           string    i86pc
  group: io                             version: 1   stability: Private/Private
    devfs-path        string    /pci@76,0/pci8086,6f08@3/pci8086,3709@0/blkdev@1,0
    phys-path         string[]  [ "/pci@76,0/pci8086,6f08@3/pci8086,3709@0/blkdev@1,0" ]
    devid             string    id1,kdev@E8086-INTEL_SSDPEDME400G4_____________________-CVMD53950044400AGN__-1
  group: storage                        version: 1   stability: Private/Private
    manufacturer      string    INTEL
    capacity-in-bytes string    400088457216
    serial-number     string    CVMD53950044400AGN
    model             string    SSDPEDME400G4
    firmware-revision string    8DV10171
    logical-disk      string    c12t1d0

hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=PHLE724300E11P6CGN:part=INTEL-SSDPE2KE016T7:revision=QDV10130/ses-enclosure=0/bay=20/nvme=0
  group: protocol                       version: 1   stability: Private/Private
    resource          fmri      hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=PHLE724300E11P6CGN:part=INTEL-SSDPE2KE016T7:revision=QDV10130/ses-enclosure=0/bay=20/nvme=0
    FRU               fmri      hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=PHLE724300E11P6CGN:part=INTEL-SSDPE2KE016T7:revision=QDV10130/ses-enclosure=0/bay=20/nvme=0
    label             string    Slot20
  group: authority                      version: 1   stability: Private/Private
    product-id        string    LSI-SAS3x40
    chassis-id        string    500304801ee3123f
    server-id         string    
  group: nvme-properties                version: 1   stability: Private/Private
    nvme-version      string    1.2
  group: io                             version: 1   stability: Private/Private
    devfs-path        string    /pci@0,0/pci8086,6f08@3/pci8086,4712@0:devctl

hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=PHLE724300E11P6CGN:part=SSDPE2KE016T7:revision=QDV10130/ses-enclosure=0/bay=20/nvme=0/disk=0
  group: protocol                       version: 1   stability: Private/Private
    resource          fmri      hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=PHLE724300E11P6CGN:part=SSDPE2KE016T7:revision=QDV10130/ses-enclosure=0/bay=20/nvme=0/disk=0
    FRU               fmri      hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=PHLE724300E11P6CGN:part=INTEL-SSDPE2KE016T7:revision=QDV10130/ses-enclosure=0/bay=20/nvme=0
  group: authority                      version: 1   stability: Private/Private
    product-id        string    LSI-SAS3x40
    chassis-id        string    500304801ee3123f
    server-id         string    
  group: system                         version: 1   stability: Private/Private
    isa               string    i386
    machine           string    i86pc
  group: io                             version: 1   stability: Private/Private
    devfs-path        string    /pci@0,0/pci8086,6f08@3/pci8086,4712@0/blkdev@1,0
    phys-path         string[]  [ "/pci@0,0/pci8086,6f08@3/pci8086,4712@0/blkdev@1,0" ]
    devid             string    id1,kdev@E8086-INTEL_SSDPE2KE016T7_____________________-PHLE724300E11P6CGN__-1
  group: storage                        version: 1   stability: Private/Private
    manufacturer      string    INTEL
    capacity-in-bytes string    1600321314816
    serial-number     string    PHLE724300E11P6CGN
    model             string    SSDPE2KE016T7
    firmware-revision string    QDV10130
    logical-disk      string    c5t1d0

hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=PHLE7244006R1P6CGN:part=INTEL-SSDPE2KE016T7:revision=QDV10130/ses-enclosure=0/bay=21/nvme=0
  group: protocol                       version: 1   stability: Private/Private
    resource          fmri      hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=PHLE7244006R1P6CGN:part=INTEL-SSDPE2KE016T7:revision=QDV10130/ses-enclosure=0/bay=21/nvme=0
    FRU               fmri      hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=PHLE7244006R1P6CGN:part=INTEL-SSDPE2KE016T7:revision=QDV10130/ses-enclosure=0/bay=21/nvme=0
    label             string    Slot21
  group: authority                      version: 1   stability: Private/Private
    product-id        string    LSI-SAS3x40
    chassis-id        string    500304801ee3123f
    server-id         string    
  group: nvme-properties                version: 1   stability: Private/Private
    nvme-version      string    1.2
  group: io                             version: 1   stability: Private/Private
    devfs-path        string    /pci@0,0/pci8086,6f09@3,1/pci8086,4712@0:devctl

hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=PHLE7244006R1P6CGN:part=SSDPE2KE016T7:revision=QDV10130/ses-enclosure=0/bay=21/nvme=0/disk=0
  group: protocol                       version: 1   stability: Private/Private
    resource          fmri      hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=PHLE7244006R1P6CGN:part=SSDPE2KE016T7:revision=QDV10130/ses-enclosure=0/bay=21/nvme=0/disk=0
    FRU               fmri      hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=PHLE7244006R1P6CGN:part=INTEL-SSDPE2KE016T7:revision=QDV10130/ses-enclosure=0/bay=21/nvme=0
  group: authority                      version: 1   stability: Private/Private
    product-id        string    LSI-SAS3x40
    chassis-id        string    500304801ee3123f
    server-id         string    
  group: system                         version: 1   stability: Private/Private
    isa               string    i386
    machine           string    i86pc
  group: io                             version: 1   stability: Private/Private
    devfs-path        string    /pci@0,0/pci8086,6f09@3,1/pci8086,4712@0/blkdev@1,0
    phys-path         string[]  [ "/pci@0,0/pci8086,6f09@3,1/pci8086,4712@0/blkdev@1,0" ]
    devid             string    id1,kdev@E8086-INTEL_SSDPE2KE016T7_____________________-PHLE7244006R1P6CGN__-1
  group: storage                        version: 1   stability: Private/Private
    manufacturer      string    INTEL
    capacity-in-bytes string    1600321314816
    serial-number     string    PHLE7244006R1P6CGN
    model             string    SSDPE2KE016T7
    firmware-revision string    QDV10130
    logical-disk      string    c6t1d0

hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=PHFT616200452P0KGN:part=INTEL-SSDPE2MD020T4:revision=8DV10171/ses-enclosure=0/bay=22/nvme=0
  group: protocol                       version: 1   stability: Private/Private
    resource          fmri      hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=PHFT616200452P0KGN:part=INTEL-SSDPE2MD020T4:revision=8DV10171/ses-enclosure=0/bay=22/nvme=0
    FRU               fmri      hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=PHFT616200452P0KGN:part=INTEL-SSDPE2MD020T4:revision=8DV10171/ses-enclosure=0/bay=22/nvme=0
    label             string    Slot22
  group: authority                      version: 1   stability: Private/Private
    product-id        string    LSI-SAS3x40
    chassis-id        string    500304801ee3123f
    server-id         string    
  group: nvme-properties                version: 1   stability: Private/Private
    nvme-version      string    1.0
  group: io                             version: 1   stability: Private/Private
    devfs-path        string    /pci@0,0/pci8086,6f0a@3,2/pci8086,3703@0:devctl

hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=PHFT616200452P0KGN:part=SSDPE2MD020T4:revision=8DV10171/ses-enclosure=0/bay=22/nvme=0/disk=0
  group: protocol                       version: 1   stability: Private/Private
    resource          fmri      hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=PHFT616200452P0KGN:part=SSDPE2MD020T4:revision=8DV10171/ses-enclosure=0/bay=22/nvme=0/disk=0
    FRU               fmri      hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=PHFT616200452P0KGN:part=INTEL-SSDPE2MD020T4:revision=8DV10171/ses-enclosure=0/bay=22/nvme=0
  group: authority                      version: 1   stability: Private/Private
    product-id        string    LSI-SAS3x40
    chassis-id        string    500304801ee3123f
    server-id         string    
  group: system                         version: 1   stability: Private/Private
    isa               string    i386
    machine           string    i86pc
  group: io                             version: 1   stability: Private/Private
    devfs-path        string    /pci@0,0/pci8086,6f0a@3,2/pci8086,3703@0/blkdev@1,0
    phys-path         string[]  [ "/pci@0,0/pci8086,6f0a@3,2/pci8086,3703@0/blkdev@1,0" ]
    devid             string    id1,kdev@E8086-INTEL_SSDPE2MD020T4_____________________-PHFT616200452P0KGN__-1
  group: storage                        version: 1   stability: Private/Private
    manufacturer      string    INTEL
    capacity-in-bytes string    2000398934016
    serial-number     string    PHFT616200452P0KGN
    model             string    SSDPE2MD020T4
    firmware-revision string    8DV10171
    logical-disk      string    c7t1d0

hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=S35XNX0J300050:part=SAMSUNG-MZQLW960HMJP-00003:revision=CXV8301Q/ses-enclosure=0/bay=23/nvme=0
  group: protocol                       version: 1   stability: Private/Private
    resource          fmri      hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=S35XNX0J300050:part=SAMSUNG-MZQLW960HMJP-00003:revision=CXV8301Q/ses-enclosure=0/bay=23/nvme=0
    FRU               fmri      hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=S35XNX0J300050:part=SAMSUNG-MZQLW960HMJP-00003:revision=CXV8301Q/ses-enclosure=0/bay=23/nvme=0
    label             string    Slot23
  group: authority                      version: 1   stability: Private/Private
    product-id        string    LSI-SAS3x40
    chassis-id        string    500304801ee3123f
    server-id         string    
  group: nvme-properties                version: 1   stability: Private/Private
    nvme-version      string    1.2
  group: io                             version: 1   stability: Private/Private
    devfs-path        string    /pci@0,0/pci8086,6f0b@3,3/pci144d,a801@0:devctl

hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=S35XNX0J300050:part=MZQLW960HMJP-00003:revision=CXV8301Q/ses-enclosure=0/bay=23/nvme=0/disk=0
  group: protocol                       version: 1   stability: Private/Private
    resource          fmri      hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=S35XNX0J300050:part=MZQLW960HMJP-00003:revision=CXV8301Q/ses-enclosure=0/bay=23/nvme=0/disk=0
    FRU               fmri      hc://:product-id=LSI-SAS3x40:server-id=:chassis-id=500304801ee3123f:serial=S35XNX0J300050:part=SAMSUNG-MZQLW960HMJP-00003:revision=CXV8301Q/ses-enclosure=0/bay=23/nvme=0
  group: authority                      version: 1   stability: Private/Private
    product-id        string    LSI-SAS3x40
    chassis-id        string    500304801ee3123f
    server-id         string    
  group: system                         version: 1   stability: Private/Private
    isa               string    i386
    machine           string    i86pc
  group: io                             version: 1   stability: Private/Private
    devfs-path        string    /pci@0,0/pci8086,6f0b@3,3/pci144d,a801@0/blkdev@1,0
    phys-path         string[]  [ "/pci@0,0/pci8086,6f0b@3,3/pci144d,a801@0/blkdev@1,0" ]
    devid             string    id1,kdev@E144D-SAMSUNG_MZQLW960HMJP-00003______________-S35XNX0J300050______-1
  group: storage                        version: 1   stability: Private/Private
    manufacturer      string    SAMSUNG
    capacity-in-bytes string    960197124096
    serial-number     string    S35XNX0J300050
    model             string    MZQLW960HMJP-00003
    firmware-revision string    CXV8301Q
    logical-disk      string    c8t1d0

# nvmeadm list
nvme0: model: INTEL SSDPE2KE016T7, serial: PHLE724300E11P6CGN, FW rev: QDV10130, NVMe v1.2
  nvme0/1 (c5t1d0): Size = 1526185 MB, Capacity = 1526185 MB, Used = 1526185 MB
nvme1: model: INTEL SSDPE2KE016T7, serial: PHLE7244006R1P6CGN, FW rev: QDV10130, NVMe v1.2
  nvme1/1 (c6t1d0): Size = 1526185 MB, Capacity = 1526185 MB, Used = 1526185 MB
nvme2: model: INTEL SSDPE2MD020T4, serial: PHFT616200452P0KGN, FW rev: 8DV10171, NVMe v1.0
  nvme2/1 (c7t1d0): Size = 1907729 MB, Capacity = 1907729 MB, Used = 1907729 MB
nvme3: model: SAMSUNG MZQLW960HMJP-00003, serial: S35XNX0J300050, FW rev: CXV8301Q, NVMe v1.2
  nvme3/1 (c8t1d0): Size = 915715 MB, Capacity = 915715 MB, Used = 902859 MB
nvme4: model: INTEL SSDPEDME400G4, serial: CVMD53950044400AGN, FW rev: 8DV10171, NVMe v1.0
  nvme4/1 (c12t1d0): Size = 381554 MB, Capacity = 381554 MB, Used = 381554 MB
root@nvme:/var/tmp# diskinfo -P
# export UMEM_DEBUG=default
# /usr/lib/fm/fmd/fmtopo -V -C >/dev/null
/usr/lib/fm/fmd/fmtopo: failed to get properties for locate=0: method failed
/usr/lib/fm/fmd/fmtopo: failed to get properties for temp=0: unknown libtopo error
/usr/lib/fm/fmd/fmtopo: failed to get properties for temp=0: unknown libtopo error
Abort (core dumped)
root@nvme:/var/tmp# ls
core       topoV.out
root@nvme:/var/tmp# mdb core 
Loading modules: [ libc.so.1 libtopo.so.1 libumem.so.1 libnvpair.so.1 libuutil.so.1 libavl.so.1 ld.so.1 ]
> ::findleaks -d 
BYTES             LEAKED VMEM_SEG CALLER
4096                   2 fd7a0000 MMAP
4096                   1 fdb2a000 MMAP
------------------------------------------------------------------------
           Total       2 oversized leaks, 8192 bytes

mmap(2) leak: [fd7a0000, fd7a1000), 4096 bytes
mmap(2) leak: [fdb2a000, fdb2b000), 4096 bytes
> $q

#3

Updated by Electric Monk 29 days ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit 3c6ffbab91273559b511d95f850d7b2d9cd2a3c5

commit  3c6ffbab91273559b511d95f850d7b2d9cd2a3c5
Author: Rob Johnston <rob.johnston@joyent.com>
Date:   2020-03-11T15:01:02.000Z

    11958 need topo maps for the SMCI,SYS-2028U-E1CNRT+
    11959 extend disk topo plugin to enumerate nvme devices
    Reviewed by: Robert Mustacchi <rm@fingolfin.org>
    Approved by: Dan McDonald <danmcd@joyent.com>
