Bug #8808

nvme: Software Progress Marker feature is optional

Added by Michal Nowak 8 months ago. Updated 4 months ago.

Status: Closed
Start date: 2017-11-16
Priority: Normal
Due date:
Assignee: Yuri Pankov
% Done: 100%
Category: driver - device drivers
Target version: -
Difficulty: Bite-size
Tags:

Description

I attached the nvme driver from illumos-45d3dd981a to the emulated nvme0 device from VirtualBox 5.1.30 on a Linux host:

root@openindiana:/root# update_drv -a -i "pci80ee,4e56" nvme
update_drv -a -i "pci80ee,4e56" nvme
Nov 16 23:25:57 openindiana pseudo: pseudo-device: devinfo0
Nov 16 23:25:57 openindiana genunix: devinfo0 is /pseudo/devinfo@0
Nov 16 23:25:57 openindiana nvme: nvme0: NVMe spec version 1.2
Nov 16 23:25:57 openindiana blkdev: Block device: blkdev@1,0, blkdev0
Nov 16 23:25:57 openindiana genunix: blkdev0 is /pci@0,0/pci80ee,4e56@e/blkdev@1,0
Nov 16 23:25:57 openindiana genunix: /pci@0,0/pci80ee,4e56@e/blkdev@1,0 (blkdev0) online
root@openindiana:/root# devfsadm -i nvme
devfsadm -i nvme

nvmeadm get-features crashes the kernel with:
root@openindiana:/root# nvmeadm get-features nvme0
nvmeadm get-features nvme0
nvm
panic[cpu0]/thread=ffffff0189a99000: programming error: invalid field in cmd ffffff018d9e0840

Warning - stack not written to the dump buffer
ffffff000424aa30 genunix:dev_err+7b ()
ffffff000424aa60 nvme:nvme_check_generic_cmd_status+205 ()
ffffff000424aae0 nvme:nvme_get_features+2c7 ()
ffffff000424ab60 nvme:nvme_ioctl_get_features+c4 ()
ffffff000424ac80 nvme:nvme_ioctl+141 ()
ffffff000424acc0 genunix:cdev_ioctl+39 ()
ffffff000424ad10 specfs:spec_ioctl+60 ()
ffffff000424ada0 genunix:fop_ioctl+55 ()
ffffff000424aec0 genunix:ioctl+9b ()
ffffff000424af10 unix:brand_sys_sysenter+1c9 ()

skipping system dump - no dump device configured
e0: Get Featuresrebooti
ng...

Installation of OpenIndiana Minimal to nvme0 succeeds if I don't run nvmeadm get-features. (OI won't boot from it, though; I guess that is caused by the lack of UEFI support in illumos, which I guess is needed to boot from an NVMe device.)

prtconf (230 KB) Michal Nowak, 2018-01-08 06:28 AM

History

#1 Updated by Yuri Pankov 8 months ago

  • Subject changed from VirtualBox nvme: `nvmeadm get-features`: panic[cpu0]/thread=ffffff0189a99000: programming error: invalid field in cmd ffffff018d9e0840 to nvmeadm get-features: panic: programming error: invalid field in cmd

We are seeing this with real (non-emulated) NVMe devices (not sure which exactly at the moment; I can provide the make/model if needed). Updating the synopsis a bit.

#2 Updated by Michal Nowak 8 months ago

Similar error with NVMe from QEMU 2.10.1, but right during boot:

panic[cpu0]/thread=ffffff00054d9c40: programming error: invalid opcode in cmd ffffff018b42fe40
Warning - stack not written to the dump buffer
ffffff00054d9ac0 genunix:dev_err+7b ()
ffffff00054d9af0 nvme:nvme_check_generic_cmd_status+55 ()
ffffff00054d9b60 nvme:nvme_async_event_task+1ac ()
ffffff00054d9c20 genunix:taskq_thread+2d0 ()
ffffff00054d9c30 unix:thread_start+8 ()

Is it the same issue?

#3 Updated by Toomas Soome 7 months ago

Yuri Pankov wrote:

We are seeing this with real (non-emulated) NVMe devices (not sure which exactly at the moment; I can provide the make/model if needed). Updating the synopsis a bit.

In the case of qemu, we get the panic because the qemu NVMe emulation implements only NVME_VOLATILE_WRITE_CACHE and NVME_NUMBER_OF_QUEUES, so we get NVME_CQE_SC_GEN_INV_FLD | DNR as the status code.

As this also happens with real hardware, it is quite clear that we should handle the error condition and avoid the panic...
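To illustrate the status code mentioned above, here is a minimal sketch of decoding a completion status and turning it into an error return rather than a panic. The bit layout and the status values follow the NVMe specification; the type and function names (cqe_status_t, decode_status, status_to_errno) and the errno mapping are hypothetical and are not taken from the illumos driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Values from the NVMe specification; the macro names are illustrative. */
#define	SCT_GENERIC	0x0	/* Status Code Type: generic command status */
#define	SC_GEN_SUCCESS	0x0
#define	SC_GEN_INV_OPC	0x1	/* invalid command opcode */
#define	SC_GEN_INV_FLD	0x2	/* invalid field in command */

/* Hypothetical decoded form of the completion status. */
typedef struct {
	uint8_t	sc;	/* Status Code */
	uint8_t	sct;	/* Status Code Type */
	bool	more;	/* more information is available */
	bool	dnr;	/* Do Not Retry */
} cqe_status_t;

/*
 * Decode the upper 16 bits of completion queue entry dword 3:
 * bit 0 is the phase tag, bits 8:1 SC, bits 11:9 SCT,
 * bit 14 More, bit 15 DNR.
 */
static cqe_status_t
decode_status(uint16_t raw)
{
	cqe_status_t st;

	st.sc = (raw >> 1) & 0xff;
	st.sct = (raw >> 9) & 0x7;
	st.more = (raw >> 14) & 0x1;
	st.dnr = (raw >> 15) & 0x1;
	return (st);
}

/*
 * Assumed policy: report "invalid opcode" and "invalid field" to the
 * caller as EINVAL instead of treating them as programming errors.
 */
static int
status_to_errno(cqe_status_t st)
{
	assert(st.sct <= 0x7);	/* SCT is a 3-bit field */

	if (st.sct != SCT_GENERIC)
		return (EIO);

	switch (st.sc) {
	case SC_GEN_SUCCESS:
		return (0);
	case SC_GEN_INV_OPC:
	case SC_GEN_INV_FLD:
		return (EINVAL);
	default:
		return (EIO);
	}
}
```

With this layout, a raw status of 0x8004 decodes to sct = 0, sc = 2 with DNR set, which is the "invalid field" case described above.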

#4 Updated by Michal Nowak 6 months ago

I see the same "programming error: invalid field" panic in a VM with an NVMe device under VMware Workstation 14.1.0-7370693 on a Linux host when running sudo nvmeadm get-features nvme0.

The output, likely incomplete:

nvme0: Get Features
  Arbitration
    Arbitration Burst:                      1
    Low Priority Weight:                    1
    Medium Priority Weight:                 1
    High Priority Weight:                   1
  Power Management
    Power State:                            0
  Temperature Threshold
    Temperature Threshold:                  -273C
  Error Recovery
    Time Limited Error Recovery:            no time limit
  Number of Queues
    Number of Submission Queues:            1
    Number of Completion Queues:            1
  Interrupt Coalescing
    Aggregation Threshold:                  1
    Aggregation Time:                       0us
  Interrupt Vector Configuration
    Vector 0 Coalescing Disable:            no
  Write Atomicity
    Disable Normal:                         no
  Asynchronous Event Configuration
    Available Space below threshold:        disabled

#5 Updated by Yuri Pankov 5 months ago

  • Subject changed from nvmeadm get-features: panic: programming error: invalid field in cmd to nvme: Software Progress Marker feature is optional
  • Assignee set to Yuri Pankov
  • Tags deleted (needs-triage)
  • Difficulty changed from Medium to Bite-size
  • % Done changed from 0 to 50

While making nvme not panic on receiving a bad answer in nvme_get_features(), I found that both VMware's NVMe emulation and some Seagate NVMe device (I don't know the exact model at the moment) don't support the "Software Progress Marker" feature, which is defined as optional in all NVMe specification versions, so we should treat it as such:

Feb 23 04:15:13 antares nvme: [ID 856401 kern.warning] WARNING: nvme0: GET FEATURES 128 failed with sct = 0, sc = 2
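In the warning above, feature 128 is 0x80 (Software Progress Marker), sct = 0 is the generic status type, and sc = 2 is "invalid field in command". A rough sketch of treating that combination as "feature not supported" could look like the following. This is not the code from the actual fix: the names (feat_result_t, feature_is_optional, classify_get_features) are hypothetical, and the optional list contains only the one feature named in this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values from the NVMe specification; the macro names are illustrative. */
#define	SCT_GENERIC		0x0
#define	SC_GEN_SUCCESS		0x0
#define	SC_GEN_INV_FLD		0x2
#define	FEAT_PROGRESS_MARKER	0x80	/* Software Progress Marker (128) */

/* Hypothetical outcome of a single Get Features command. */
typedef enum {
	FEAT_OK,		/* feature value was returned */
	FEAT_UNSUPPORTED,	/* optional feature, not implemented */
	FEAT_ERROR		/* anything else: report as a failure */
} feat_result_t;

/* Assumption: only the feature named in this report is listed. */
static bool
feature_is_optional(uint8_t fid)
{
	return (fid == FEAT_PROGRESS_MARKER);
}

/*
 * Classify the completion of Get Features for feature 'fid'.
 * An "invalid field" answer for an optional feature means the device
 * does not implement it, so the caller can skip it quietly.
 */
static feat_result_t
classify_get_features(uint8_t fid, uint8_t sct, uint8_t sc)
{
	assert(sct <= 0x7);	/* SCT is a 3-bit field */

	if (sct == SCT_GENERIC && sc == SC_GEN_SUCCESS)
		return (FEAT_OK);
	if (sct == SCT_GENERIC && sc == SC_GEN_INV_FLD &&
	    feature_is_optional(fid))
		return (FEAT_UNSUPPORTED);
	return (FEAT_ERROR);
}
```

Under these assumptions, the logged case (feature 128, sct = 0, sc = 2) is classified as FEAT_UNSUPPORTED, while the same status for a mandatory feature such as Arbitration (0x01) remains an error.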

#6 Updated by Electric Monk 4 months ago

  • % Done changed from 50 to 100
  • Status changed from New to Closed

git commit f313c178df05fb723db8426641b6f443f90f5f45

commit  f313c178df05fb723db8426641b6f443f90f5f45
Author: Yuri Pankov <yuri.pankov@nexenta.com>
Date:   2018-03-14T20:08:18.000Z

    8808 nvme: Software Progress Marker feature is optional
    Reviewed by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org>
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Reviewed by: Dan Fields <dan.fields@nexenta.com>
    Reviewed by: Cynthia Eastham <cynthia.eastham@nexenta.com>
    Approved by: Gordon Ross <gwr@nexenta.com>
