Bug #8808

nvmeadm get-features: panic: programming error: invalid field in cmd

Added by Michal Nowak 2 months ago. Updated 13 days ago.

Status: New
Start date: 2017-11-16
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: driver - device drivers
Target version: -
Difficulty: Medium
Tags: needs-triage

Description

I attached the nvme driver from illumos-45d3dd981a to the emulated nvme0 device provided by VirtualBox 5.1.30 on a Linux host:

root@openindiana:/root# update_drv -a -i "pci80ee,4e56" nvme
update_drv -a -i "pci80ee,4e56" nvme
Nov 16 23:25:57 openindiana pseudo: pseudo-device: devinfo0
Nov 16 23:25:57 openindiana genunix: devinfo0 is /pseudo/devinfo@0
Nov 16 23:25:57 openindiana nvme: nvme0: NVMe spec version 1.2
Nov 16 23:25:57 openindiana blkdev: Block device: blkdev@1,0, blkdev0
Nov 16 23:25:57 openindiana genunix: blkdev0 is /pci@0,0/pci80ee,4e56@e/blkdev@1,0
Nov 16 23:25:57 openindiana genunix: /pci@0,0/pci80ee,4e56@e/blkdev@1,0 (blkdev0) online
root@openindiana:/root# devfsadm -i nvme
devfsadm -i nvme

nvmeadm get-features crashes the kernel with:
root@openindiana:/root# nvmeadm get-features nvme0
nvmeadm get-features nvme0
nvm
panic[cpu0]/thread=ffffff0189a99000: programming error: invalid field in cmd ffffff018d9e0840

Warning - stack not written to the dump buffer
ffffff000424aa30 genunix:dev_err+7b ()
ffffff000424aa60 nvme:nvme_check_generic_cmd_status+205 ()
ffffff000424aae0 nvme:nvme_get_features+2c7 ()
ffffff000424ab60 nvme:nvme_ioctl_get_features+c4 ()
ffffff000424ac80 nvme:nvme_ioctl+141 ()
ffffff000424acc0 genunix:cdev_ioctl+39 ()
ffffff000424ad10 specfs:spec_ioctl+60 ()
ffffff000424ada0 genunix:fop_ioctl+55 ()
ffffff000424aec0 genunix:ioctl+9b ()
ffffff000424af10 unix:brand_sys_sysenter+1c9 ()

skipping system dump - no dump device configured
e0: Get Featuresrebooti
ng...

Installation of OpenIndiana Minimal to nvme0 succeeds as long as I don't run get-features. (OI won't boot afterwards, but I guess that is caused by the lack of UEFI support in illumos, which I assume is needed to boot from an NVMe device.)

prtconf (230 KB) Michal Nowak, 2018-01-08 06:28 AM

History

#1 Updated by Yuri Pankov 2 months ago

  • Subject changed from VirtualBox nvme: `nvmeadm get-features`: panic[cpu0]/thread=ffffff0189a99000: programming error: invalid field in cmd ffffff018d9e0840 to nvmeadm get-features: panic: programming error: invalid field in cmd

We are seeing this with real (non-emulated) NVMe devices (not sure which exactly at the moment, can provide the make/model if needed), updating the synopsis a bit.

#2 Updated by Michal Nowak 2 months ago

Similar error with NVMe from QEMU 2.10.1, but right during boot:

panic[cpu0]/thread=ffffff00054d9c40: programming error: invalid opcode in cmd ffffff018b42fe40
Warning - stack not written to the dump buffer
ffffff00054d9ac0 genunix:dev_err+7b ()
ffffff00054d9af0 nvme:nvme_check_generic_cmd_status+55 ()
ffffff00054d9b60 nvme:nvme_async_event_task+1ac ()
ffffff00054d9c20 genunix:taskq_thread+2d0 ()
ffffff00054d9c30 unix:thread_start+8 ()

Is it the same issue?

#3 Updated by Toomas Soome 15 days ago

Yuri Pankov wrote:

We are seeing this with real (non-emulated) NVMe devices (not sure which exactly at the moment, can provide the make/model if needed), updating the synopsis a bit.

With QEMU we get the panic because its emulated NVMe controller implements only the NVME_VOLATILE_WRITE_CACHE and NVME_NUMBER_OF_QUEUES features, so we get NVME_CQE_SC_GEN_INV_FLD | DNR back as the status code.

Since this also happens with real hardware, it is clear that we should handle the error condition instead of panicking.
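To illustrate the idea of handling the status instead of panicking, here is a minimal standalone sketch in C. It is not the driver's code: cmd_status_t, check_generic_status(), count_supported_features() and fake_qemu_get_feature() are hypothetical names made up for this example, and the choice of errno values is an assumption. The generic status codes (0x1 invalid opcode, 0x2 invalid field) and the feature identifiers (0x06 volatile write cache, 0x07 number of queues) are the values from the NVMe specification.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Generic command status values as defined by the NVMe specification. */
#define	SC_GEN_SUCCESS	0x0
#define	SC_GEN_INV_OPC	0x1	/* Invalid Command Opcode */
#define	SC_GEN_INV_FLD	0x2	/* Invalid Field in Command */

/* Hypothetical, simplified completion status; not the driver's own type. */
typedef struct {
	uint8_t	sc;	/* status code */
	uint8_t	sct;	/* status code type, 0 = generic */
	bool	dnr;	/* Do Not Retry */
} cmd_status_t;

/*
 * Translate a generic status into an errno value rather than treating
 * "invalid field" or "invalid opcode" as a programming error.
 * The errno mapping is an assumption made for this sketch.
 */
static int
check_generic_status(cmd_status_t st)
{
	if (st.sct != 0)
		return (EIO);	/* non-generic status types not modeled here */

	switch (st.sc) {
	case SC_GEN_SUCCESS:
		return (0);
	case SC_GEN_INV_OPC:
		return (ENOTSUP);
	case SC_GEN_INV_FLD:
		return (EINVAL);
	default:
		return (EIO);
	}
}

/* Hypothetical callback that issues Get Features for one feature id. */
typedef cmd_status_t (*get_feature_fn)(uint8_t fid);

/*
 * Query each feature id, skipping features the controller rejects.
 * Returns the number of supported features, or a negative errno on
 * any other failure.
 */
static int
count_supported_features(get_feature_fn get, const uint8_t *fids, int n)
{
	int supported = 0;

	for (int i = 0; i < n; i++) {
		int err = check_generic_status(get(fids[i]));

		if (err == 0)
			supported++;
		else if (err != EINVAL && err != ENOTSUP)
			return (-err);
	}
	return (supported);
}

/*
 * Fake controller modeled on the behaviour described above: only
 * volatile write cache (0x06) and number of queues (0x07) succeed,
 * everything else returns Invalid Field with DNR set.
 */
static cmd_status_t
fake_qemu_get_feature(uint8_t fid)
{
	cmd_status_t ok = { SC_GEN_SUCCESS, 0, false };
	cmd_status_t inv = { SC_GEN_INV_FLD, 0, true };

	return ((fid == 0x06 || fid == 0x07) ? ok : inv);
}
```

With something along these lines, a caller such as the get-features ioctl path could report an unsupported feature to userland and carry on with the next one, and nvmeadm could simply omit that feature from its output.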

#4 Updated by Michal Nowak 13 days ago

I see the same "programming error: invalid field" panic on VMware Workstation 14.1.0-7370693 on a Linux host, with an NVMe device in the VM, when running sudo nvmeadm get-features nvme0.

The output below is likely incomplete:

nvme0: Get Features
  Arbitration
    Arbitration Burst:                      1
    Low Priority Weight:                    1
    Medium Priority Weight:                 1
    High Priority Weight:                   1
  Power Management
    Power State:                            0
  Temperature Threshold
    Temperature Threshold:                  -273C
  Error Recovery
    Time Limited Error Recovery:            no time limit
  Number of Queues
    Number of Submission Queues:            1
    Number of Completion Queues:            1
  Interrupt Coalescing
    Aggregation Threshold:                  1
    Aggregation Time:                       0us
  Interrupt Vector Configuration
    Vector 0 Coalescing Disable:            no
  Write Atomicity
    Disable Normal:                         no
  Asynchronous Event Configuration
    Available Space below threshold:        disabled
