Bug #8945

closed

nvme panics when async events are not supported

Added by Michal Nowak about 5 years ago. Updated about 5 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: driver - device drivers
Start date: 2017-12-30
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:
External Bug:

Description

Originally reported here: https://www.illumos.org/issues/8808#note-2

openQA job here: https://openqa.oi.mnowak.cz/tests/2373

OI test build 20171229 on QEMU 2.11.0 from openSUSE 42.3 with two NVMe disks fails with:

panic[cpu0]/thread=ffffff00054d9c40: programming error: invalid opcode in cmd ffffff018b42fe40
Warning - stack not written to the dump buffer
ffffff00054d9ac0 genunix:dev_err+7b ()
ffffff00054d9af0 nvme:nvme_check_generic_cmd_status+55 ()
ffffff00054d9b60 nvme:nvme_async_event_task+1ac ()
ffffff00054d9c20 genunix:taskq_thread+2d0 ()
ffffff00054d9c30 unix:thread_start+8 ()

QEMU command line:

/usr/bin/qemu-kvm -serial file:serial0 -soundhw ac97 -vga cirrus \
    -global isa-fdc.driveA= -m 2048 -cpu host \
    -netdev user,id=qanet0 \
    -device virtio-net,netdev=qanet0,mac=52:54:00:12:34:56 \
    -device nvme,drive=hd1,serial=1 \
    -drive file=raid/l1,cache=unsafe,if=none,id=hd1,format=qcow2 \
    -device nvme,drive=hd2,serial=2 \
    -drive file=raid/l2,cache=unsafe,if=none,id=hd2,format=qcow2 \
    -drive media=cdrom,if=none,id=cd0,format=raw,file=/var/lib/openqa/pool/2/OI-hipster-minimal-20171229.iso \
    -device ide-cd,drive=cd0 \
    -boot once=d,menu=on,splash-time=5000 \
    -device usb-ehci -device usb-tablet -smp 1 -enable-kvm -no-shutdown \
    -vnc :92,share=force-shared \
    -qmp unix:qmp_socket,server,nowait \
    -monitor unix:hmp_socket,server,nowait -S \
    -monitor telnet:127.0.0.1:20022,server,nowait


Files

nvme-invalid-opcode.jpg (66.1 KB), Michal Nowak, 2017-12-30 11:29 AM

Related issues

Related to illumos gate - Bug #9846: nvme driver shouldn't panic from userland commands (Closed, Robert Mustacchi, 2018-09-18)

Actions #1

Updated by Toomas Soome about 5 years ago

The problem is that we issue ASYNC Event requests and qemu nvme does not implement those:

https://github.com/qemu/qemu/blob/v2.11.0/hw/block/nvme.c#L647

So we will get (NVME_INVALID_OPCODE | NVME_DNR) as return value and we will panic on invalid opcode.

Apparently the nvme driver will need to be updated accordingly.
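
To make the returned value concrete, here is a minimal sketch of how such a status word could be unpacked. The constant values (NVME_INVALID_OPCODE = 0x0001, NVME_DNR = 0x4000) and the field layout (status code in bits 0-7, status code type in bits 8-10, Do Not Retry in bit 14, phase bit already stripped) are assumptions based on the QEMU NVMe header and the NVMe completion status field. The struct and function names are hypothetical and are not taken from the illumos driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, mirroring the QEMU NVMe status constants. */
#define NVME_INVALID_OPCODE	0x0001
#define NVME_DNR		0x4000

/* Hypothetical container for the decoded fields. */
struct decoded_status {
	uint8_t sc;	/* status code */
	uint8_t sct;	/* status code type, 0 = generic command status */
	bool dnr;	/* Do Not Retry */
};

/* Split a 15-bit status word (phase bit removed) into its fields. */
static struct decoded_status
decode_status(uint16_t status)
{
	struct decoded_status s;

	s.sc = status & 0xff;
	s.sct = (status >> 8) & 0x7;
	s.dnr = ((status >> 14) & 0x1) != 0;
	return (s);
}
```

Under these assumptions, decoding (NVME_INVALID_OPCODE | NVME_DNR) gives a generic status type with the "invalid opcode" status code and the DNR bit set, which matches the nvme_check_generic_cmd_status frame in the panic stack above.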

Actions #2

Updated by Toomas Soome about 5 years ago

  • Subject changed from QEMU with NVMe: programming error: invalid opcode in cmd ffffff018b42fe40 to nvme panics when async events are not supported
  • Status changed from New to In Progress
  • Assignee set to Toomas Soome
  • % Done changed from 0 to 90
  • Tags deleted (needs-triage)

While the NVMe specification does require async events to be supported, some versions of the QEMU NVMe implementation do not support them. As bad as that sounds, we can cope with this situation and avoid the panic.

The fix is to detect the error response, stop posting async event requests, and issue a warning message.
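
The approach described here can be sketched roughly as follows. This is a standalone illustration, not the committed driver change: the struct names, the async_event_supported flag, the counter, and the warning text are all hypothetical, and the status values are assumed as generic status type 0 with status code 0x01 for "invalid opcode".

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumed status values: generic status type, invalid opcode. */
#define SCT_GENERIC		0x0
#define SC_INVALID_OPCODE	0x01

/* Hypothetical per-controller state. */
struct ctrl {
	bool async_event_supported;
	unsigned async_requests_posted;
};

/* Hypothetical, simplified completion entry. */
struct completion {
	unsigned sct;
	unsigned sc;
};

/*
 * Handle the completion of an async event request.
 * Returns true if a new async event request was posted.
 */
static bool
handle_async_event_completion(struct ctrl *c, const struct completion *cqe)
{
	if (cqe->sct == SCT_GENERIC && cqe->sc == SC_INVALID_OPCODE) {
		/* Device rejects the command: warn and stop, do not panic. */
		c->async_event_supported = false;
		fprintf(stderr, "WARNING: async event request not "
		    "supported, asynchronous events disabled\n");
		return (false);
	}

	/* Normal path: the event would be processed here, then reposted. */
	if (c->async_event_supported) {
		c->async_requests_posted++;
		return (true);
	}
	return (false);
}
```

The point of the flag is that the rejection is handled once: after the first "invalid opcode" completion, no further async event requests are posted to that controller.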

Actions #3

Updated by Toomas Soome about 5 years ago

Toomas Soome wrote:

The problem is that we issue ASYNC Event requests and qemu nvme does not implement those:

https://github.com/qemu/qemu/blob/v2.11.0/hw/block/nvme.c#L647

So we will get (NVME_INVALID_OPCODE | NVME_DNR) as return value and we will panic on invalid opcode.

Apparently the nvme driver will need to be updated accordingly.

Filed bugreport for qemu: https://bugs.launchpad.net/qemu/+bug/1747393

Actions #4

Updated by Electric Monk about 5 years ago

  • Status changed from In Progress to Closed
  • % Done changed from 90 to 100

git commit 081391626072035c77f552902b50d1f7c1359700

commit  081391626072035c77f552902b50d1f7c1359700
Author: Toomas Soome <tsoome@me.com>
Date:   2018-02-14T19:02:41.000Z

    8945 nvme panics when async events are not supported
    Reviewed by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org>
    Reviewed by: Yuri Pankov <yuripv@yuripv.net>
    Reviewed by: Michal Nowak <mnowak@startmail.com>
    Approved by: Richard Lowe <richlowe@richlowe.net>

Actions #5

Updated by Robert Mustacchi over 4 years ago

  • Related to Bug #9846: nvme driver shouldn't panic from userland commands added