Bug #8945
nvme panics when async events are not supported
Status: Closed, 100% done
Description
Originally reported here: https://www.illumos.org/issues/8808#note-2
openQA job here: https://openqa.oi.mnowak.cz/tests/2373
OI test build 20171229 on QEMU 2.11.0 from openSUSE 42.3 with two NVMe disks fails with:
panic[cpu0]/thread=ffffff00054d9c40: programming error: invalid opcode in cmd ffffff018b42fe40

Warning - stack not written to the dump buffer
ffffff00054d9ac0 genunix:dev_err+7b ()
ffffff00054d9af0 nvme:nvme_check_generic_cmd_status+55 ()
ffffff00054d9b60 nvme:nvme_async_event_task+1ac ()
ffffff00054d9c20 genunix:taskq_thread+2d0 ()
ffffff00054d9c30 unix:thread_start+8 ()
QEMU command line:
/usr/bin/qemu-kvm -serial file:serial0 -soundhw ac97 -vga cirrus -global isa-fdc.driveA= -m 2048 -cpu host -netdev user,id=qanet0 -device virtio-net,netdev=qanet0,mac=52:54:00:12:34:56 -device nvme,drive=hd1,serial=1 -drive file=raid/l1,cache=unsafe,if=none,id=hd1,format=qcow2 -device nvme,drive=hd2,serial=2 -drive file=raid/l2,cache=unsafe,if=none,id=hd2,format=qcow2 -drive media=cdrom,if=none,id=cd0,format=raw,file=/var/lib/openqa/pool/2/OI-hipster-minimal-20171229.iso -device ide-cd,drive=cd0 -boot once=d,menu=on,splash-time=5000 -device usb-ehci -device usb-tablet -smp 1 -enable-kvm -no-shutdown -vnc :92,share=force-shared -qmp unix:qmp_socket,server,nowait -monitor unix:hmp_socket,server,nowait -S -monitor telnet:127.0.0.1:20022,server,nowait
Updated by Toomas Soome about 5 years ago
The problem is that we issue ASYNC Event requests and qemu nvme does not implement those:
https://github.com/qemu/qemu/blob/v2.11.0/hw/block/nvme.c#L647
So we will get (NVME_INVALID_OPCODE | NVME_DNR) as return value and we will panic on invalid opcode.
Apparently the nvme driver will need to be updated accordingly.
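To illustrate what that return value carries, here is a minimal sketch that splits the status into its fields. The two constants are assumed to mirror QEMU's NVME_INVALID_OPCODE (0x0001) and NVME_DNR (0x4000); the sketch_* names and the struct are hypothetical and are not taken from QEMU or from the illumos nvme driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match QEMU's NVME_INVALID_OPCODE and NVME_DNR values. */
#define SKETCH_INVALID_OPCODE 0x0001
#define SKETCH_DNR            0x4000

/* Hypothetical container for the decoded status fields. */
typedef struct {
    uint8_t sc;   /* status code; 0x01 is "invalid command opcode" */
    uint8_t sct;  /* status code type; 0 is the generic command status set */
    bool    dnr;  /* "do not retry": retrying the command will not help */
} sketch_status_t;

/* Decode a 15-bit status value laid out as above (phase bit excluded). */
static sketch_status_t
sketch_decode_status(uint16_t status)
{
    sketch_status_t s;

    assert((status & 0x8000) == 0); /* bit 15 is unused in this layout */
    s.sc = (uint8_t)(status & 0xff);
    s.sct = (uint8_t)((status >> 8) & 0x7);
    s.dnr = (status & SKETCH_DNR) != 0;
    return (s);
}
```

Decoding (SKETCH_INVALID_OPCODE | SKETCH_DNR) gives a generic status type with status code 0x01 and the DNR bit set, which is the combination the driver ends up treating as a programming error.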
Updated by Toomas Soome about 5 years ago
- Subject changed from QEMU with NVMe: programming error: invalid opcode in cmd ffffff018b42fe40 to nvme panics when async events are not supported
- Status changed from New to In Progress
- Assignee set to Toomas Soome
- % Done changed from 0 to 90
- Tags deleted (needs-triage)
While the NVMe specification does require async events to be supported, some versions of the QEMU NVMe implementation do not support them. Bad as that sounds, we can cope with this situation and avoid the panic.
The fix is to detect the error response, stop posting async event requests, and issue a warning message.
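The approach can be sketched roughly as follows. This is a simplified standalone model, not the code from the actual commit: the sketch_* names, the struct fields, and the use of fprintf for the warning are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_SCT_GENERIC       0x0   /* generic command status type */
#define SKETCH_SC_INVALID_OPCODE 0x01  /* invalid command opcode */

/* Hypothetical per-controller state. */
typedef struct {
    bool     async_supported; /* cleared once the controller rejects the opcode */
    unsigned posted;          /* async event requests posted so far */
    unsigned warnings;        /* warnings issued so far */
} sketch_ctrl_t;

/* Post a new async event request unless support has been disabled. */
static bool
sketch_post_async_event(sketch_ctrl_t *c)
{
    assert(c != NULL);
    if (!c->async_supported)
        return (false);
    c->posted++;
    return (true);
}

/* Completion handler for an async event request. */
static void
sketch_async_event_done(sketch_ctrl_t *c, uint8_t sct, uint8_t sc)
{
    assert(c != NULL);
    if (sct == SKETCH_SCT_GENERIC && sc == SKETCH_SC_INVALID_OPCODE) {
        /* Controller does not implement the command: warn, do not panic. */
        c->async_supported = false;
        c->warnings++;
        fprintf(stderr, "warning: async event requests not supported, "
            "disabling\n");
        return;
    }
    /* Otherwise handle the event (omitted here) and post the next request. */
    (void) sketch_post_async_event(c);
}
```

The point of the flag is that the warning is issued once and no further requests are posted afterwards, so a controller lacking the command is never asked again.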
Updated by Toomas Soome about 5 years ago
Toomas Soome wrote:
The problem is that we issue ASYNC Event requests and qemu nvme does not implement those:
https://github.com/qemu/qemu/blob/v2.11.0/hw/block/nvme.c#L647
So we will get (NVME_INVALID_OPCODE | NVME_DNR) as return value and we will panic on invalid opcode.
Apparently the nvme driver will need to be updated accordingly.
Filed a bug report for QEMU: https://bugs.launchpad.net/qemu/+bug/1747393
Updated by Electric Monk about 5 years ago
- Status changed from In Progress to Closed
- % Done changed from 90 to 100
git commit 081391626072035c77f552902b50d1f7c1359700
commit 081391626072035c77f552902b50d1f7c1359700
Author: Toomas Soome <tsoome@me.com>
Date: 2018-02-14T19:02:41.000Z

    8945 nvme panics when async events are not supported
    Reviewed by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org>
    Reviewed by: Yuri Pankov <yuripv@yuripv.net>
    Reviewed by: Michal Nowak <mnowak@startmail.com>
    Approved by: Richard Lowe <richlowe@richlowe.net>
Updated by Robert Mustacchi over 4 years ago
- Related to Bug #9846: nvme driver shouldn't panic from userland commands added