Bug #16593

open

nvme panic when committing partially loaded firmware

Added by Andy Fiddaman 12 days ago. Updated 12 days ago.

Status: In Progress
Priority: Normal
Assignee:
Category: driver - device drivers
Start date:
Due date:
% Done: 0%
Estimated time:
Difficulty: Bite-size
Tags:
Gerrit CR:
External Bug:

Description

While debugging #16592, I managed to panic the system by attempting to commit
a partially loaded nvme firmware image.

gimlet-sn06 # nvmeadm load-firmware nvme0 /tmp/micron_7300_fw-95420280_m2.tar
nvmeadm: failed to load firmware image "/tmp/micron_7300_fw-95420280_m2.tar" at offset 7012352:
field number of dwords (numd) value 0xaa00 is invalid: NVME_ERR_FW_LOAD_LEN_RANGE
(libnvme: 0x38, sys: 0)

gimlet-sn06 # nvmeadm commit-firmware 3
panic[cpu111]/thread=fffffd047d2a13c0: assertion failed:
cmd->nc_sqe.sqe_opc == NVME_OPC_FW_IMAGE_LOAD,
file: ../../common/io/nvme/nvme.c, line: 2153
Actions #1

Updated by Andy Fiddaman 12 days ago

> $C
fffff7880eac1a40 vpanic()
fffff7880eac1a90 0xfffffffffc118ebd()
fffff7880eac1ad0 nvme_check_specific_cmd_status+0x5c2(fffffd0494c98100)
fffff7880eac1b00 nvme_check_cmd_status_ioctl+0xc5(fffffd0494c98100, fffff7880eac1b70)
fffff7880eac1b50 nvme_ioc_cmd+0xf4(fffffd02ece0b1c0, fffff7880eac1b70, fffff7880eac1b90)
fffff7880eac1c50 nvme_ioctl_firmware_commit+0x1b1(fffffd04d5ccc100, fffff5ffffdfcdb0, 202003, fffffd047ed6dad0)
fffff7880eac1cb0 nvme_ioctl+0x2ae(5400040040, 4e564d08, fffff5ffffdfcdb0, 202003, fffffd047ed6dad0, fffff7880eac1e08)
fffff7880eac1cf0 cdev_ioctl+0x3f(5400040040, 4e564d08, fffff5ffffdfcdb0, 202003, fffffd047ed6dad0, fffff7880eac1e08)
fffff7880eac1d40 spec_ioctl+0x55(fffffd04d847ba00, 4e564d08, fffff5ffffdfcdb0, 202003, fffffd047ed6dad0, fffff7880eac1e08, 0)
fffff7880eac1dd0 fop_ioctl+0x40(fffffd04d847ba00, 4e564d08, fffff5ffffdfcdb0, 202003, fffffd047ed6dad0, fffff7880eac1e08, 0)
fffff7880eac1ef0 ioctl+0x144(3, 4e564d08, fffff5ffffdfcdb0)
fffff7880eac1f00 sys_syscall+0x283()
> fffffd0494c98100::print nvme_cmd_t nc_sqe.sqe_opc nc_cqe.cqe_sf.sf_sc
nc_sqe.sqe_opc = 0x10
nc_cqe.cqe_sf.sf_sc = 0x14

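The opcode is 0x10 (NVME_OPC_FW_ACTIVATE) and the status code is 0x14
(NVME_CQE_SC_SPC_FW_OVERLAP). So, apparently, this controller responded to
the firmware commit with an overlapping range status because the firmware
image was only partially loaded.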

This is a valid status for this operation per Figure 184 of the NVMe Base
Specification, Revision 2.0d. We should not panic when it occurs, even in
DEBUG.
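
As a rough illustration of the behaviour argued for above, here is a minimal standalone sketch. It is not the driver's code and not the change under review. The function name fw_overlap_status_to_errno and the errno values it returns are assumptions made for this example. The opcode and status values are the ones printed in the dump above, plus 0x11 for Firmware Image Download from the NVMe specification.

```c
#include <errno.h>
#include <stdint.h>

/* Names follow the identifiers quoted in this report. */
#define NVME_OPC_FW_ACTIVATE		0x10	/* Firmware Commit */
#define NVME_OPC_FW_IMAGE_LOAD		0x11	/* Firmware Image Download */
#define NVME_CQE_SC_SPC_FW_OVERLAP	0x14	/* Overlapping Range */

/*
 * Hypothetical helper: translate a command-specific status code into an
 * errno for the caller. The overlapping range status is accepted for both
 * firmware opcodes rather than asserted to belong to image load only.
 * The choice of EINVAL and EIO is an assumption for illustration.
 */
int
fw_overlap_status_to_errno(uint8_t opc, uint8_t sc)
{
	switch (sc) {
	case NVME_CQE_SC_SPC_FW_OVERLAP:
		if (opc == NVME_OPC_FW_IMAGE_LOAD ||
		    opc == NVME_OPC_FW_ACTIVATE)
			return (EINVAL);
		/* Unexpected opcode: report an error, do not panic. */
		return (EIO);
	default:
		return (EIO);
	}
}
```

With this shape, the commit from the panic above (opcode 0x10, status 0x14) would fail the ioctl with an error instead of tripping an assertion.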

Actions #2

Updated by Electric Monk 12 days ago

  • Gerrit CR set to 3538