Bug #9772
closed
Panic in ahci when the failed slot spkt is NULL
100%
Description
> $C
fffffe0005ce6a30 ahci_dump_commands+0x77(fffffe04e7dda740, 0, 80000000)
fffffe0005ce6ad0 ahci_intr_fatal_error+0x2d6(fffffe04e7dda740, fffffe04e7de3940, 0, 40000000)
fffffe0005ce6b40 ahci_port_intr+0x229(fffffe04e7dda740, fffffe04e7de3940, 0)
fffffe0005ce6b80 ahci_intr+0xb8(fffffe04e7dda740, 0)
fffffe0005ce6bf0 apix_dispatch_pending_autovect+0x101(5)
fffffe0005ce6c20 apix_dispatch_pending_hardint+0x34(0, 0)
fffffe00070364b0 switch_sp_and_call+0x13()
> ::status
debugging crash dump vmcore.8 (64-bit) from napp-it-026
operating system: 5.11 omnios-r151026-b6848f4455 (i86pc)
image uuid: ff50b83b-96a2-ca7f-a98b-fa7c47b82476
panic message: BAD TRAP: type=e (#pf Page fault) rp=fffffe0005ce6820 addr=48 occurred in module "ahci" due to a NULL pointer dereference
dump content: kernel pages only
(curproc requested, but a kernel thread panicked)
> ::sata_dmsg_dump
[2018 Aug 27 17:49:10:765:741:663] ahci0: ahci_intr_fatal_error: port 0 task_file_status = 0x441
[2018 Aug 27 17:49:10:765:742:666] ahci0: ahci_intr_fatal_error: spkt 0x0 is being processed when fatal error occurred for port 0
Note spkt 0x0 in the last line of the log. The panic occurs when ahci_dump_commands() tries to dump information about a failed command whose slot spkt is NULL. The function does not check for this at run time, even though most of the other code in ahci explicitly checks for a NULL spkt. Its only guard is ASSERT(spkt != NULL), which is compiled out of non-DEBUG kernels.
This was reported with OmniOS r151026 running under ESXi 6.7.
static void
ahci_dump_commands(ahci_ctl_t *ahci_ctlp, uint8_t port, uint32_t slot_tags)
{
	ahci_port_t *ahci_portp;
	int tmp_slot;
	sata_pkt_t *spkt;
	sata_cmd_t cmd;

	ahci_portp = ahci_ctlp->ahcictl_ports[port];
	ASSERT(ahci_portp != NULL);

	while (slot_tags) {
		tmp_slot = ddi_ffs(slot_tags) - 1;
		if (tmp_slot == -1) {
			break;
		}

		spkt = ahci_portp->ahciport_slot_pkts[tmp_slot];
		ASSERT(spkt != NULL);
		cmd = spkt->satapkt_cmd;
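One way to guard the dereference in the excerpt above is sketched below. This is a user-space illustration only, not the committed fix: pkt_t, port_t, lowest_set_bit() and dump_commands() are hypothetical, heavily simplified stand-ins for the driver's sata_pkt_t, ahci_port_t, ddi_ffs() and ahci_dump_commands().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define	NUM_SLOTS	32

/* Hypothetical stand-ins for sata_pkt_t and ahci_port_t. */
typedef struct {
	uint8_t cmd_reg;
} pkt_t;

typedef struct {
	pkt_t *slot_pkts[NUM_SLOTS];
} port_t;

/*
 * Hypothetical stand-in for ddi_ffs(): 1-based position of the lowest
 * set bit, or 0 if no bit is set.
 */
static int
lowest_set_bit(uint32_t mask)
{
	int pos;

	if (mask == 0)
		return (0);
	for (pos = 1; (mask & 1U) == 0; pos++)
		mask >>= 1;
	return (pos);
}

/*
 * Walk every slot named in slot_tags. A slot with no packet is reported
 * and skipped instead of being dereferenced. Returns the number of slots
 * that actually had a packet to dump.
 */
int
dump_commands(const port_t *port, uint32_t slot_tags)
{
	int dumped = 0;

	assert(port != NULL);

	while (slot_tags != 0) {
		int slot = lowest_set_bit(slot_tags) - 1;
		const pkt_t *pkt = port->slot_pkts[slot];

		if (pkt != NULL) {
			(void) printf("slot %d: command 0x%02x\n", slot,
			    (unsigned int)pkt->cmd_reg);
			dumped++;
		} else {
			(void) printf("slot %d: no packet\n", slot);
		}

		/* Clear the bit so the loop moves on to the next slot. */
		slot_tags &= ~(UINT32_C(1) << slot);
	}

	return (dumped);
}
```

Calling dump_commands() with a port whose slots are all empty and the mask 0x80000000 from the stack trace returns 0 rather than faulting. The sketch uses a run-time NULL test instead of an assertion so that the check still applies when assertions are disabled.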
> $C ! head -1
fffffe0005ce6a30 ahci_dump_commands+0x77(fffffe04e7dda740, 0, 80000000)
> fffffe04e7dda740::print -t ahci_ctl_t ahcictl_ports[0]
ahci_port_t *ahcictl_ports[0] = 0xfffffe04e7de3940
> ::regs ! grep rbx
%rbx = 0x000000000000001f %r10 = 0x0000000000000001
> 0xfffffe04e7de3940::print -t ahci_port_t ahciport_slot_pkts[0x1f]
sata_pkt_t *ahciport_slot_pkts[0x1f] = 0
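The slot index used in the debugger session above can be cross-checked against the slot_tags argument in the stack trace. The small sketch below assumes the slot is the 0-based position of the lowest set bit in the mask, as in the ddi_ffs(slot_tags) - 1 expression in the source excerpt; slot_from_tags() is a hypothetical helper written for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: 0-based index of the lowest set bit in slot_tags,
 * or -1 if the mask is empty. Mirrors "ddi_ffs(slot_tags) - 1".
 */
int
slot_from_tags(uint32_t slot_tags)
{
	int slot = 0;

	if (slot_tags == 0)
		return (-1);

	while ((slot_tags & 1U) == 0) {
		slot_tags >>= 1;
		slot++;
	}

	assert(slot >= 0 && slot < 32);
	return (slot);
}
```

With the mask 0x80000000 only bit 31 is set, so slot_from_tags(0x80000000) gives 31, which is 0x1f. That agrees with the %rbx value the ticket uses as the index into ahciport_slot_pkts.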
Updated by Andy Fiddaman about 5 years ago
- Status changed from New to Pending RTI
- % Done changed from 0 to 100
- Difficulty changed from Medium to Bite-size
- Tags deleted (needs-triage)
Updated by Electric Monk about 5 years ago
- Status changed from Pending RTI to Closed
git commit 44a84c183ccfba4ca8eb08835c722bd833daf781
commit 44a84c183ccfba4ca8eb08835c722bd833daf781
Author: Andy Fiddaman <omnios@citrus-it.co.uk>
Date: 2018-08-29T20:36:52.000Z

    9772 Panic in ahci when the failed slot spkt is NULL
    Reviewed by: Andy Stormont <astormont@racktopsystems.com>
    Reviewed by: Toomas Soome <tsoome@me.com>
    Approved by: Robert Mustacchi <rm@joyent.com>