Bug #3855

I/O to pool appears to be hung - kernel panic

Added by Jeremy Foster over 6 years ago. Updated about 6 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: zfs - Zettabyte File System
Start date: 2013-06-30
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage

Description

Experiencing intermittent kernel panics. The first couple of panic messages showed:

panic message: BAD TRAP: type=e (#pf Page fault) rp=ffffff00b9be9970 addr=20 occurred in module "unix" due to a NULL pointer dereference

The latest one now shows:
panic message: I/O to pool 'export' appears to be hung.

The 'export' volume is our large disk enclosure. The system ran reliably for over a year and has suddenly crashed multiple times over the last two days. There has been no change in hardware, and I have installed the latest illumos build over the original oi151a7 build. Originally I thought it was related to a bad PCI-E HBA card or a bad PCI-E bus. The server head has been completely replaced with new hardware, but the crashes are still occurring.

History

#1

Updated by Jeremy Foster over 6 years ago

Core dump for the panic message "BAD TRAP: type=e (#pf Page fault) rp=ffffff00b9be9970 addr=20 occurred in module "unix" due to a NULL pointer dereference":

> ::status
debugging crash dump vmcore.0 (64-bit) from bigfs02
operating system: 5.11 oi_151a7-latest-build (i86pc)
image uuid: b5b8986b-665b-435e-d9c0-a117b0804583
panic message: BAD TRAP: type=e (#pf Page fault) rp=ffffff00b9be9970 addr=20 occurred in module "unix" due to a NULL pointer dereference
dump content: kernel pages only
> ::stack
mutex_enter+0xb()
taskq_thread+0x2d0(ffffff19c3e40d78)
thread_start+8()
> ::msgbuf
MESSAGE                                                               
sd42 is /scsi_vhci/disk@g5000c5004d666588
/scsi_vhci/disk@g5000c5004d666588 (sd42) online
/scsi_vhci/disk@g5000c5004d666588 (sd42) multipath status: degraded: path 34 mpt_sas3/disk@w5000c5004d666588,0 is online
sd43 at scsi_vhci0: unit-address g5000c5004d7911d8: f_sym
sd43 is /scsi_vhci/disk@g5000c5004d7911d8
/scsi_vhci/disk@g5000c5004d7911d8 (sd43) online
/scsi_vhci/disk@g5000c5004d7911d8 (sd43) multipath status: degraded: path 35 mpt_sas3/disk@w5000c5004d7911d8,0 is online
sd44 at scsi_vhci0: unit-address g5000c5004d7a7368: f_sym
sd44 is /scsi_vhci/disk@g5000c5004d7a7368
/scsi_vhci/disk@g5000c5004d7a7368 (sd44) online
/scsi_vhci/disk@g5000c5004d7a7368 (sd44) multipath status: degraded: path 36 mpt_sas3/disk@w5000c5004d7a7368,0 is online
sd45 at scsi_vhci0: unit-address g5000c5004d661e49: f_sym
sd45 is /scsi_vhci/disk@g5000c5004d661e49
/scsi_vhci/disk@g5000c5004d661e49 (sd45) online
/scsi_vhci/disk@g5000c5004d661e49 (sd45) multipath status: degraded: path 37 mpt_sas3/disk@w5000c5004d661e49,0 is online
sd46 at scsi_vhci0: unit-address g5000c5004d5cff8a: f_sym
sd46 is /scsi_vhci/disk@g5000c5004d5cff8a
/scsi_vhci/disk@g5000c5004d5cff8a (sd46) online
/scsi_vhci/disk@g5000c5004d5cff8a (sd46) multipath status: degraded: path 38 mpt_sas3/disk@w5000c5004d5cff8a,0 is online
sd47 at scsi_vhci0: unit-address g5000c5004d5cd72b: f_sym
sd47 is /scsi_vhci/disk@g5000c5004d5cd72b
/scsi_vhci/disk@g5000c5004d5cd72b (sd47) online
/scsi_vhci/disk@g5000c5004d5cd72b (sd47) multipath status: degraded: path 39 mpt_sas3/disk@w5000c5004d5cd72b,0 is online
sd48 at scsi_vhci0: unit-address g5000c5004d60010b: f_sym
sd48 is /scsi_vhci/disk@g5000c5004d60010b
/scsi_vhci/disk@g5000c5004d60010b (sd48) online
/scsi_vhci/disk@g5000c5004d60010b (sd48) multipath status: degraded: path 40 mpt_sas3/disk@w5000c5004d60010b,0 is online
sd49 at scsi_vhci0: unit-address g5000c5004d5d2d4b: f_sym
sd49 is /scsi_vhci/disk@g5000c5004d5d2d4b
/scsi_vhci/disk@g5000c5004d5d2d4b (sd49) online
/scsi_vhci/disk@g5000c5004d5d2d4b (sd49) multipath status: degraded: path 41 mpt_sas3/disk@w5000c5004d5d2d4b,0 is online
sd50 at scsi_vhci0: unit-address g5000c5004d6612bb: f_sym
sd50 is /scsi_vhci/disk@g5000c5004d6612bb
/scsi_vhci/disk@g5000c5004d6612bb (sd50) online
/scsi_vhci/disk@g5000c5004d6612bb (sd50) multipath status: degraded: path 42 mpt_sas3/disk@w5000c5004d6612bb,0 is online
sd51 at scsi_vhci0: unit-address g5000c5004d664f1c: f_sym
sd51 is /scsi_vhci/disk@g5000c5004d664f1c
/scsi_vhci/disk@g5000c5004d664f1c (sd51) online
/scsi_vhci/disk@g5000c5004d664f1c (sd51) multipath status: degraded: path 43 mpt_sas3/disk@w5000c5004d664f1c,0 is online
sd52 at scsi_vhci0: unit-address g5000c5004d66139c: f_sym
sd52 is /scsi_vhci/disk@g5000c5004d66139c
/scsi_vhci/disk@g5000c5004d66139c (sd52) online
/scsi_vhci/disk@g5000c5004d66139c (sd52) multipath status: degraded: path 44 mpt_sas3/disk@w5000c5004d66139c,0 is online
ses1 at mpt_sas3: unit-address w5003048001b4117d,0: w5003048001b4117d,0
ses1 is /pci@0,0/pci8086,340e@7/pci1000,3080@0/iport@f0/enclosure@w5003048001b4117d,0
/pci@0,0/pci8086,340e@7/pci1000,3080@0/iport@f0/enclosure@w5003048001b4117d,0 (ses1) online
sd53 at scsi_vhci0: unit-address g5000c5004d660f9f: f_sym
sd53 is /scsi_vhci/disk@g5000c5004d660f9f
/scsi_vhci/disk@g5000c5004d660f9f (sd53) online
/scsi_vhci/disk@g5000c5004d660f9f (sd53) multipath status: degraded: path 45 mpt_sas3/disk@w5000c5004d660f9f,0 is online
mpt_sas1 at mpt_sas0: scsi-iport v0
mpt_sas1 is /pci@0,0/pci8086,340e@7/pci1000,3080@0/iport@v0
/pci@0,0/pci8086,340e@7/pci1000,3080@0/iport@v0 (mpt_sas1) online
device pciclass,030000@3(display#0) keeps up device sd@0,0(disk#0), but the former is not power managed
device pciclass,030000@3(display#0) keeps up device sd@0,0(disk#1), but the former is not power managed
device pciclass,030000@3(display#0) keeps up device sd@0,1(disk#2), but the former is not power managed
device pciclass,030000@3(display#0) keeps up device sd@0,0(sd#3), but the former is not power managed
/pci@0,0/pci1028,236@1a,7/hub@3/device@2/keyboard@0 (hid4) offline
/pci@0,0/pci1028,236@1a,7/hub@3/device@2/mouse@1 (hid5) offline
/pci@0,0/pci1028,236@1a,7/hub@3/device@2/keyboard@0 (hid4) offline
/pci@0,0/pci1028,236@1a,7/hub@3/device@2/mouse@1 (hid5) offline
/pci@0,0/pci1028,236@1a,7/hub@3/device@2 (usb_mid3) removed
/pci@0,0/pci1028,236@1a/device@1/keyboard@0 (hid2) offline
/pci@0,0/pci1028,236@1a/device@1/mouse@1 (hid3) offline
/pci@0,0/pci1028,236@1a/device@1/keyboard@0 (hid2) offline
/pci@0,0/pci1028,236@1a/device@1/mouse@1 (hid3) offline
/pci@0,0/pci1028,236@1a/device@1 (usb_mid2) removed
USB 1.10 device (usb557,2221) operating at low speed (USB 1.x) on USB 1.10 root hub: device@1, usb_mid5 at bus address 3
        ATEN UC-10KM V1.3.124 USB Composite device
usb_mid5 is /pci@0,0/pci1028,236@1d/device@1
/pci@0,0/pci1028,236@1d/device@1 (usb_mid5) online
USB 1.10 interface (usbif557,2221.config1.0) operating at low speed (USB 1.x) on USB 1.10 root hub: keyboard@0, hid8 at bus address 3
        ATEN UC-10KM V1.3.124 USB Composite device
hid8 is /pci@0,0/pci1028,236@1d/device@1/keyboard@0
/pci@0,0/pci1028,236@1d/device@1/keyboard@0 (hid8) online
USB 1.10 interface (usbif557,2221.config1.1) operating at low speed (USB 1.x) on USB 1.10 root hub: mouse@1, hid9 at bus address 3
        ATEN UC-10KM V1.3.124 USB Composite device
hid9 is /pci@0,0/pci1028,236@1d/device@1/mouse@1
/pci@0,0/pci1028,236@1d/device@1/mouse@1 (hid9) online
device pciclass,030000@3(display#0) keeps up device sd@0,0(disk#0), but the former is not power managed
device pciclass,030000@3(display#0) keeps up device sd@0,0(disk#1), but the former is not power managed
device pciclass,030000@3(display#0) keeps up device sd@0,1(disk#2), but the former is not power managed
device pciclass,030000@3(display#0) keeps up device sd@0,0(sd#3), but the former is not power managed
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        Disconnected command timeout for Target 34
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mptsas_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31170000
/pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        Log info 0x31140000 received for target 34.
        scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
/pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        Log info 0x31140000 received for target 34.
        scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
/pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        Log info 0x31140000 received for target 34.
        scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
/pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        Log info 0x31140000 received for target 34.
        scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mptsas_check_task_mgt: Task 0x3 failed. IOCStatus=0x4a IOCLogInfo=0x0 target=34

WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mptsas_ioc_task_management failed try to reset ioc to recovery!
/pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mpt0 Firmware version v15.0.0.0 (?)
/pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mpt0: IOC Operational.

panic[cpu3]/thread=ffffff00b9be9c40: 
BAD TRAP: type=e (#pf Page fault) rp=ffffff00b9be9970 addr=20 occurred in module "unix" due to a NULL pointer dereference

sched: 
#pf Page fault
Bad kernel fault at addr=0x20
pid=0, pc=0xfffffffffb85ecbb, sp=0xffffff00b9be9a68, eflags=0x10246
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
cr2: 20
cr3: 3c00000
cr8: c

        rdi:               20 rsi: ffffff19d3260008 rdx: ffffff00b9be9c40
        rcx:                3  r8: ffffff00b9be9c40  r9:     1266c439534c
        rax:                0 rbx:                0 rbp: ffffff00b9be9b60
        r10:                1 r11:                0 r12:                0
        r13:                1 r14:               20 r15:                0
        fsb:                0 gsb: ffffff19a405c080  ds:               4b
         es:               4b  fs:                0  gs:              1c3
        trp:                e err:                2 rip: fffffffffb85ecbb
         cs:               30 rfl:            10246 rsp: ffffff00b9be9a68
         ss:               38         

ffffff00b9be9850 unix:die+df ()
ffffff00b9be9960 unix:trap+db3 ()
ffffff00b9be9970 unix:cmntrap+e6 ()
ffffff00b9be9b60 unix:mutex_enter+b ()
ffffff00b9be9c20 genunix:taskq_thread+2d0 ()
ffffff00b9be9c30 unix:thread_start+8 ()

syncing file systems...
 done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel

#2

Updated by Jeremy Foster over 6 years ago

Core dump for the panic message "I/O to pool 'export' appears to be hung.":

> ::status
debugging crash dump vmcore.2 (64-bit) from bigfs02
operating system: 5.11 oi_151a7-latest-build (i86pc)
image uuid: 221c92bc-397a-6e7f-a097-dd0eddee092d
panic message: I/O to pool 'export' appears to be hung.
dump content: kernel pages only
> ::msgbuf
MESSAGE                                                               
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 8
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 9
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 10
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 11
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 12
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 13
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 14
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 15
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 0
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 1
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 2
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 3
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 4
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 5
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 6
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 7
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 8
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 9
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 10
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 11
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 12
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 13
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 14
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 15
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 0
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 1
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 2
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 3
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 4
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 5
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 6
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 7
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 8
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 9
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 10
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 11
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 12
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 13
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 14
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 15
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 0
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 1
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 2
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 3
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 4
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 5
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 6
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 7
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 8
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 9
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 10
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 11
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 12
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 13
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 14
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 15
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 0
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 1
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 2
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 3
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 4
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 5
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 6
pcplusmp: pci14e4,1639 (bnx) instance 0 irq 0x46 vector 0x68 ioapic 0xff intin 0xff is bound to cpu 7
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        Disconnected command timeout for Target 34
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mptsas_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31170000
/pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        Log info 0x31140000 received for target 34.
        scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
/pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        Log info 0x31140000 received for target 34.
        scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
/pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        Log info 0x31140000 received for target 34.
        scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
/pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        Log info 0x31140000 received for target 34.
        scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mptsas_check_task_mgt: Task 0x3 failed. IOCStatus=0x4a IOCLogInfo=0x0 target=34

WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mptsas_ioc_task_management failed try to reset ioc to recovery!
/pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mpt0 Firmware version v15.0.0.0 (?)
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mptsas_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31170000
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mptsas_ioc_task_management failed try to reset ioc to recovery!
/pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mpt0 Firmware version v15.0.0.0 (?)
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mptsas_restart_ioc failed
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mptsas_restart_ioc failed
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        Target 34 reset for command timeout recovery failed!
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        MPT Firmware Fault, code: 1500
/pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mpt0 Firmware version v15.0.0.0 (?)
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        ioc reset abort passthru
/pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mptsas_send_sep: passthru SEP Processor Request message error 11
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        Reset failedafter fault was detected
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        MPT Firmware Fault, code: 1500
/pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mpt0 Firmware version v15.0.0.0 (?)
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        ioc reset abort passthru
WARNING: mptsas_free_devhdl: passthru SAS IO Unit Control error 11
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        Reset failedafter fault was detected
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        MPT Firmware Fault, code: 1500
/pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mpt0 Firmware version v15.0.0.0 (?)
/pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mpt0: IOC Operational.
NOTICE: SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major

panic[cpu0]/thread=ffffff00b80c5c40: 
I/O to pool 'export' appears to be hung.

ffffff00b80c5a20 zfs:vdev_deadman+113 ()
ffffff00b80c5a70 zfs:vdev_deadman+4a ()
ffffff00b80c5ac0 zfs:vdev_deadman+4a ()
ffffff00b80c5af0 zfs:spa_deadman+73 ()
ffffff00b80c5b90 genunix:cyclic_softint+f3 ()
ffffff00b80c5ba0 unix:cbe_low_level+14 ()
ffffff00b80c5bf0 unix:av_dispatch_softvect+78 ()
ffffff00b80c5c20 unix:dispatch_softint+39 ()
ffffff00b80059a0 unix:switch_sp_and_call+13 ()
ffffff00b80059e0 unix:dosoftint+44 ()
ffffff00b8005a40 unix:do_interrupt+ba ()
ffffff00b8005a50 unix:cmnint+ba ()
ffffff00b8005bc0 unix:acpi_cpu_cstate+11b ()
ffffff00b8005bf0 unix:cpu_acpi_idle+8d ()
ffffff00b8005c00 unix:cpu_idle_adaptive+13 ()
ffffff00b8005c20 unix:idle+a7 ()
ffffff00b8005c30 unix:thread_start+8 ()

syncing file systems...
 done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel

#3

Updated by Jeremy Foster over 6 years ago

  • Status changed from New to Feedback
  • % Done changed from 0 to 100

After additional troubleshooting, I found a bad hard disk that had gone undetected by the OS. I believe that whenever the vdev containing the bad drive was accessed, I/O to the pool would eventually hang and the kernel would then panic. The drive has been marked offline and the pool is now resilvering onto a spare. Server uptime is now over 10 hours, the most stable it has been since the issues started. I will continue to monitor, but I think we can mark this bug as closed for now.
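
For anyone hitting something similar: in both dumps above, every mpt_sas warning names the same target (34), which in hindsight pointed at a single device rather than the HBA or the bus. A rough way to spot that pattern is to tally target numbers across saved ::msgbuf output. The sketch below is only an illustration, not the procedure used here. The function name and the regular expression are my own assumptions, and it only recognises the "Target N", "target N" and "target=N" spellings seen in the logs in this ticket.

```python
import re
from collections import Counter

# Assumed pattern: matches the target number in mpt_sas messages such as
# "Disconnected command timeout for Target 34",
# "Log info 0x31140000 received for target 34." and "... target=34".
TARGET_RE = re.compile(r"\btarget[ =](\d+)", re.IGNORECASE)


def count_target_errors(msgbuf_text):
    """Count how many lines of msgbuf text mention each target number."""
    counts = Counter()
    for line in msgbuf_text.splitlines():
        match = TARGET_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# A few lines copied from the ::msgbuf output in this ticket.
sample = """\
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        Disconnected command timeout for Target 34
/pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        Log info 0x31140000 received for target 34.
        scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
WARNING: /pci@0,0/pci8086,340e@7/pci1000,3080@0 (mpt_sas0):
        mptsas_check_task_mgt: Task 0x3 failed. IOCStatus=0x4a IOCLogInfo=0x0 target=34
"""

# Most frequently mentioned targets first.
print(count_target_errors(sample).most_common())
```

A target that dominates the tally is only a hint about where to look first. Mapping the target number back to a physical disk still has to be done with the usual tools for the system in question.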

#4

Updated by Yuri Pankov about 6 years ago

  • Status changed from Feedback to Closed

Closing per previous note.
