mpt_sas sometimes stalls forever
Opening a new bug for this to keep it separate from bug #1069, which refers to mpt.
Sometimes under load, the mpt_sas driver will stall all IO on any mpt_sas card for an extended period. Often it eventually "wakes up" and begins working again, but sometimes it never wakes up, and the system requires a hard power cycle.
(Note the hour-long gap in dmesg with no messages between the last error from mpt_sas and me screwing up my mdb call.)
I can log in and poke around, but no matter how long I wait, no IO happens on that controller until the machine is power-cycled.
$ uname -a
SunOS zettabyte 5.11 oi_148 i86pc i386 i86pc
Updated by Albert Lee about 8 years ago
ffffff000c827c40 fffffffffbc2e330 0 0 60 fffffffffbd182e0
  PC: _resume_from_idle+0xf1    THREAD: mt_config_thread()
  stack pointer for thread ffffff000c827c40: ffffff000c8276b0
  [ ffffff000c8276b0 _resume_from_idle+0xf1() ]
    swtch+0x145()
    cv_wait+0x61()
    scsi_transport+0x151()
    scsi_poll+0x7e()
    mptsas_send_scsi_cmd+0xee()
    mptsas_inquiry+0xd9()
    mptsas_get_sata_guid+0x67()
    mptsas_get_target_device_info+0x131()
    mptsas_update_hashtab+0xc0()
    mptsas_config_all+0x89()
    mptsas_bus_config+0x287()
    scsi_hba_bus_config+0xdc()
    devi_config_common+0x94()
    mt_config_thread+0x53()
    thread_start+8()
That's presumably after it successfully issues the hba_tran for mptsas... not sure which thread is supposed to wake up the cv.
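For anyone following along, the shape of the hang in that trace (a thread parked in cv_wait() with nobody signalling the cv) can be sketched in userland. This is only an analogy in Python, not the mptsas code: the PolledCommand class and its complete() and wait() methods are hypothetical names made up for illustration, and the timeout is just one way to show the difference between a bounded and an unbounded wait.

```python
import threading


class PolledCommand:
    """Hypothetical stand-in for a command whose issuer sleeps on a
    condition variable until a completion path marks it done."""

    def __init__(self):
        self._cv = threading.Condition()
        self._done = False

    def complete(self):
        # Completion path: mark the command done and wake every waiter.
        with self._cv:
            self._done = True
            self._cv.notify_all()

    def wait(self, timeout=None):
        # With timeout=None this blocks until complete() runs, so a lost
        # completion leaves the caller asleep forever. With a timeout the
        # caller gets False back and can attempt recovery instead.
        with self._cv:
            return self._cv.wait_for(lambda: self._done, timeout=timeout)


if __name__ == "__main__":
    lost = PolledCommand()
    # Nobody ever calls lost.complete(), mimicking a dropped command.
    print("completed:", lost.wait(timeout=0.1))

    ok = PolledCommand()
    threading.Timer(0.05, ok.complete).start()
    print("completed:", ok.wait(timeout=2.0))
```

If the completion never arrives and the wait has no timeout, the waiter stays asleep indefinitely, which matches what the stack shows for mt_config_thread().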
Updated by Rich Ercolani almost 8 years ago
I discovered this mostly happens when I use drives with 4K sectors that lie about it and run in 512-byte "emulation" mode (with no way to switch them out of it, ACK!). They drop commands too often, the driver fails to recover and tries resetting the HBA, and we jump off a cliff and never wake up.