Bug #8745
openHipster is marking drives as bad when they are ok
Description
When the logs show an error requesting inquiry page 0x83 on a target, the OS stops looking for any more drives and marks them all as failed (as reported by "fmadm faulty").
For example, the logs show this error repeating:
scsi: [ID 243001 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1000,3040@0 (mpt_sas0):
mptsas request inquiry page 0x83 for target:1a, lun:0 failed!
But every drive in the enclosure after this drive was flagged as bad as well and marked faulted or unavail in the ZFS pools.
After disabling the physical port 0x1A (26) the other 4 drives after that drive in the enclosure immediately came back online and the pool resilvered.
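For anyone trying to match the warning to a physical port: the target in the message is printed in hex, so target:1a is port 26. The snippet below is only a rough sketch of how the log could be scanned for these warnings. The function name failed_targets and the regular expression are my own, written against the message text quoted above, and treating the lun value as hex is an assumption.

```python
import re

# Matches the second line of the warning quoted above.
# Assumption: the lun is printed in hex like the target.
INQUIRY_FAIL = re.compile(
    r"mptsas request inquiry page 0x83 for "
    r"target:([0-9a-fA-F]+), lun:([0-9a-fA-F]+) failed!"
)


def failed_targets(log_text):
    """Return sorted, de-duplicated (target, lun) pairs as decimal integers."""
    hits = set()
    for match in INQUIRY_FAIL.finditer(log_text):
        target = int(match.group(1), 16)  # hex "1a" becomes 26
        lun = int(match.group(2), 16)
        hits.add((target, lun))
    return sorted(hits)


# Sample built from the log excerpt in this report.
sample = (
    "scsi: [ID 243001 kern.warning] WARNING: "
    "/pci@0,0/pci8086,340e@7/pci1000,3040@0 (mpt_sas0):\n"
    "mptsas request inquiry page 0x83 for target:1a, lun:0 failed!\n"
)

print(failed_targets(sample))
```

Because the warning repeats, the pairs are collected in a set so each failing target is listed once.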
These are SATA drives in a Supermicro enclosure (which uses an expander).
It appears that the OS gets stuck in a loop trying to access the failed drive instead of continuing on to the working drives on higher port numbers in the enclosure.
I was able to confirm this issue did not occur in 151_a8 or 151_a9 by booting into the snapshots for those versions. In 151_a8 and 151_a9, the drives after the failed drive in the enclosure were all online and working, and the unit started to resilver. But upon booting back into hipster they dropped offline again.
Related issues
Updated by Marcel Telka over 4 years ago
- Project changed from OpenIndiana Distribution to illumos gate
Updated by Marcel Telka over 2 years ago
- Category set to driver - device drivers
- Assignee set to Marcel Telka
I think this might be exactly the same problem as #12163. Please try updating to an illumos/hipster build with that bug fixed to see whether your problem is still reproducible.
Updated by Marcel Telka over 2 years ago
- Related to Bug #12163: mpt_sas: Collateral damage caused by dead SATA disk added