i/o hang with mega_sas driver
We have a system that has five (5) MegaRAID 8888ELP cards. Of the 10 channels on these cards, 9 are connected to 4 SATA drives each, for a total of 36 drives. Whenever there is significant i/o, the i/o hangs on the system with the following symptoms:
- iostat -x shows that some of the drives are 100% blocked. The number of affected drives varies from hang to hang; in the most recent hang, six drives were in this state. The affected drives are always connected to a single controller, although which controller varies.
- ZFS commands all hang
- The command that was running (in this case a 'cp' command) also hangs.
- The box cannot be rebooted using either shutdown or reboot. A power cycle is required.
- There is nothing in /var/adm/messages related to disk i/o at the time of the hang, or since the last boot.
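The first symptom (blocked drives that all sit on one controller) can be checked mechanically from saved iostat output. The following is a minimal sketch, not part of the original report. It assumes the output was captured with `iostat -xn`, so that devices appear under their cXtYdZ names in a `device` column; the function name `blocked_by_controller` and the sample text are hypothetical and only illustrate the expected layout.

```python
import re
from collections import defaultdict

# Hypothetical sample in the layout of `iostat -xn`; not taken from the affected system.
SAMPLE = """\
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0 100 c8t0d0
    0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0 100 c8t4d0
   12.0   30.0  512.0 2048.0  0.0  0.3    0.0    7.1   0  12 c7t0d0
"""

def blocked_by_controller(iostat_text, threshold=100.0):
    """Group devices whose %b value is >= threshold by controller (the cN prefix)."""
    blocked = defaultdict(list)
    busy_idx = dev_idx = None
    for line in iostat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The column header tells us where %b and the device name live.
        if "%b" in fields and "device" in fields:
            busy_idx = fields.index("%b")
            dev_idx = fields.index("device")
            continue
        if busy_idx is None or len(fields) <= max(busy_idx, dev_idx):
            continue
        match = re.match(r"(c\d+)t\d+d\d+", fields[dev_idx])
        if not match:
            continue
        try:
            busy = float(fields[busy_idx])
        except ValueError:
            continue
        if busy >= threshold:
            blocked[match.group(1)].append(fields[dev_idx])
    return dict(blocked)

if __name__ == "__main__":
    for controller, devices in blocked_by_controller(SAMPLE).items():
        print(controller, devices)
```

If every blocked device falls under a single key, that matches the single-controller pattern described above.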
I'm attaching the output of "echo '::threadlist -v' | mdb -k" to this bug. Please let me know what other information I can provide.
Updated by Steve Jacobson almost 9 years ago
Here's the output from zpool status:
$ zpool status tank
  pool: tank
 state: ONLINE
  scan: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        tank          ONLINE       0     0     0
          raidz2-0    ONLINE       0     0     0
            c7t0d0    ONLINE       0     0     0
            c7t4d0    ONLINE       0     0     0
            c8t0d0    ONLINE       0     0     0
            c8t4d0    ONLINE       0     0     0
            c9t0d0    ONLINE       0     0     0
            c10t0d0   ONLINE       0     0     0
            c10t4d0   ONLINE       0     0     0
            c11t0d0   ONLINE       0     0     0
            c11t4d0   ONLINE       0     0     0
            c7t7d0    ONLINE       0     0     0
            c11t3d0   ONLINE       0     0     0
            c8t7d0    ONLINE       0     0     0
          raidz2-1    ONLINE       0     0     0
            c7t1d0    ONLINE       0     0     0
            c7t5d0    ONLINE       0     0     0
            c8t1d0    ONLINE       0     0     0
            c8t5d0    ONLINE       0     0     0
            c9t1d0    ONLINE       0     0     0
            c10t1d0   ONLINE       0     0     0
            c10t5d0   ONLINE       0     0     0
            c11t1d0   ONLINE       0     0     0
            c11t5d0   ONLINE       0     0     0
            c7t3d0    ONLINE       0     0     0
            c8t3d0    ONLINE       0     0     0
            c10t7d0   ONLINE       0     0     0
          raidz2-2    ONLINE       0     0     0
            c7t2d0    ONLINE       0     0     0
            c7t6d0    ONLINE       0     0     0
            c8t2d0    ONLINE       0     0     0
            c8t6d0    ONLINE       0     0     0
            c9t2d0    ONLINE       0     0     0
            c10t2d0   ONLINE       0     0     0
            c10t6d0   ONLINE       0     0     0
            c11t2d0   ONLINE       0     0     0
            c11t6d0   ONLINE       0     0     0
            c9t3d0    ONLINE       0     0     0
            c10t3d0   ONLINE       0     0     0
            c11t7d0   ONLINE       0     0     0

errors: No known data errors
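Because the hung drives are always on a single controller, it may help to see how each raidz2 vdev is spread across controllers. The following is a rough sketch only: the device list is copied from the raidz2-0 entries in the zpool status output, while the function name `drives_per_controller` is hypothetical. raidz2 tolerates the loss of two devices per vdev, so a controller holding more than two members of one vdev is at least a plausible reason for pool-wide i/o to stall when that controller stops responding; this is an assumption, not something confirmed in this ticket.

```python
import re
from collections import Counter

# Members of raidz2-0, copied from the zpool status output.
RAIDZ2_0 = [
    "c7t0d0", "c7t4d0", "c8t0d0", "c8t4d0", "c9t0d0", "c10t0d0",
    "c10t4d0", "c11t0d0", "c11t4d0", "c7t7d0", "c11t3d0", "c8t7d0",
]

def drives_per_controller(devices):
    """Count how many cXtYdZ devices belong to each controller (the cX prefix)."""
    counts = Counter()
    for dev in devices:
        match = re.match(r"(c\d+)t\d+d\d+$", dev)
        if match:
            counts[match.group(1)] += 1
    return dict(counts)

if __name__ == "__main__":
    for controller, count in sorted(drives_per_controller(RAIDZ2_0).items()):
        print(controller, count)
```

The same function can be applied to the raidz2-1 and raidz2-2 member lists.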
The zfs get all tank output is:
$ zfs get all tank
NAME  PROPERTY              VALUE                  SOURCE
tank  type                  filesystem             -
tank  creation              Wed Dec 15 11:23 2010  -
tank  used                  23.4T                  -
tank  available             29.8T                  -
tank  referenced            72.0K                  -
tank  compressratio         1.00x                  -
tank  mounted               yes                    -
tank  quota                 none                   default
tank  reservation           none                   default
tank  recordsize            128K                   default
tank  mountpoint            /tank                  default
tank  sharenfs              off                    default
tank  checksum              on                     default
tank  compression           off                    default
tank  atime                 on                     default
tank  devices               on                     default
tank  exec                  on                     default
tank  setuid                on                     default
tank  readonly              off                    default
tank  zoned                 off                    default
tank  snapdir               hidden                 default
tank  aclinherit            restricted             default
tank  canmount              on                     default
tank  xattr                 on                     default
tank  copies                1                      default
tank  version               5                      -
tank  utf8only              off                    -
tank  normalization         none                   -
tank  casesensitivity       sensitive              -
tank  vscan                 off                    default
tank  nbmand                off                    default
tank  sharesmb              off                    default
tank  refquota              none                   default
tank  refreservation        none                   default
tank  primarycache          all                    default
tank  secondarycache        all                    default
tank  usedbysnapshots       0                      -
tank  usedbydataset         72.0K                  -
tank  usedbychildren        23.4T                  -
tank  usedbyrefreservation  0                      -
tank  logbias               latency                default
tank  dedup                 off                    default
tank  mlslabel              none                   default
tank  sync                  standard               default
Wed Jan 19 13:41 [root@inp-production-zfs-archive-05 ~]
And the zpool get all tank output:
$ zpool get all tank
NAME  PROPERTY       VALUE                SOURCE
tank  size           65.2T                -
tank  capacity       43%                  -
tank  altroot        -                    default
tank  health         ONLINE               -
tank  guid           6753992121983520783  default
tank  version        28                   default
tank  bootfs         -                    default
tank  delegation     on                   default
tank  autoreplace    off                  default
tank  cachefile      -                    default
tank  failmode       wait                 default
tank  listsnapshots  off                  default
tank  autoexpand     off                  default
tank  dedupditto     0                    default
tank  dedupratio     1.00x                -
tank  free           37.0T                -
tank  allocated      28.3T                -
tank  readonly       off                  -
Updated by Ken Mays about 8 years ago
- Due date set to 2011-09-17
- Status changed from Feedback to Resolved
- Assignee changed from Albert Lee to Ken Mays
- Target version set to oi_151_stable
- % Done changed from 0 to 100
- Estimated time set to 1.00 h
- Difficulty set to Expert
- Tags set to megaraid
A user reported on openindiana-discuss that they used the older MegaRAID driver from the snv_130 distribution with much success. This needs to be verified with the illumos team.
Updated by Ken Mays over 5 years ago
- Status changed from New to Closed
- Assignee changed from OI illumos to Ken Mays
- % Done changed from 0 to 100
Closing ticket. As confirmed with LSI support, this issue appears to be system dependent and tied to the MegaRAID 8888ELP hardware. The part (Part #: LSI00142) is no longer on sale, and we haven't received any other major bug reports that replicate this issue.