Bug #2998
openAHCI reports many "Illegal Request" with Intel C602 chipset
0%
Description
Observed with both OpenIndiana oi_151a4 and oi_151a5:
With two SATA disks from different vendors attached to 6 GB/s SATA links served by the Intel C602 chipset (SuperMicro X9DAi motherboard), many "Illegal Request" errors are tallied against each drive, but without incrementing "Soft Errors" or any other error count. An analysis of SCSI protocol transactions using scsi.d does not seem to reveal "Illegal Request" errors on the bus or any other error. I assume that there must be some issue inside the ahci driver. Substantial benchmarking does not show any unusual I/O behavior to the disks and there is no noticeable performance issue with the system.
The system did stop being able to talk to the first disk a few days ago ("I/O error") and was successfully replaced with a different model and manufacturer drive. The hardware supplier tested the returned disk and did not find a problem with it.
# cfgadm|grep sata sata0/0::dsk/c3t0d0 disk connected configured ok sata0/1::dsk/c3t1d0 disk connected configured ok sata0/2::dsk/c3t2d0 cd/dvd connected configured ok # kstat -n sd1,err module: sderr instance: 1 name: sd1,err class: device_error Device Not Ready 0 Hard Errors 0 Illegal Request 579 Media Error 0 No Device 0 Predictive Failure Analysis 0 Product ST1000NM0011 Revision Recoverable 0 Revision SN02 Serial No Z1N21SQN Size 1000204886016 Soft Errors 0 Transport Errors 0 Vendor ATA crtime 4743.15789647 snaptime 144971.680727518 # kstat -n sd2,err module: sderr instance: 2 name: sd2,err class: device_error Device Not Ready 0 Hard Errors 0 Illegal Request 594 Media Error 0 No Device 0 Predictive Failure Analysis 0 Product WDC WD5003ABYX-0Revision Recoverable 0 Revision 1S02 Serial No WD-WMAYP3661514 Size 500107862016 Soft Errors 0 Transport Errors 0 Vendor ATA crtime 4743.21090549 snaptime 144974.148133619
Updated by Bob Friesenhahn almost 11 years ago
I am now seeing errors reported by scsi.d and one of the drives is incrementing the illegal request counter with a rate of about 12 per minute. The counts were incrementing before without these reports from scsi.d.
00492.393775171 ahci0:-> 0x00 TEST_UNIT_READY address 01:00, lba 0x000000, len 0x000000, control 0x00 timeout 60 CDBP ffffff22a8faf950 1 zpool(25986) cdb(6) 000000000000 00492.394206426 ahci0:-> 0x00 TEST_UNIT_READY address 01:00, lba 0x000000, len 0x000000, control 0x00 timeout 60 CDBP ffffff241a97c8d0 1 zpool(25986) cdb(6) 000000000000 00492.394551402 ahci0:-> 0x1a MODE_SENSE(6) address 01:00, lba 0x000300, len 0x000024, control 0x00 timeout 60 CDBP ffffff2965543650 1 zpool(25986) cdb(6) 1a0003002400 00492.394851927 ahci0:-> 0x08 READ(6) address 01:00, lba 0x000000, len 0x000001, control 0x00 timeout 60 CDBP ffffff23e2270bd0 1 zpool(25986) cdb(6) 080000000100 00492.395115770 ahci0:-> 0x08 READ(6) address 01:00, lba 0x000000, len 0x000001, control 0x00 timeout 60 CDBP ffffff2242020bd0 1 zpool(25986) cdb(6) 080000000100 00492.395377117 ahci0:-> 0x08 READ(6) address 01:00, lba 0x003ec2, len 0x000001, control 0x00 timeout 60 CDBP ffffff22a3f7c450 1 zpool(25986) cdb(6) 08003ec20100 00492.391249950 ahci0:<- 0x00 TEST_UNIT_READY address 01:00, lba 0x000000, len 0x000000, control 0x00 timeout 60 CDBP ffffff2224a01310, reason 0x0 (COMPLETED) pkt_state 0x17 state 0x0 Success Time 451us 00492.391694829 ahci0:<- 0x00 TEST_UNIT_READY address 01:00, lba 0x000000, len 0x000000, control 0x00 timeout 60 CDBP ffffff22a26d0150, reason 0x0 (COMPLETED) pkt_state 0x17 state 0x0 Success Time 321us 00492.394048858 ahci0:<- 0x00 TEST_UNIT_READY address 01:00, lba 0x000000, len 0x000000, control 0x00 timeout 60 CDBP ffffff22a8faf950, reason 0x0 (COMPLETED) pkt_state 0x17 state 0x0 Success Time 324us 00492.394449965 ahci0:<- 0x00 TEST_UNIT_READY address 01:00, lba 0x000000, len 0x000000, control 0x00 timeout 60 CDBP ffffff241a97c8d0, reason 0x0 (COMPLETED) pkt_state 0x17 state 0x0 Success Time 294us 00492.394685258 ahci0:ILL 0x1a MODE_SENSE(6) address 01:00, lba 0x000300, len 0x000024, control 0x00 timeout 60 CDBP ffffff2965543650 1 sched(0) cdb(6) 1a0003002400 00492.394763416 ahci0:<- 0x1a MODE_SENSE(6) address 01:00, lba 0x000300, len 0x000024, control 0x00 timeout 60 CDBP ffffff2965543650, reason 0x0 (COMPLETED) pkt_state 0x37 state 0x2 Check_Condition Time 258us 00492.392047411 ahci0:ILL 0x1a MODE_SENSE(6) address 01:00, lba 0x000300, len 0x000024, control 0x00 timeout 60 CDBP ffffff22a65e7310 1 sched(0) cdb(6) 1a0003002400 00492.392187885 ahci0:<- 0x1a MODE_SENSE(6) address 01:00, lba 0x000300, len 0x000024, control 0x00 timeout 60 CDBP ffffff22a65e7310, reason 0x0 (COMPLETED) pkt_state 0x37 state 0x2 Check_Condition Time 367us
Updated by Bob Friesenhahn almost 11 years ago
The system is still up and zfs pool is still working ok. No soft/hard/transport errors have been reported. The behavior with the second drive (12 illegal request counts per minute) still persists while the count increment has stopped entirely on the first drive:
sd1 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 Vendor: ATA Product: ST1000NM0011 Revision: SN02 Serial No: Z1N21SQN Size: 1000.20GB <1000204886016 bytes> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 Illegal Request: 637 Predictive Failure Analysis: 0 sd2 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 Vendor: ATA Product: WDC WD5003ABYX-0 Revision: 1S02 Serial No: WD-WMAYP3661514 Size: 500.11GB <500107862016 bytes> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 Illegal Request: 137353 Predictive Failure Analysis: 0
Richard Elling tells me that he has a system which is exhibiting a similar problem.
Updated by Bob Friesenhahn almost 11 years ago
Another status update to observed behavior since behavior has changed again. As of sometime yesterday, the illegal request count on sd2 stopped incrementing. It had been incrementing at 12 counts per minute. Now neither drive is actively counting illegal request errors.
% uptime 08:09am up 18 days 9:14, 10 users, load average: 0.05, 0.05, 0.05 sd1 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 Vendor: ATA Product: ST1000NM0011 Revision: SN02 Serial No: Z1N21SQN Size: 1000.20GB <1000204886016 bytes> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 Illegal Request: 637 Predictive Failure Analysis: 0 sd2 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 Vendor: ATA Product: WDC WD5003ABYX-0 Revision: 1S02 Serial No: WD-WMAYP3661514 Size: 500.11GB <500107862016 bytes> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 Illegal Request: 200849 Predictive Failure Analysis: 0
Updated by Bob Friesenhahn over 10 years ago
As of a about a week ago, the system is no longer producing large numbers of illegal request errors. The two SATA drives have produced only 3 and 4 illegal requests, respectively, in two days of uptime. Perhaps the improvement is due to OpenIndiana updates.
Updated by Bob Friesenhahn over 10 years ago
With OpenIndiana oi_151a7 I am only seeing a tiny number of illegal requests (<= 10) so I am closing this report.
Updated by Bob Friesenhahn over 10 years ago
- Status changed from New to In Progress
- Assignee set to Bob Friesenhahn
Updated by Bob Friesenhahn over 10 years ago
- Status changed from In Progress to Feedback
- Assignee deleted (
Bob Friesenhahn)