Bug #11952
large USB hard disks experience I/O failures

Status: Closed
% Done: 100%
Description
Using a large USB hard disk (most likely larger than 2TB) can result in DMA errors and ultimately a stuck device; e.g.,
Nov 9 09:21:36 newcastle rootnex: [ID 561485 kern.warning] WARNING: xhci: coding error detected, the driver is using ddi_dma_attr(9S) incorrectly. There is a small risk of data corruption in particular with large I/Os. The driver should be replaced with a corrected version for proper system operation. To disable this warning, add 'set rootnex:rootnex_bind_warn=0' to /etc/system(4).
Nov 9 09:21:36 newcastle xhci: [ID 197104 kern.info] NOTICE: xhci0: failed to bind DMA memory: -3
Nov 9 09:21:36 newcastle xhci: [ID 902155 kern.info] NOTICE: xhci0: xhci stop endpoint command (2)/slot (3) in wrong state: 19
Nov 9 09:21:36 newcastle xhci: [ID 617155 kern.info] NOTICE: xhci0: endpoint is in state 3
Nov 9 09:21:36 newcastle xhci: [ID 902155 kern.info] NOTICE: xhci0: xhci stop endpoint command (3)/slot (3) in wrong state: 19
Nov 9 09:21:36 newcastle xhci: [ID 617155 kern.info] NOTICE: xhci0: endpoint is in state 3
Nov 9 09:21:36 newcastle scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,2064@14/storage@11/disk@0,0 (sd1):
Nov 9 09:21:36 newcastle Command failed to complete...Device is gone
This turns out to be somewhat complicated. The device shown above is presenting 512 byte sectors (whether they're emulated or not, under the covers), which means the logical block address (LBA) range spans from 0x0 up to 0x1d1a94a20; critically, this value is larger than the 32-bit LBA field in a SCSI READ (12) or WRITE (12) (aka SCMD_READ_G5 or SCMD_WRITE_G5) command. Block addresses beyond the 32-bit boundary must be encoded in a command that supports a wider LBA, so sd moves up to READ (16) and WRITE (16) (aka SCMD_READ_G4 and SCMD_WRITE_G4).
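As a rough illustration of that escalation (a sketch only, using a hypothetical pick_cdb_group() helper rather than the actual sd driver code): READ/WRITE (10) (Group 1) carries a 32-bit LBA and a 16-bit block count, READ/WRITE (12) (Group 5) a 32-bit LBA and a 32-bit count, and READ/WRITE (16) (Group 4) a 64-bit LBA.

```c
#include <stdint.h>

/*
 * Illustrative sketch only (not the actual sd driver logic): choose a
 * SCSI read/write CDB group for a given starting LBA and block count.
 */
enum cdb_group { CDB_GROUP1, CDB_GROUP5, CDB_GROUP4 };

enum cdb_group
pick_cdb_group(uint64_t lba, uint32_t nblks)
{
	/* The last block touched must fit in the 32-bit LBA field. */
	if (lba + nblks - 1 > UINT32_MAX)
		return (CDB_GROUP4);
	/* A block count too large for 16 bits needs at least Group 5. */
	if (nblks > UINT16_MAX)
		return (CDB_GROUP5);
	return (CDB_GROUP1);
}
```

With the disk above, any I/O at or beyond LBA 0x100000000 can only be expressed as a Group 4 command, which is why sd must switch.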
USB host controllers have a relatively small transfer size limit for an individual request. As such, we have logic in scsa2usb_rw_transport() to break a larger SCSI read or write command into multiple smaller commands in sequence to satisfy the upstack I/O request. Today this logic only covers the Group 1 and Group 5 SCSI read and write commands, leaving out the Group 4 SCSI commands, which only come into effect for larger logical addresses. This has only become an issue since the advent of 512 byte sector drives of a capacity larger than about 2TB.
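The chunking itself can be sketched as below, with a hypothetical issue_rw() callback standing in for the real command submission path in scsa2usb_rw_transport(). The key point of the fix is that this loop is the same regardless of command group; only the CDB encoding of each sub-command differs, so Group 4 commands must not be excluded from it.

```c
#include <stdint.h>

/*
 * Sketch: split one large read/write into sub-commands no bigger than
 * the host controller's per-request limit (max_blks).  issue_rw() is a
 * hypothetical stand-in for issuing a single SCSI command.
 */
int
split_rw(uint64_t lba, uint64_t nblks, uint32_t max_blks,
    int (*issue_rw)(uint64_t lba, uint32_t nblks))
{
	while (nblks > 0) {
		uint32_t n = (nblks > max_blks) ? max_blks :
		    (uint32_t)nblks;
		int r = issue_rw(lba, n);

		if (r != 0)
			return (r);
		lba += n;
		nblks -= n;
	}
	return (0);
}

/* Trivial test callback: record how many blocks were issued in total. */
static uint64_t total_issued;

static int
record_issue(uint64_t lba, uint32_t nblks)
{
	(void)lba;
	total_issued += nblks;
	return (0);
}
```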
The intermittent nature of the I/O failures (e.g., a pool will import, but heavy I/O will subsequently fail and take the pool offline) seems to be on account of at least two things:
- ZFS issuing I/O requests that fit within the transfer limit at least during import -- but not later
- I/O requests for logical addresses prior to the 2TB boundary appear to be sent as Group 5 commands; only writes to the latter half of a large disk will be Group 4
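For concreteness, the "2TB boundary" mentioned in these notes is just arithmetic on the CDB field width: 2^32 addressable sectors of 512 bytes each. (The helper name below is illustrative, not from the fix.)

```c
#include <stdint.h>

/*
 * Largest byte offset reachable with a 32-bit LBA (Group 1/5 commands)
 * for a given sector size: 2^32 sectors * sector_size bytes.
 */
uint64_t
lba32_limit_bytes(uint32_t sector_size)
{
	return ((UINT32_MAX + 1ULL) * sector_size);
}
```

For 512 byte sectors this is 2,199,023,255,552 bytes (2 TiB), which is why only I/O to the latter half of this ~4TB disk requires Group 4 commands; a 4096 byte sector drive would not hit the boundary until 16 TiB.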
Updated by Joshua M. Clulow over 2 years ago
Code review: https://code.illumos.org/c/illumos-gate/+/163
Updated by Joshua M. Clulow over 2 years ago
Testing Notes
I have a Seagate 4TB expansion drive:
scsa2usb 1 ffffff0cff6eb550 /pci@0,0/pci8086,2064@14/storage@11
    usba_device: ffffff0d08997540
    idVendor: 0x0bc2  idProduct: 0x231a
    usb_addr: 0x03
    Manufacturer String: Seagate
    Product String: Expansion
$ pfexec diskinfo | grep -i seagate
SCSI    c7t0d0    Seagate Expansion    3726.02 GiB    no    no
Prior to this change, I experienced the failures described in the ticket any time I tried to rsync any substantial quantity of data into the pool. After applying the fix, I was able to rsync several hundred gigabytes of data into the pool, and have initiated a scrub which has so far run without error:
  scan: scrub in progress since Sat Nov 9 18:03:33 2019
        604G scanned at 72.3M/s, 594G issued at 71.1M/s, 604G total
        0 repaired, 98.24% done, 0 days 00:02:33 to go
There have been no errors other than the usual illegal requests one expects from a device which isn't a real, full SCSI device:
$ pfexec iostat -En c7t0d0
c7t0d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: Seagate Product: Expansion Revision: 0710 Serial No: NAADVDVB
Size: 4000.79GB <4000787029504 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 61 Predictive Failure Analysis: 0
The scrub has now completed:
scan: scrub repaired 0 in 0 days 02:25:24 with 0 errors on Sat Nov 9 20:28:57 2019
Updated by Joshua M. Clulow over 2 years ago
Testing Notes (Supplemental)
I did a full RTI build, booted it, and performed another scrub to confirm that the final version of the change is OK:
November 11, 2019 at 04:22:24 PM PST
  scan: scrub in progress since Mon Nov 11 11:47:13 2019
        1.19T scanned at 75.6M/s, 1.19T issued at 75.5M/s, 1.19T total
        0 repaired, 99.89% done, 0 days 00:00:17 to go

November 11, 2019 at 04:22:54 PM PST
  scan: scrub repaired 0 in 0 days 04:35:35 with 0 errors on Mon Nov 11 16:22:48 2019
newcastle # uname -v
rti-xhci-0-g64d8df1e16
Updated by Electric Monk over 2 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit 10b633f40f61a97f70236c451b22a1ec8368edb2
commit 10b633f40f61a97f70236c451b22a1ec8368edb2
Author: Joshua M. Clulow <josh@sysmgr.org>
Date:   2019-11-12T20:36:11.000Z

    11952 large USB hard disks experience I/O failures
    Reviewed by: Paul Winder <paul@winders.demon.co.uk>
    Reviewed by: Toomas Soome <tsoome@me.com>
    Reviewed by: C Fraire <cfraire@me.com>
    Approved by: Dan McDonald <danmcd@joyent.com>