Bug #11952

closed

large USB hard disks experience I/O failures

Added by Joshua M. Clulow over 2 years ago. Updated over 2 years ago.

Status: Closed
Priority: Normal
Category: kernel
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Hard
Tags:
Gerrit CR:

Description

Using a large USB hard disk (most likely larger than 2TB) can result in DMA errors and ultimately a stuck device; e.g.,

Nov  9 09:21:36 newcastle rootnex: [ID 561485 kern.warning] WARNING: xhci: coding error detected, the driver is using ddi_dma_attr(9S) incorrectly. There is a small risk of data corruption in particular with large I/Os. The driver should be replaced with a corrected version for proper system operation. To disable this warning, add 'set rootnex:rootnex_bind_warn=0' to /etc/system(4).
Nov  9 09:21:36 newcastle xhci: [ID 197104 kern.info] NOTICE: xhci0: failed to bind DMA memory: -3
Nov  9 09:21:36 newcastle xhci: [ID 902155 kern.info] NOTICE: xhci0: xhci stop endpoint command (2)/slot (3) in wrong state: 19
Nov  9 09:21:36 newcastle xhci: [ID 617155 kern.info] NOTICE: xhci0: endpoint is in state 3
Nov  9 09:21:36 newcastle xhci: [ID 902155 kern.info] NOTICE: xhci0: xhci stop endpoint command (3)/slot (3) in wrong state: 19
Nov  9 09:21:36 newcastle xhci: [ID 617155 kern.info] NOTICE: xhci0: endpoint is in state 3
Nov  9 09:21:36 newcastle scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,2064@14/storage@11/disk@0,0 (sd1):
Nov  9 09:21:36 newcastle       Command failed to complete...Device is gone

This turns out to be somewhat complicated. The device shown above presents 512-byte sectors (whether emulated or not, under the covers), which means the logical block address (LBA) range spans from 0x0 up to 0x1d1a94a20 (7,812,500,000 blocks, or 4 TB). Critically, this value does not fit in the 32-bit LBA field of a SCSI READ (12) or WRITE (12) (aka SCMD_READ_G5 or SCMD_WRITE_G5) command. Block addresses beyond the 32-bit boundary must be encoded in a command that supports a wider LBA, so sd moves up to READ (16) and WRITE (16) (aka SCMD_READ_G4 and SCMD_WRITE_G4).
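As a rough illustration of the field-width constraint described above, the following sketch picks the smallest READ/WRITE CDB whose fields can hold a request. It is not the sd driver's code: pick_cdb_len() and the cdb_len enum are hypothetical names, and the real selection policy in sd may differ (this report observes Group 5 commands below the boundary, for example). The field widths themselves (32-bit LBA in the 10- and 12-byte commands, 64-bit LBA in the 16-byte command) come from the SCSI command definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the sd driver. */
enum cdb_len { CDB_10 = 10, CDB_12 = 12, CDB_16 = 16 };

/*
 * Return the smallest READ/WRITE CDB length whose LBA and transfer
 * length fields can encode the request:
 *   READ/WRITE (10): 32-bit LBA, 16-bit transfer length
 *   READ/WRITE (12): 32-bit LBA, 32-bit transfer length
 *   READ/WRITE (16): 64-bit LBA, 32-bit transfer length
 */
static enum cdb_len
pick_cdb_len(uint64_t lba, uint32_t nblocks)
{
	assert(nblocks > 0);

	if (lba > UINT32_MAX)
		return (CDB_16);	/* starting LBA needs more than 32 bits */
	if (nblocks > UINT16_MAX)
		return (CDB_12);	/* length needs more than 16 bits */
	return (CDB_10);
}
```

With the disk above, a request starting near the last block (0x1d1a94a20 - 1) can only be expressed as a 16-byte command, while one starting at 0xffffffff still fits a 32-bit LBA field.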

USB host controllers have a relatively small transfer size limit for an individual request. As such, we have logic in scsa2usb_rw_transport() to break a larger SCSI read or write command into multiple smaller commands, issued in sequence, to satisfy the upstack I/O request. Today this logic only covers the Group 1 and Group 5 SCSI read and write commands, leaving out the Group 4 commands, which only come into play for larger logical block addresses. This has only become an issue since the advent of drives with 512-byte sectors and a capacity larger than about 2TB.
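The splitting described above can be sketched for the 16-byte commands roughly as follows. This is a minimal illustration under stated assumptions, not the scsa2usb implementation: struct rw16_chunk, split_rw16() and the max_blocks parameter are hypothetical, and the real driver issues the pieces one after another and also has to manage DMA resources and completion state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one READ (16)/WRITE (16) sub-command. */
struct rw16_chunk {
	uint64_t lba;		/* starting logical block of this piece */
	uint32_t nblocks;	/* number of blocks in this piece */
};

/*
 * Split the block range [lba, lba + nblocks) into pieces of at most
 * max_blocks each.  Up to 'cap' pieces are stored in 'out'; the return
 * value is the total number of pieces the request needs.
 */
static size_t
split_rw16(uint64_t lba, uint32_t nblocks, uint32_t max_blocks,
    struct rw16_chunk *out, size_t cap)
{
	size_t n = 0;

	assert(max_blocks > 0);

	while (nblocks > 0) {
		uint32_t len = (nblocks < max_blocks) ? nblocks : max_blocks;

		if (n < cap) {
			out[n].lba = lba;
			out[n].nblocks = len;
		}
		n++;
		lba += len;	/* 64-bit, so no wrap at the 2^32 boundary */
		nblocks -= len;
	}
	return (n);
}
```

For example, a 300-block request starting 100 blocks below the 32-bit boundary with a 128-block limit yields three pieces, the second and third of which start above 0xffffffff. Keeping the LBA in a 64-bit variable is the detail that matters for the Group 4 case.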

The intermittent nature of the I/O failures (e.g., a pool will import, but heavy I/O will subsequently fail and take the pool offline) seems to be on account of at least two things:

  • ZFS issuing I/O requests that fit within the transfer limit at least during import -- but not later
  • I/O requests for logical addresses prior to the 2TB boundary appear to be sent as Group 5 commands; only writes to the latter half of a large disk will be Group 4
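Read together, the two points above suggest that a request only runs into trouble when it is both addressed beyond the 32-bit LBA range and larger than the controller's transfer limit. The predicate below is a sketch of that reading, not code from the driver; hits_unsplit_path() and max_blocks are hypothetical, and the actual limit depends on the host controller.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical predicate: true when a request would be sent as a
 * Group 4 command (LBA wider than 32 bits) and would also need the
 * splitting that is missing for Group 4 commands.
 */
static bool
hits_unsplit_path(uint64_t lba, uint32_t nblocks, uint32_t max_blocks)
{
	bool wide_lba = lba > UINT32_MAX;	/* beyond 2^32 blocks */
	bool too_big = nblocks > max_blocks;	/* exceeds transfer limit */

	return (wide_lba && too_big);
}
```

Small requests anywhere on the disk, and large requests below the boundary, both return false here, which is consistent with a pool that imports cleanly and only fails later under heavy I/O.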