Bug #1773
closed
LSI 9200 SAS MPT driver forces synchronous writes to device
100%
Description
I have a NexentaStor 3.x system, but was advised in the #illumos IRC group to report this bug here.
I have a server with 3 HBA controllers, one LSI SAS 9211-8i and two LSI SAS 9200-8e. The two 9200 HBAs are set up with MPIO, and have access to all drives in an external chassis. The 9211 HBA is connected to all SAS storage devices internal to the server.
Hard drives connected to the 9211 HBA work as expected; I haven't seen any issues there. Drives connected to the 9200 HBAs appear to perform every write as if the device had been opened with the O_SYNC flag, even though it wasn't. For example, if I run:
dd if=/dev/zero of=/dev/rdsk/<device> bs=4k count=1000
then a drive connected to the 9200 HBAs gets about 490kB/s throughput. This equals approx. 122 IOPS (490kB/s divided by 4kB per write) for a single 7200 RPM drive. Running the same command on other systems and other hardware with the "oflag=sync" flag shows the same numbers. However, if I connect the same drive to the internal 9211 HBA, the same command gives approx. 70MB/s throughput, so this only happens with drives connected to the 9200 HBAs.
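The IOPS estimate above is just throughput divided by block size. A minimal sketch of that arithmetic, assuming both "k" prefixes mean 1024 (the function name `iops_from_throughput` is hypothetical, not part of any tool mentioned here):

```shell
# Hypothetical helper: approximate IOPS from a throughput and a block size.
# Usage: iops_from_throughput <bytes_per_second> <block_size_bytes>
iops_from_throughput() {
    # Integer division is precise enough for a rough estimate.
    echo $(( $1 / $2 ))
}

# 490 kB/s with 4k blocks, as measured on the 9200 HBAs.
iops_from_throughput $((490 * 1024)) $((4 * 1024))

# 70 MB/s with 4k blocks, as measured on the 9211 HBA.
iops_from_throughput $((70 * 1024 * 1024)) $((4 * 1024))
```

The first call prints 122, in line with the figure quoted above. The second prints 17920, far more than a 7200 RPM drive can sustain if it must commit each 4k write before accepting the next, which suggests the writes on the 9211 path are not being handled one at a time.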
A ZFS file system consisting of 5 drives in a single raidz vdev performs very differently depending on whether it is connected to the 9200 HBAs or the 9211 HBA. When connected to the 9200 HBAs, I get:
# dd if=/dev/zero of=datafile bs=1M count=102400
102400+0 records in
102400+0 records out
107374182400 bytes (107 GB) copied, 2372.2 seconds, 45.3 MB/s
By comparison, the same 5 drive raidz zfs file system connected to the 9211 HBA performs an order of magnitude faster:
# dd if=/dev/zero of=datafile bs=1M count=1024000
102400+0 records in
102400+0 records out
1073741824000 bytes (1.1 TB) copied, 2181.94 seconds, 492 MB/s
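The two rates can be cross-checked from the byte counts and elapsed times that dd printed. A small sketch, assuming dd's "MB" means 10^6 bytes (as the "107 GB" figure for 107374182400 bytes implies) and an awk that supports `-v`; the helper name `mb_per_s` is hypothetical:

```shell
# Hypothetical helper: decimal megabytes per second from bytes and seconds.
# Usage: mb_per_s <bytes> <seconds>
mb_per_s() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

mb_per_s 107374182400 2372.2     # byte count and time from the 9200 run
mb_per_s 1073741824000 2181.94   # byte count and time from the 9211 run
```

This prints 45.3 and 492.1, which agrees with the rates dd reported and works out to a ratio of roughly 10.9 between the two HBAs.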
I have also tried booting Linux on the same hardware. Linux doesn't support ZFS, so I can't test the ZFS file system, but I can benchmark drives individually. When running dd with 4k blocks directly to the drives, as in the first example, I get 120-150MB/s for each drive, so the same slowness doesn't appear on the Linux system. To me, this again indicates a driver issue.
lspci -v -v (run under Nexenta, not Linux) shows the following SAS controllers:
02:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
    Subsystem: LSI Logic / Symbios Logic Unknown device 3020
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 0, Cache Line Size: 256 bytes
    Interrupt: pin A routed to IRQ 11
    Region 0: I/O ports at a000 [disabled]
    Region 1: Memory at fbb3c000 (64-bit, non-prefetchable)
    Region 3: Memory at fbb40000 (64-bit, non-prefetchable)
    Expansion ROM at fbb80000 [disabled]
    Capabilities: [50] Power Management version 3
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [68] Express Endpoint IRQ 0
        Device: Supported: MaxPayload 4096 bytes, PhantFunc 0, ExtTag+
        Device: Latency L0s <64ns, L1 <1us
        Device: AtnBtn- AtnInd- PwrInd-
        Device: Errors: Correctable+ Non-Fatal+ Fatal+ Unsupported-
        Device: RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
        Device: MaxPayload 256 bytes, MaxReadReq 512 bytes
        Link: Supported Speed unknown, Width x8, ASPM L0s, Port 0
        Link: Latency L0s <64ns, L1 <1us
        Link: ASPM Disabled RCB 64 bytes CommClk- ExtSynch-
        Link: Speed unknown, Width x8
    Capabilities: [d0] Vital Product Data
    Capabilities: [a8] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
        Address: 00000000fee20000 Data: 0040
    Capabilities: [c0] MSI-X: Enable- Mask- TabSize=15
        Vector table: BAR=1 offset=00002000
        PBA: BAR=1 offset=00003800

03:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
    Subsystem: LSI Logic / Symbios Logic Unknown device 3080
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 0, Cache Line Size: 256 bytes
    Interrupt: pin A routed to IRQ 11
    Region 0: I/O ports at b000 [disabled]
    Region 1: Memory at fbc3c000 (64-bit, non-prefetchable)
    Region 3: Memory at fbc40000 (64-bit, non-prefetchable)
    Expansion ROM at fbc80000 [disabled]
    Capabilities: [50] Power Management version 3
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [68] Express Endpoint IRQ 0
        Device: Supported: MaxPayload 4096 bytes, PhantFunc 0, ExtTag+
        Device: Latency L0s <64ns, L1 <1us
        Device: AtnBtn- AtnInd- PwrInd-
        Device: Errors: Correctable+ Non-Fatal+ Fatal+ Unsupported-
        Device: RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
        Device: MaxPayload 256 bytes, MaxReadReq 512 bytes
        Link: Supported Speed unknown, Width x8, ASPM L0s, Port 0
        Link: Latency L0s <64ns, L1 <1us
        Link: ASPM Disabled RCB 64 bytes CommClk- ExtSynch-
        Link: Speed unknown, Width x8
    Capabilities: [d0] Vital Product Data
    Capabilities: [a8] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
        Address: 00000000fee22000 Data: 0043
    Capabilities: [c0] MSI-X: Enable- Mask- TabSize=15
        Vector table: BAR=1 offset=00002000
        PBA: BAR=1 offset=00003800
All measurements and tests considered, I believe I can rule out any hardware issues, and I believe this is a bug in the driver for the SAS controller, or alternatively an issue introduced by MPIO. There was no MPIO in use when I tested with Linux, and there is no MPIO on the 9211 HBA.
Updated by Ketil Froyn almost 12 years ago
I forgot to include:
# uname -a
SunOS nexenta 5.11 NexentaOS_134f i86pc i386 i86pc Solaris
Updated by Ketil Froyn almost 12 years ago
- % Done changed from 0 to 100
This has turned out to be an MPIO problem, and was worked around by changing /kernel/drv/scsi_vhci.conf:
load-balance="round-robin";
to
load-balance="logical-block";
Apparently some HBAs, like my LSI HBAs, don't handle round-robin well, and logical-block is a good alternative.
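For anyone applying the same workaround, the edit can be scripted. The following is only a sketch: the function name `set_load_balance` is hypothetical, it assumes the policy sits on its own uncommented `load-balance="...";` line exactly as quoted above, and it keeps a `.bak` copy of the file first.

```shell
# Hypothetical helper: switch the load-balance policy in a scsi_vhci.conf file.
# Usage: set_load_balance <conf_file> <policy>
set_load_balance() {
    conf="$1"
    policy="$2"
    # Keep a backup next to the original before changing anything.
    cp "$conf" "$conf.bak" || return 1
    # Rewrite only the uncommented load-balance line. Reading from the backup
    # and writing to the original avoids relying on in-place sed support.
    sed "s/^load-balance=\"[^\"]*\";/load-balance=\"$policy\";/" \
        "$conf.bak" > "$conf"
}

# Example (run as root on the affected host):
# set_load_balance /kernel/drv/scsi_vhci.conf logical-block
```

The report does not say how the change was activated. Driver .conf changes usually need a reboot to take effect, so treat that step as an assumption and verify it on your own system.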
Updated by Dan McDonald over 10 years ago
- Status changed from New to Resolved
Closing this as "resolved" since the filer found a solution to the issue.