Bug #1773

LSI 9200 SAS MPT driver forces synchronous writes to device

Added by Ketil Froyn almost 9 years ago. Updated over 7 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: -
Start date: 2011-11-14
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:

Description

I have a NexentaStor 3.x system, but was advised in the #illumos IRC group to report this bug here.

I have a server with 3 HBA controllers, one LSI SAS 9211-8i and two LSI SAS 9200-8e. The two 9200 HBAs are set up with MPIO, and have access to all drives in an external chassis. The 9211 HBA is connected to all SAS storage devices internal to the server.

Hard drives connected to the 9211 HBA work as expected; I haven't seen any issues there. Drives connected to the 9200 HBAs appear to perform every write as if the device had been opened with the O_SYNC flag, even though it wasn't. For example, if I run:

dd if=/dev/zero of=/dev/rdsk/<device> bs=4k count=1000

then a drive connected to the 9200 HBAs will get 490kB/s throughput. This equals approx. 122 IOPS for a single 7200 RPM drive. Running the same command on other systems/other hardware with the "oflag=sync" flag will show the same numbers. However, if I connect the same drive to the internal 9211 HBA, the same command gives approx. 70MB/s throughput, so this only happens with drives connected to the 9200 HBAs.
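The arithmetic behind the 122 IOPS figure can be sketched as follows. It assumes that "kB" in the report means 1024 bytes and that bs=4k means 4096-byte writes; neither is stated explicitly. The result lands close to the 120 revolutions per second of a 7200 RPM platter, which is roughly what one committed write per revolution would look like.

```shell
#!/bin/sh
# Convert the dd throughput figure into write operations per second.
# Assumption: "kB" means 1024 bytes and each write is 4096 bytes.
throughput_kb=490      # kB/s seen on the 9200 HBAs
block_bytes=4096       # bs=4k

iops=$(awk -v t="$throughput_kb" -v b="$block_bytes" \
    'BEGIN { printf "%.1f", t * 1024 / b }')

# A 7200 RPM platter completes 7200 / 60 revolutions per second.
revs_per_sec=$(awk 'BEGIN { printf "%d", 7200 / 60 }')

echo "write IOPS:      $iops"
echo "revolutions/sec: $revs_per_sec"
```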

A ZFS file system consisting of 5 drives in a single raidz vdev performs very differently depending on whether it is connected to the 9200 HBAs or the 9211 HBA. When connected to the 9200 HBAs, I get:

# dd if=/dev/zero of=datafile bs=1M count=102400
102400+0 records in
102400+0 records out
107374182400 bytes (107 GB) copied, 2372.2 seconds, 45.3 MB/s

By comparison, the same 5 drive raidz zfs file system connected to the 9211 HBA performs an order of magnitude faster:

# dd if=/dev/zero of=datafile bs=1M count=1024000
1024000+0 records in
1024000+0 records out
1073741824000 bytes (1.1 TB) copied, 2181.94 seconds, 492 MB/s
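The two runs wrote different amounts of data, so the rates are the meaningful comparison rather than the elapsed times. As a quick check, this sketch recomputes MB/s from the byte and second counts in the two dd summaries. The helper name mbps is made up for illustration, and MB is taken as 10^6 bytes, which matches the figures dd printed.

```shell
#!/bin/sh
# Recompute MB/s (10^6 bytes) from the byte and second counts
# printed in the two dd summaries.
mbps() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f", bytes / secs / 1000000 }'
}

slow=$(mbps 107374182400 2372.2)     # raidz on the 9200 HBAs
fast=$(mbps 1073741824000 2181.94)   # raidz on the 9211 HBA
ratio=$(awk -v f="$fast" -v s="$slow" 'BEGIN { printf "%.1f", f / s }')

echo "9200: $slow MB/s, 9211: $fast MB/s, ratio: $ratio"
```

The ratio comes out at a little under 11, consistent with the "order of magnitude" description.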

I have also tried booting Linux on the same hardware. Linux doesn't support ZFS, so I can't test the ZFS file system, but I can benchmark drives individually. When running dd with 4k blocks directly to the drives, as in the first example, I get 120-150MB/s for each drive, so the same slowness doesn't appear on the Linux system. To me, this again indicates a driver issue.

lspci -v -v (run under Nexenta, not Linux) shows the following SAS controllers:

02:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
        Subsystem: LSI Logic / Symbios Logic Unknown device 3020
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0, Cache Line Size: 256 bytes
        Interrupt: pin A routed to IRQ 11
        Region 0: I/O ports at a000 [disabled]
        Region 1: Memory at fbb3c000 (64-bit, non-prefetchable)
        Region 3: Memory at fbb40000 (64-bit, non-prefetchable)
        Expansion ROM at fbb80000 [disabled]
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [68] Express Endpoint IRQ 0
                Device: Supported: MaxPayload 4096 bytes, PhantFunc 0, ExtTag+
                Device: Latency L0s <64ns, L1 <1us
                Device: AtnBtn- AtnInd- PwrInd-
                Device: Errors: Correctable+ Non-Fatal+ Fatal+ Unsupported-
                Device: RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
                Device: MaxPayload 256 bytes, MaxReadReq 512 bytes
                Link: Supported Speed unknown, Width x8, ASPM L0s, Port 0
                Link: Latency L0s <64ns, L1 <1us
                Link: ASPM Disabled RCB 64 bytes CommClk- ExtSynch-
                Link: Speed unknown, Width x8
        Capabilities: [d0] Vital Product Data
        Capabilities: [a8] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
                Address: 00000000fee20000  Data: 0040
        Capabilities: [c0] MSI-X: Enable- Mask- TabSize=15
                Vector table: BAR=1 offset=00002000
                PBA: BAR=1 offset=00003800

03:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
        Subsystem: LSI Logic / Symbios Logic Unknown device 3080
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0, Cache Line Size: 256 bytes
        Interrupt: pin A routed to IRQ 11
        Region 0: I/O ports at b000 [disabled]
        Region 1: Memory at fbc3c000 (64-bit, non-prefetchable)
        Region 3: Memory at fbc40000 (64-bit, non-prefetchable)
        Expansion ROM at fbc80000 [disabled]
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [68] Express Endpoint IRQ 0
                Device: Supported: MaxPayload 4096 bytes, PhantFunc 0, ExtTag+
                Device: Latency L0s <64ns, L1 <1us
                Device: AtnBtn- AtnInd- PwrInd-
                Device: Errors: Correctable+ Non-Fatal+ Fatal+ Unsupported-
                Device: RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
                Device: MaxPayload 256 bytes, MaxReadReq 512 bytes
                Link: Supported Speed unknown, Width x8, ASPM L0s, Port 0
                Link: Latency L0s <64ns, L1 <1us
                Link: ASPM Disabled RCB 64 bytes CommClk- ExtSynch-
                Link: Speed unknown, Width x8
        Capabilities: [d0] Vital Product Data
        Capabilities: [a8] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
                Address: 00000000fee22000  Data: 0043
        Capabilities: [c0] MSI-X: Enable- Mask- TabSize=15
                Vector table: BAR=1 offset=00002000
                PBA: BAR=1 offset=00003800

All measurements and tests considered, I believe I can rule out any hardware issues, and I believe this is a bug in the driver for the SAS controller, or alternatively an issue introduced by MPIO. There was no MPIO in use when I tested with Linux, and there is no MPIO on the 9211 HBA.

History

#1

Updated by Ketil Froyn almost 9 years ago

I forgot to include:

# uname -a
SunOS nexenta 5.11 NexentaOS_134f i86pc i386 i86pc Solaris

#2

Updated by Ketil Froyn almost 9 years ago

  • % Done changed from 0 to 100

This has turned out to be an MPIO problem, and was worked around by changing /kernel/drv/scsi_vhci.conf:

load-balance="round-robin";

to

load-balance="logical-block";

Apparently some HBAs, like my LSI HBAs, don't handle round-robin well, and logical-block is a good alternative.
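A minimal sketch of the edit described above, applied to a scratch copy rather than the live file. The temporary path and the sample file contents are stand-ins for illustration; only the load-balance line is taken from the report.

```shell
#!/bin/sh
# Sketch: rewrite the load-balance setting on a COPY of the config.
# The scratch file and its contents are illustrative stand-ins;
# the real file is /kernel/drv/scsi_vhci.conf.
conf=$(mktemp)
printf '%s\n' 'name="scsi_vhci" class="root";' \
              'load-balance="round-robin";' > "$conf"

# Write to a new file instead of editing in place, so the
# original remains available as a backup.
sed 's/^load-balance="round-robin";/load-balance="logical-block";/' \
    "$conf" > "$conf.new"

grep '^load-balance=' "$conf.new"
```

On a real system the same substitution would be made in /kernel/drv/scsi_vhci.conf after taking a backup. The report does not say how the change was activated; a reboot is probably required, but treat that as an assumption.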

#3

Updated by Dan McDonald over 7 years ago

  • Status changed from New to Resolved

Closing this as "resolved" since the filer found a solution to the issue.
