Bug #1069

IO hang with bad disk on mpt and mpt_sas

Added by Alasdair Lumsden about 8 years ago. Updated almost 8 years ago.

Status: New
Priority: High
Assignee: -
Category: kernel
Start date: 2011-05-26
Due date:
% Done: 0%
Estimated time:
Difficulty: Hard
Tags: needs-triage

Description

Hi,

On a storage server, I've encountered a failure mode where a single disk has hung with a 100% busy time reported in iostat. As a consequence all IO to the zpool this disk is a member of has come to a complete standstill, with all other disks in the pool reporting 0 activity.

I've witnessed this behaviour on multiple different systems before, with both SATA and SAS disks, on both mpt and mpt_sas. I've spoken to others, including George Wilson, who has seen this behaviour too. It's quite a serious issue, as it can crop up at any time on a production storage server, and if/when it does, it causes an outage that can only be resolved by pulling the disk or rebooting the box.

The storage node consists of:

root ~ (san01.ixlon1): /usr/bin/uname -a
SunOS san01.ixlon1.everycity.co.uk 5.11 oi_148 i86pc i386 i86pc

Supermicro X8DTH-6F Dual Intel E5504, 32GB RAM
6 x LSI 3081E-R mpt based cards
48 x Western Digital SATA disks

"iostat -x 2" output

                 extended device statistics                 
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b 
...
sd27      0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd28      0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd29      0.0    0.0    0.0    0.0  0.0  4.0    0.0   0 100 
sd30      0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd31      0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
...

All IO to the pool has hung, so commands like "zpool list" stick.
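
Since those hung commands cannot be interrupted, their kernel stacks can still be examined from mdb. A minimal sketch, assuming a hung "zpool list" is still running (::pgrep, ::walk thread and ::findstack are standard kernel dcmds):

echo "::pgrep zpool | ::walk thread | ::findstack -v" | mdb -k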

"iostat -E" output shows no drive errors at all:

http://linux01.everycity.co.uk/~alasdair/iostat_e.txt

"*sd_state::walk softstate | ::print -t struct sd_lun" mdb output:

http://linux01.everycity.co.uk/~alasdair/sd_lun_state.txt

No FMA events have been generated:

root ~ (san01.ixlon1): fmdump -eV
TIME CLASS
fmdump: warning: /var/fm/fmd/errlog is empty

sd_io_time has been tuned down to 7 seconds, with 3 retries configured for the Western Digital drives; since several hours have now passed, sd_io_time is clearly being ignored:

set sd:sd_io_time=7 (/etc/system)
sd-config-list = "ATA WDC WD7501AALS-0", "retries-timeout:3"; (/kernel/drv/sd.conf)

root ~ (san01.ixlon1): echo "sd_io_time::print" | mdb -k
0x7
root ~ (san01.ixlon1): echo "::walk sd_state | ::grep '.!=0' | ::sd_state" | mdb -k | egrep "^un|un_retry_count|un_cmd_timeout" 
un: ffffff090f43c640
   un_retry_count = 0x3
   un_cmd_timeout = 0x7
un: ffffff09184a5940
   un_retry_count = 0x3
   un_cmd_timeout = 0x7
... (all report 0x3 (or 0x5 for some devices) and 0x7)
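
For reference, with these values a stuck command should be abandoned after roughly un_cmd_timeout * (un_retry_count + 1) = 7 * 4 = 28 seconds (ignoring the separate reset/victim retry paths), yet the hang has persisted for hours.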

The zpool layout consists of:

zpool create -f data \
mirror c1t0d0 c7t0d0 \
mirror c1t1d0 c7t1d0 \
mirror c1t2d0 c7t2d0 \
mirror c1t3d0 c7t3d0 \
mirror c1t4d0 c7t4d0 \
mirror c1t5d0 c7t5d0 \
mirror c1t6d0 c7t6d0 \
mirror c1t7d0 c7t7d0 \
mirror c2t0d0 c8t0d0 \
mirror c2t1d0 c8t1d0 \
mirror c2t2d0 c8t2d0 \
mirror c2t3d0 c8t3d0 \
mirror c2t4d0 c8t4d0 \
mirror c2t5d0 c8t5d0 \
mirror c2t6d0 c8t6d0 \
mirror c2t7d0 c8t7d0 \
mirror c3t8d0 c9t0d0 \
mirror c3t9d0 c9t1d0 \
mirror c3t10d0 c9t2d0 \
mirror c3t11d0 c9t3d0 \
mirror c3t12d0 c9t4d0 \
mirror c3t13d0 c9t5d0 \
mirror c3t14d0 c9t6d0 \
spare c3t15d0 c9t7d0 \
log c3t0d0 \
c3t1d0 \
c3t2d0 \
c3t3d0

Some general mdb investigation:

::walk zio_root | ::zio -r
ADDRESS                                  TYPE  STAGE            WAITER

ffffff0918ca4cd0                         NULL  CHECKSUM_VERIFY  ffffff003da48c40
ffffff0915a27c80                        WRITE VDEV_IO_START    -
ffffff0918eafcc0                        WRITE VDEV_IO_START    -
ffffff0918d09968                        WRITE VDEV_IO_START    -
ffffff0918aadc98                        WRITE VDEV_IO_START    -
ffffff09195eb358                         NULL  OPEN             -
ffffff090dfecc88                         NULL  OPEN             -

ffffff0918ca4cd0::zio
ADDRESS                                  TYPE  STAGE            WAITER          
ffffff0918ca4cd0                         NULL  CHECKSUM_VERIFY  ffffff003da48c40
ffffff0918ca4cd0::zio -r
ADDRESS                                  TYPE  STAGE            WAITER          
ffffff0918ca4cd0                         NULL  CHECKSUM_VERIFY  ffffff003da48c40
ffffff0915a27c80                        WRITE VDEV_IO_START    -
ffffff0918eafcc0                        WRITE VDEV_IO_START    -
ffffff0918d09968                        WRITE VDEV_IO_START    -
ffffff0918aadc98                        WRITE VDEV_IO_START    -

ffffff0918ca4cd0::print -t struct zio
struct zio {
   zbookmark_t io_bookmark = {
       uint64_t zb_objset = 0
       uint64_t zb_object = 0
       int64_t zb_level = 0
       uint64_t zb_blkid = 0
   }
   zio_prop_t io_prop = {
       enum zio_checksum zp_checksum = 0 (ZIO_CHECKSUM_INHERIT)
       enum zio_compress zp_compress = 0 (ZIO_COMPRESS_INHERIT)
       dmu_object_type_t zp_type = 0 (DMU_OT_NONE)
       uint8_t zp_level = 0
       uint8_t zp_copies = 0
       uint8_t zp_dedup = 0
       uint8_t zp_dedup_verify = 0
   }
   zio_type_t io_type = 0 (ZIO_TYPE_NULL)
   enum zio_child io_child_type = 3 (ZIO_CHILD_LOGICAL)
   int io_cmd = 0
   uint8_t io_priority = 0
   uint8_t io_reexecute = 0
   uint8_t [2] io_state = [ 0x1, 0 ]
   uint64_t io_txg = 0
   spa_t *io_spa = 0xffffff090e71a580
   blkptr_t *io_bp = 0
   blkptr_t *io_bp_override = 0
   blkptr_t io_bp_copy = {
       dva_t [3] blk_dva = [
           dva_t {
               uint64_t [2] dva_word = [ 0, 0 ]
           },
           dva_t {
               uint64_t [2] dva_word = [ 0, 0 ]
           },
           dva_t {
               uint64_t [2] dva_word = [ 0, 0 ]
           },
       ]
       uint64_t blk_prop = 0         
       uint64_t [2] blk_pad = [ 0, 0 ]
       uint64_t blk_phys_birth = 0
       uint64_t blk_birth = 0
       uint64_t blk_fill = 0
       zio_cksum_t blk_cksum = {
           uint64_t [4] zc_word = [ 0, 0, 0, 0 ]
       }
   }
   list_t io_parent_list = {
       size_t list_size = 0x30
       size_t list_offset = 0x10
       struct list_node list_head = {
           struct list_node *list_next = 0xffffff0918ca4dc0
           struct list_node *list_prev = 0xffffff0918ca4dc0
       }
   }
   list_t io_child_list = {
       size_t list_size = 0x30
       size_t list_offset = 0x20
       struct list_node list_head = {
           struct list_node *list_next = 0xffffff0918da5788
           struct list_node *list_prev = 0xffffff090e24c380
       }
   }
   zio_link_t *io_walk_link = 0
   zio_t *io_logical = 0
   zio_transform_t *io_transform_stack = 0
   zio_done_func_t *io_ready = 0
   zio_done_func_t *io_done = 0
   void *io_private = 0xffffff003da489e8
   int64_t io_prev_space_delta = 0
   blkptr_t io_bp_orig = {
       dva_t [3] blk_dva = [
           dva_t {
               uint64_t [2] dva_word = [ 0, 0 ]
           },
           dva_t {
               uint64_t [2] dva_word = [ 0, 0 ]
           },
           dva_t {
               uint64_t [2] dva_word = [ 0, 0 ]
           },
       ]
       uint64_t blk_prop = 0
       uint64_t [2] blk_pad = [ 0, 0 ]
       uint64_t blk_phys_birth = 0
       uint64_t blk_birth = 0
       uint64_t blk_fill = 0
       zio_cksum_t blk_cksum = {
           uint64_t [4] zc_word = [ 0, 0, 0, 0 ]
       }
   }
   void *io_data = 0
   void *io_orig_data = 0
   uint64_t io_size = 0
   uint64_t io_orig_size = 0
   vdev_t *io_vd = 0
   void *io_vsd = 0
   const zio_vsd_ops_t *io_vsd_ops = 0
   uint64_t io_offset = 0
   uint64_t io_deadline = 0
   avl_node_t io_offset_node = {
       struct avl_node *[2] avl_child = [ 0, 0 ]
       uintptr_t avl_pcb = 0
   }
   avl_node_t io_deadline_node = {
       struct avl_node *[2] avl_child = [ 0, 0 ]
       uintptr_t avl_pcb = 0
   }
   avl_tree_t *io_vdev_tree = 0
   enum zio_flag io_flags = 0x140 (ZIO_FLAG_{CANFAIL|CONFIG_WRITER})
   enum zio_stage io_stage = 0x80000 (ZIO_STAGE_CHECKSUM_VERIFY)
   enum zio_stage io_pipeline = 0x108000 (ZIO_STAGE_{READY|DONE})
   enum zio_flag io_orig_flags = 0x140 (ZIO_FLAG_{CANFAIL|CONFIG_WRITER})
   enum zio_stage io_orig_stage = 0x1 (ZIO_STAGE_OPEN)
   enum zio_stage io_orig_pipeline = 0x108000 (ZIO_STAGE_{READY|DONE})
   int io_error = 0
   int [4] io_child_error = [ 0, 0, 0, 0 ]
   unsigned long [4][2] io_children = [
       unsigned long [2] [ 0, 0x4 ]
       unsigned long [2] [ 0, 0 ]
       unsigned long [2] [ 0, 0 ]
       unsigned long [2] [ 0, 0 ]
   ]
   uint64_t io_child_count = 0x4
   uint64_t io_parent_count = 0
   uint64_t *io_stall = 0xffffff0918ca4f60
   zio_t *io_gang_leader = 0
   zio_gang_node_t *io_gang_tree = 0
   void *io_executor = 0xffffff003da48c40
   void *io_waiter = 0xffffff003da48c40
   kmutex_t io_lock = {
       void *[1] _opaque = [ 0 ]
   }
   kcondvar_t io_cv = {
       ushort_t _opaque = 0x1
   }
   zio_cksum_report_t *io_cksum_report = 0
   uint64_t io_ena = 0
}

ffffff0918ca4cd0::print -t struct zio io_waiter
void *io_waiter = 0xffffff003da48c40

0xffffff003da48c40::findstack
stack pointer for thread ffffff003da48c40: ffffff003da48920
[ ffffff003da48920 _resume_from_idle+0xf1() ]
 ffffff003da48950 swtch+0x145()
 ffffff003da48980 cv_wait+0x61()
 ffffff003da489c0 zio_wait+0x5d()
 ffffff003da48a40 vdev_uberblock_sync_list+0x163()
 ffffff003da48ad0 vdev_config_sync+0x129()
 ffffff003da48b80 spa_sync+0x5cd()
 ffffff003da48c20 txg_sync_thread+0x247()
 ffffff003da48c30 thread_start+8()
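
To check at the block I/O level whether commands are still being issued to the suspect device but never completing, a DTrace io-provider one-liner can be used. This is only a sketch; "sd29" is taken from the iostat output above and should be replaced with whichever instance is stuck:

dtrace -qn '
io:::start /args[1]->dev_statname == "sd29"/ { starts++; }
io:::done  /args[1]->dev_statname == "sd29"/ { dones++; }
tick-10s   { printf("started %d, completed %d\n", starts, dones); starts = 0; dones = 0; }'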

George Wilson and Gordon Ross took a look at the box; here's the discussion between us from #illumos:

http://linux01.everycity.co.uk/~alasdair/illumos_log.txt

I can provide SSH login to this server (it has a public IP), but it will need to be returned to service within 24-48 hours, so if anyone would like to take a look, please get in touch at alasdairrr gmail com and I'll set up an account.

Cheers,

Alasdair


Related issues

Related to illumos gate - Bug #1343: mpt_sas sometimes stalls forever (New, 2011-08-05)
Related to illumos gate - Bug #1032: mpt_sas driver bottlenecks badly with single misbehaving disk on controller (New, 2011-05-14)

History

#1

Updated by George Wilson about 8 years ago

I looked into this some more and I see that mpt_sas doesn't think it has any more work outstanding:

> ::mptsas
        mptsas_t inst ncmds suspend  power
================================================================================
ffffff0915263000    0     0       0 ON=D0 

But sd still thinks there are outstanding commands:

un: ffffff090f43d2c0
--------------
{
    un_sd = 0xffffff09167c3500
    un_rqs_bp = 0xffffff090fe41580
    un_rqs_pktp = 0xffffff0918ab6478
    un_sense_isbusy = 0
    un_buf_chain_type = 0x1
    un_uscsi_chain_type = 0x8
    un_direct_chain_type = 0x8
    un_priority_chain_type = 0x9
    un_waitq_headp = 0
    un_waitq_tailp = 0
    un_retry_bp = 0
    un_retry_statp = 0
    un_xbuf_attr = 0xffffff090e093800
    un_sys_blocksize = 0x200
    un_tgt_blocksize = 0x200
    un_phy_blocksize = 0x200
    un_blockcount = 0x575466f0
    un_ctype = 0x2
    un_node_type = 0xfffffffff8b46750 "ddi_block:channel" 
    un_interconnect_type = 0x4
    un_notready_retry_count = 0x2
    un_busy_retry_count = 0x5
    un_retry_count = 0x3
    un_victim_retry_count = 0xa
    un_reset_retry_count = 0x2
    un_reserve_release_time = 0x5
    un_reservation_type = 0x1
    un_max_xfer_size = 0x100000
    un_partial_dma_supported = 0x1
    un_buf_breakup_supported = 0
    un_mincdb = 0
    un_maxcdb = 0x3
    un_max_hba_cdb = 0x10
    un_status_len = 0x20
    un_pkt_flags = 0x40000
    un_cmd_timeout = 0x7
    un_uscsi_timeout = 0x7
    un_busy_timeout = 0x1f4
    un_state = 0
    un_last_state = 0
    un_last_pkt_reason = 0
    un_tagflags = 0x4000
    un_resvd_status = 0
    un_detach_count = 0
    un_layer_count = 0
    un_opens_in_progress = 0x1
    un_semoclose = {
        _opaque = [ 0, 0 ]
    }
    un_ncmds_in_driver = 0x5
    un_ncmds_in_transport = 0x5
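
To find which LUN is holding the stuck commands without dumping every softstate by hand, the same walker can print just the command counters for every instance (a sketch built from the dcmds already used above):

echo "*sd_state::walk softstate | ::print -t struct sd_lun un_sd un_ncmds_in_driver un_ncmds_in_transport" | mdb -k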

We may have to enable MPTSAS_DEBUG or run a DEBUG module to get additional information from mpt_sas.

#2

Updated by George Wilson about 8 years ago

This system has both mpt and mpt_sas but the drive in question is using mpt. Looking at mpt tells a very different story:

ffffff0910ee8b00    3     0       0   1024  ON=D0 

                 targ         wwn      ncmds throttle dr_flag  timeout  dups
                 ---------------------------------------------------------------
                    0 83f70202e04e0150   0     MAX   INACTIVE   0/20    0/0
                    1 b69e5857e04e0150   0     MAX   INACTIVE   0/20    0/0
                    2 9742aeace04e0150   0     MAX   INACTIVE   0/20    0/0
                    3 4343aeace04e0150  -1   DRAIN   INACTIVE   0/20    0/0
                    4 1333aeace04e0150   0     MAX   INACTIVE   0/20    0/0
                    5 c8b15857e04e0150   0     MAX   INACTIVE   0/20    0/0
                    6 9e555557e04e0150   0     MAX   INACTIVE   0/20    0/0
                    7 c3980202e04e0150   0     MAX   INACTIVE   0/20    0/0

ffffff0910ee8b00    3     0       0   1024  ON=D0 

                    mpt.  slot               mpt_slots     slot
                 m_ncmds total targ throttle m_t_ncmds targ_tot wq dq
                 ----------------------------------------------------
                       0     0               total  -1    !=  0  6  0

ffffff0919b6d7e0 wait n/a    0      1860          4002 ffffff0919b6d700   3,0  
 [ 0a 00 02 70 02 00 ]
ffffff0918e46140 wait n/a    0      1860          4002 ffffff0918e46060   3,0  
 [ 0a 00 04 70 02 00 ]
ffffff0918e90c60 wait n/a    0      1860          4002 ffffff0918e90b80   3,0  
 [ 2a 00 57 54 22 70 00 00 02 00 ]
ffffff0919b6d370 wait n/a    0      1860          4002 ffffff0919b6d290   3,0  
 [ 2a 00 57 54 24 70 00 00 02 00 ]
ffffff09191685c0 wait n/a    0      1860         14002 ffffff09191684e0   3,0  
 [ 00 00 00 00 00 00 ]
ffffff090e11b158 wait n/a    0      2860           80a ffffff090e11b078   3,0  
 [ a0 00 00 00 00 00 00 00 00 84 00 00 ]

                 WARNING: the total of m_target[].m_t_ncmds does not match the slots in use

This could be the phy lock problem that has been seen with 1068 B3 and earlier versions of the LSI cards.

#3

Updated by Piotr Jasiukajtis about 8 years ago

I see the same issue on a few Sun Fire X4270 running oi_148.
I didn't see this on b130 and earlier though.

#4

Updated by Piotr Jasiukajtis about 8 years ago

Piotr Jasiukajtis wrote:

I see the same issue on a few Sun Fire X4270 running oi_148.
I didn't see this on b130 and earlier though.

pci bus 0x0013 cardnum 0x00 function 0x00: vendor 0x1000 device 0x0058
 LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS
 CardVendor 0x1000 card 0x3150 (LSI Logic / Symbios Logic, Card unknown)
  STATUS    0x0010  COMMAND 0x0147
  CLASS     0x01 0x00 0x00  REVISION 0x04
  BIST      0x00  HEADER 0x00  LATENCY 0x00  CACHE 0x40
  BASE0     0x0000d000 SIZE 256  I/O
  BASE1     0xfaffc000 SIZE 16384  MEM
  BASE3     0xfafe0000 SIZE 65536  MEM
  BASEROM   0x00000000  addr 0x00000000
  MAX_LAT   0x00  MIN_GNT 0x00  INT_PIN 0x01  INT_LINE 0x0a

> ::stacks -m sd
THREAD           STATE    SOBJ                COUNT
ffffff0981914b80 SLEEP    SEMA                    1
                 swtch+0x145
                 sema_p+0x1d9
                 biowait+0x76
                 scsi_uscsi_handle_cmd+0x190
                 sd_ssc_send+0x1fd
                 sd_send_scsi_TEST_UNIT_READY+0x10b
                 sd_ready_and_valid+0x54
                 sdopen+0x2d5
                 dev_open+0x3c
                 spec_open+0x5dc
                 fop_open+0xbf
                 vn_openat+0x6ce
                 copen+0x49e
                 openat32+0x27
                 open32+0x2e
                 _sys_sysenter_post_swapgs+0x149

ffffff0716bac3a0 SLEEP    SEMA                    1
                 swtch+0x145
                 sema_p+0x1d9
                 sdopen+0xe0
                 dev_open+0x3c
                 spec_open+0x5dc
                 fop_open+0xbf
                 vn_openat+0x6ce
                 copen+0x49e
                 openat32+0x27
                 open32+0x2e
                 _sys_sysenter_post_swapgs+0x149
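
A possible follow-up here (a sketch; the buf_t address has to be read out of the verbose stack by hand, and ADDR below stands for the buf_t argument shown in the biowait frame) is to pull the buf the first thread is sleeping on and see which device it targets:

> ffffff0981914b80::findstack -v
> ADDR::print -t buf_t b_edev b_flags b_error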

#5

Updated by Piotr Jasiukajtis about 8 years ago

> ::mpt -tsdv
           mpt_t inst mpxio suspend ntargs  power
================================================================================
ffffff06f7f50040    0     0       0   1024  ON=D0 

                 targ         wwn      ncmds throttle dr_flag  timeout  dups
                 ---------------------------------------------------------------
                    0 5000cca00a0552ed  -2     MAX   INACTIVE   0/20    0/0
                      End device: SSP tgt
                    1 5000cca00a055eed   0     MAX   INACTIVE   0/20    0/0
                      End device: SSP tgt
                    2 5000cca00a072105   0     MAX   INACTIVE   0/20    0/0
                      End device: SSP tgt
                    3 5000cca00a033d01   0     MAX   INACTIVE   0/20    0/0
                      End device: SSP tgt
                    4 5000cca00a070e79   0     MAX   INACTIVE   0/20    0/0
                      End device: SSP tgt
                    5 5000cca00a071945   0     MAX   INACTIVE   0/20    0/0
                      End device: SSP tgt
                    6 5000cca00a03d4cd   0     MAX   INACTIVE   0/20    0/0
                      End device: SSP tgt
                    7 5000cca00a07213d   0     MAX   INACTIVE   0/20    0/0
                      End device: SSP tgt
                      /pci@0,0/pci8086,340c@5/pci1000,3150@0/sd@7,0
                    8 5000cca00a072365   0     MAX   INACTIVE   0/20    0/0
                      End device: SSP tgt
                      /pci@0,0/pci8086,340c@5/pci1000,3150@0/sd@8,0
                    9 5000cca00a02ebe9  -1     MAX   INACTIVE   0/20    0/0
                      End device: SSP tgt
                   10 5000cca00a072ed5   0     MAX   INACTIVE   0/20    0/0
                      End device: SSP tgt
                   11 5000cca00a05f5b5   0     MAX   INACTIVE   0/20    0/0
                      End device: SSP tgt
                   12 5000cca00a03bc29   0     MAX   INACTIVE   0/20    0/0
                      End device: SSP tgt
                   13 5000cca00a04c0e1  -1   DRAIN   INACTIVE   0/20    0/0
                      End device: SSP tgt
                   14 5000cca00a09a185   0     MAX   INACTIVE   0/20    0/0
                      End device: SSP tgt
                   15 5000cca00a0701a5  -1     MAX   INACTIVE   0/20    0/0
                      End device: SSP tgt
                   16 500605b00002453d   0     MAX   INACTIVE   0/20    0/0
                      End device: LSI device, SSP tgt, SSP init

                 base_wwid          phys mptid prodid  devid        revid   ssid
                 ---------------------------------------------------------------
                 500605b001a7b320      8    -1 0x2704 0x0058 (1068E) 0x04 0x3150
                 /pci@0,0/pci8086,340c@5/pci1000,3150@0

                 0:SAS (0x9)    1:inactive     2:inactive     3:inactive     
                 4:inactive     5:inactive     6:inactive     7:inactive     

                    mpt.  slot               mpt_slots     slot
                 m_ncmds total targ throttle m_t_ncmds targ_tot wq dq
                 ----------------------------------------------------
                       0     0               total  -5    !=  0  0  0
                 WARNING: the total of m_target[].m_t_ncmds does not match the slots in use

#6

Updated by Piotr Jasiukajtis about 8 years ago

Small update: mpxio is not involved.
I still see this issue with these mpt settings:

# grep -v '^#' /kernel/drv/mpt.conf

ddi-vhci-class="scsi_vhci";
mpxio-disable="yes";

disable-sata-mpxio="yes";

#7

Updated by Rich Ercolani about 8 years ago

My bug, #1032, is a duplicate of this bug.

My system is using 2 LSI 92x1-16[ie] cards on b148, on a SuperMicro X8DAH-F+, with Samsung HD204UI drives that have the SMART firmware bug patched.

I've noticed the problem appears to follow drives around - e.g. if drive X has this problem at time Y, and you remove drive X, reboot the system, zpool replace the drive, and then use drive X again in another pool or to replace a faulted disk in the same pool, drive X is likely to hit it again.

More to the point, all of my drives that have exhibited this behavior are from the same batch. Since the issue appears to be drive-related, I would guess this is a case where an error isn't bubbling up properly from the drive.

#8

Updated by Rich Ercolani about 8 years ago

I suppose it's not quite identical, as my setup has rather a lot of errors in fmdump -eV, but the observed end result is the same - eventually errors stop being reported, and any call into the SAS layer blocks until I physically pull the affected drive, which is left in an inconsistent state as far as the kernel is concerned.

I will instrument it as above when it next recurs.

#9

Updated by Rich Ercolani about 8 years ago

Why hello again bug.

> ::stacks -m sd
THREAD           STATE    SOBJ                COUNT
ffffff02595e1bc0 SLEEP    SEMA                    1
                 swtch+0x145
                 sema_p+0x1d9
                 biowait+0x76
                 scsi_uscsi_handle_cmd+0x190
                 sd_ssc_send+0x1fd
                 sd_send_scsi_TEST_UNIT_READY+0x10b
                 sd_ready_and_valid+0x54
                 sdopen+0x2d5
                 dev_open+0x3c
                 spec_open+0x5dc
                 fop_open+0xbf
                 vn_openat+0x6ce
                 copen+0x49e
                 openat32+0x27
                 open32+0x2e
                 _sys_sysenter_post_swapgs+0x149

http://skysrv.pha.jhu.edu/~rercola/kmdb_dump_mptsas.log.bz2 probably has everything you could possibly want from this.

#11

Updated by Piotr Jasiukajtis almost 8 years ago

I'm happy to announce that backporting the mpt driver from b130 (non-debug) to oi_148 fixes the issue for me.
I have been running a bunch of patched OpenIndiana hosts this way for a while.
