Bug #5306

mpt_sas hangs up during IOC reset

Added by Tao Xu over 5 years ago. Updated over 3 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
driver - device drivers
Start date:
2014-11-10
Due date:
% Done: 0%

Estimated time:
Difficulty:
Medium
Tags:
needs-triage
Gerrit CR:

Description

Platform:
LSI SAS-9207 dual-port HBA card, connecting to two expanders with SAS drive attached.
MPxIO enabled for those SAS drives.

Symptom:
When one SAS cable is pulled, the mpt_sas HBA driver sometimes hangs. There is a deadlock between the mptsas_handle_topo_change() thread and the mptsas_flush_hba() thread.

> ::stacks -m mpt_sas
ffffff01ea832c40 SLEEP    CV                      2
                 swtch+0x141
                 cv_wait+0x70
                 ndi_devi_enter+0x7f
                 mptsas_bus_config+0xaf
                 scsi_hba_bus_config+0x70
                 devi_config_common+0xa5
                 ndi_devi_config+0x1a
                 bus_config_phci+0xbb
                 thread_start+8
ffffff01ec1bcc40 SLEEP    CV                      1
                 swtch+0x141
                 cv_wait+0x70
                 ndi_devi_enter+0x7f
                 mptsas_bus_config+0xaf
                 scsi_hba_bus_config+0x70
                 devi_config_common+0xa5
                 mt_config_thread+0x58
                 thread_start+8
ffffff01e97bac40 SLEEP    CV                      1
                 swtch+0x141
                 cv_wait+0x70
                 ndi_devi_enter+0x7f
                 mptsas_handle_topo_change+0x3d5
                 mptsas_handle_dr+0x184
                 taskq_thread+0x2d0
                 thread_start+8
ffffff01eab8ac40 SLEEP    CV                      1
                 swtch+0x141
                 cv_wait+0x70
                 taskq_wait+0x43
                 ddi_taskq_wait+0x11
                 mptsas_taskq_wait+0x23
                 mptsas_flush_hba+0x1dc
                 mptsas_restart_ioc+0x79
                 mptsas_do_passthru+0x588
                 mptsas_smp_start+0x139
                 smp_transport+0x16
                 smp_probe+0x8f
                 mptsas_probe_smp+0x58
                 mptsas_online_smp+0x63
                 mptsas_config_all+0xe6
                 mptsas_bus_config+0x140
                 scsi_hba_bus_config+0x70
                 devi_config_common+0xa5
                 ndi_devi_config+0x1a
                 bus_config_phci+0xbb
                 thread_start+8

Analysis:
Thread A:

> ffffff01e97bac40::findstack -v
stack pointer for thread ffffff01e97bac40: ffffff01e97ba980
[ ffffff01e97ba980 _resume_from_idle+0xf4() ]
  ffffff01e97ba9b0 swtch+0x141()
  ffffff01e97ba9f0 cv_wait+0x70(ffffff42e2d9095c, ffffff42e2d90870)
  ffffff01e97baa30 ndi_devi_enter+0x7f(ffffff42e2d90808, ffffff01e97baaa8)
  ffffff01e97baaf0 mptsas_handle_topo_change+0x3d5(ffffff43da4ae358,
ffffff43848492b8)
  ffffff01e97bab60 mptsas_handle_dr+0x184(ffffff440b0136a8)
  ffffff01e97bac20 taskq_thread+0x2d0(ffffff4384102238)
  ffffff01e97bac30 thread_start+8()

Thread A is servicing m_dr_taskq:
> ffffff01e97bac40::print -t "kthread_t" t_taskq
void *t_taskq = 0xffffff4384102238

It is blocked by thread B, which holds the devinfo node:
> ffffff42e2d90808::print -t "struct dev_info" devi_busy_thread
void *devi_busy_thread = 0xffffff01eab8ac40

Thread B, in turn, is blocked waiting on the m_dr_taskq that thread A is servicing:
> 0xffffff01eab8ac40::findstack -v
stack pointer for thread ffffff01eab8ac40: ffffff01eab8a2b0
[ ffffff01eab8a2b0 _resume_from_idle+0xf4() ]
  ffffff01eab8a2e0 swtch+0x141()
  ffffff01eab8a320 cv_wait+0x70(ffffff438410226a, ffffff4384102258)
  ffffff01eab8a360 taskq_wait+0x43(ffffff4384102238)
  ffffff01eab8a380 ddi_taskq_wait+0x11(ffffff4384102238)
  ffffff01eab8a3a0 mptsas_taskq_wait+0x23(ffffff4384102238)
  ffffff01eab8a400 mptsas_flush_hba+0x1dc(ffffff4383d10000)
  ffffff01eab8a440 mptsas_restart_ioc+0x79(ffffff4383d10000)
  ffffff01eab8a640 mptsas_do_passthru+0x588(ffffff4383d10000, ffffff01eab8a690,
ffffff01eab8a6c0, ffffff01eab8a770, 20, 1c, ffffff010000003c, ffffffff00000003,
  ffffff01eab8a7e0, ffffff0100000004, ffffff010000003c, 80000000)
  ffffff01eab8a730 mptsas_smp_start+0x139(ffffff01eab8a7b0)
  ffffff01eab8a750 smp_transport+0x16(ffffff01eab8a7b0)
  ffffff01eab8a830 smp_probe+0x8f(ffffff01eab8a850)
  ffffff01eab8a8a0 mptsas_probe_smp+0x58(ffffff4384849010, 500093d001de103f)
  ffffff01eab8a9e0 mptsas_online_smp+0x63(ffffff4384849010, ffffff435d0c8dc8,
ffffff01eab8aa08)
  ffffff01eab8aa50 mptsas_config_all+0xe6(ffffff4384849010)
  ffffff01eab8ab00 mptsas_bus_config+0x140(ffffff4384849010, 4004000, 2,
ffffffff, 0)
  ffffff01eab8ab70 scsi_hba_bus_config+0x70(ffffff4384849010, 4004000, 2,
ffffffff, 0)
  ffffff01eab8abc0 devi_config_common+0xa5(ffffff4384849010, 4004000, ffffffff)
  ffffff01eab8abe0 ndi_devi_config+0x1a(ffffff4384849010, 4004000)
  ffffff01eab8ac20 bus_config_phci+0xbb(ffffff438ffdb008)
  ffffff01eab8ac30 thread_start+8()

Deadlock!

When an expander device is removed, DR events are dispatched and handled by mpt_sas in this order:
SMP device --> disk physical path --> SES pseudo device.

If the HBA is in reset (m_in_reset is set), mptsas_handle_dr() drops the DR events:
https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/io/scsi/adapters/mpt_sas/mptsas.c#L6217
        /*
         * If HBA is being reset, don't perform operations depending
         * on the IOC. We must free the topo list, however.
         */
        if (!mpt->m_in_reset)
            mptsas_handle_topo_change(topo_node, parent);
        else
            NDBG20(("skipping topo change received during reset"));

However, mptsas_handle_topo_change() temporarily releases m_mutex, so another thread can take the mutex and start an IOC reset (setting m_in_reset) in the middle of DR event handling:
        mutex_exit(&mpt->m_mutex);

        ndi_devi_enter(scsi_vhci_dip, &circ);
        ndi_devi_enter(parent, &circ1);
        rval = mptsas_offline_target(parent, addr);
        ndi_devi_exit(parent, circ1);
        ndi_devi_exit(scsi_vhci_dip, circ);

On the bus-config side, thread B holds the nexus nodes across configuration:

    /*
     * Hold the nexus across the bus_config
     */
    ndi_devi_enter(scsi_vhci_dip, &circ);
    ndi_devi_enter(pdip, &circ1);

The IOC reset path then drains the taskqs while those nodes are still held:

    /*
     * Drain the taskqs prior to reallocating resources.
     */
    mutex_exit(&mpt->m_mutex);
    ddi_taskq_wait(mpt->m_event_taskq);
    ddi_taskq_wait(mpt->m_dr_taskq);
    mutex_enter(&mpt->m_mutex);

This leads to the following deadlock scenario:
  1. The expander device is removed;
  2. The SMP removal sysevent is propagated by syseventd while DR events still remain in m_dr_taskq;
  3. An SMP sysevent listener (e.g. FMA, prtconf) then queries the devinfo tree;
  4. Devinfo finds its info cache invalid and attempts to re-configure all devices;
  5. While configuring phci nodes, the configure thread holds scsi_vhci_dip in mptsas_bus_config();
  6. It also acquires m_mutex, which mptsas_handle_dr() had temporarily released;
  7. The configure thread times out probing the expander SMP device and proceeds to restart the IOC; it waits for mptsas_handle_dr() to finish while still holding scsi_vhci_dip;
  8. mptsas_handle_dr() --> mptsas_handle_topo_change() is meanwhile waiting for scsi_vhci_dip in order to offline the device in the devinfo tree. Deadlock.

Summary:
This is a corner-case side effect of #3195: the IOC reset path does not check for, and release, scsi_vhci_dip, the root MPxIO node, when the reset thread itself owns it.


Related issues

Related to illumos gate - Bug #3195: mpt_sas IOC reset races can cause panics (Resolved, 2012-09-14)

Is duplicate of illumos gate - Bug #6256: mptsas: deadlock in mptsas_handle_topo_change (Closed, 2015-09-22)

History

#1

Updated by Marcel Telka over 3 years ago

  • Is duplicate of Bug #6256: mptsas: deadlock in mptsas_handle_topo_change added
#2

Updated by Marcel Telka over 3 years ago

  • Category set to driver - device drivers
  • Status changed from New to Closed

This is a duplicate of #6256.
