Bug #8341

deadlock between softmac detach and kstat read

Added by Ryan Zezeski 11 months ago.

Status:          New
Start date:      2017-06-05
Priority:        Normal
Due date:        -
Assignee:        Ryan Zezeski
% Done:          0%
Category:        networking
Target version:  -
Difficulty:      Medium
Tags:            needs-triage

Description

The introduction of 8149 opened us up to a deadlock between a concurrent softmac detach and a kstat read (of the link module). I described this bug thoroughly in OS-6137, but I'll repeat the description here as well. I first found this bug while testing on a DEBUG kernel, but it has also struck us in production (modunload -i 0 is executed at the end of the boot sequence).

8149 made a change to dls_devnet_stat_update() allowing a concurrent device detach and kstat link module read to deadlock.
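
For a bit of context, the post-8149 update path now takes a temporary devnet hold before touching the link stats. The following is a rough sketch reconstructed from the stacks below, not the verbatim source; the locals and the error handling in particular are assumptions:

    /*
     * Sketch of the post-8149 kstat update path (paraphrased from the
     * stacks below; locals and error handling are assumptions).
     */
    static int
    dls_devnet_stat_update(kstat_t *ksp, int rw)
    {
            datalink_id_t   linkid = (datalink_id_t)(uintptr_t)ksp->ks_private;
            dls_dl_handle_t ddh;
            int             err;

            /*
             * This is the call 8149 added.  It bottoms out in
             * softmac_hold_device(), which waits for the device to
             * reach SOFTMAC_ATTACH_DONE -- the wait Thread B is
             * stuck in below.
             */
            if ((err = dls_devnet_hold_tmp(linkid, &ddh)) != 0)
                    return (err);

            /* ... refresh the kstat named data from the link stats ... */

            dls_devnet_rele_tmp(ddh);
            return (0);
    }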

First, the stacks indicative of the deadlock:

> ::stacks -m dls
THREAD           STATE    SOBJ                COUNT
ffffff0010e66c40 SLEEP    CV                      1
                 swtch+0x18a
                 cv_wait+0x89
                 kstat_hold+0x4d
                 kstat_hold_bykid+0x37
                 kstat_delete+0x52
                 dls_stat_delete+0x1e
                 dls_devnet_stat_destroy+0x59
                 dls_devnet_unset+0x263
                 dls_devnet_destroy+0x31
                 softmac_destroy+0x422
                 net_predetach+0x3a
                 dacf_op_invoke+0x26e
                 dacf_process_rsrvs+0x10b
                 dacfc_predetach+0x41
                 predetach_node+0x3b
                 i_ndi_unconfig_node+0x1eb
                 i_ddi_detachchild+0x59
                 devi_detach_node+0xac
                 unconfig_immediate_children+0x1b5
                 devi_unconfig_common+0x107
                 mt_config_thread+0x259
                 thread_start+8

ffffff03e3804bc0 SLEEP    CV                      1
                 swtch+0x18a
                 cv_wait+0x89
                 softmac_hold_device+0x203
                 stubs_common_code+0x59
                 dls_devnet_hold_common+0xc5
                 dls_devnet_hold_tmp+0x1a
                 dls_devnet_stat_update+0x2e
                 read_kstat_data+0xd0
                 kstat_ioctl+0x83
                 cdev_ioctl+0x39
                 spec_ioctl+0x60
                 fop_ioctl+0x55
                 ioctl+0x9b
                 _sys_sysenter_post_swapgs+0x241

Thread A (ffffff0010e66c40) is attempting a device detach. It has executed part of softmac_destroy() and is now stuck in dls_devnet_unset() trying to destroy the kstat. In illumos-joyent there are two call sites that might destroy this kstat; we can determine which one by examining the ddp. (In illumos-gate there is only one call site, and here I am confirming that we are stuck at the equivalent one.)

> ffffff0010e66c40::findstack -v
stack pointer for thread ffffff0010e66c40: ffffff0010e662e0
[ ffffff0010e662e0 _resume_from_idle+0x112() ]
  ffffff0010e66310 swtch+0x18a()
  ffffff0010e66340 cv_wait+0x89(ffffff14fa7e4e58, fffffffffbc29f60)
  ffffff0010e66380 kstat_hold+0x4d(fffffffffbc39fa0, ffffff0010e66390)
  ffffff0010e664b0 kstat_hold_bykid+0x37(157db, 0)
  ffffff0010e664f0 kstat_delete+0x52(ffffff14fa7e4d90)
  ffffff0010e66520 dls_stat_delete+0x1e(ffffff14fa7e4d90)
  ffffff0010e66550 dls_devnet_stat_destroy+0x59(ffffff030cfe8c48, 0)
  ffffff0010e665b0 dls_devnet_unset+0x263(ffffff030f5b29c8, ffffff0010e6663c, 0)
  ffffff0010e66610 dls_devnet_destroy+0x31(ffffff030f5b29c8, ffffff0010e6663c, 0)
  ffffff0010e66790 softmac_destroy+0x422(ffffff02ff3320a8, ba00000002)
  ffffff0010e667d0 net_predetach+0x3a(ffffff030d42a828, 0, 0)
  ffffff0010e66860 dacf_op_invoke+0x26e(ffffff0300534cf8, ffffff030d42a828, 0)
  ffffff0010e668d0 dacf_process_rsrvs+0x10b(ffffff02ff332208, 2, 1)
  ffffff0010e66910 dacfc_predetach+0x41(ffffff02ff3320a8)
  ffffff0010e66960 predetach_node+0x3b(ffffff02ff3320a8, 2000)
  ffffff0010e669c0 i_ndi_unconfig_node+0x1eb(ffffff02ff3320a8, 4, 2000)
  ffffff0010e669f0 i_ddi_detachchild+0x59(ffffff02ff3320a8, 2000)
  ffffff0010e66a60 devi_detach_node+0xac(ffffff02ff3320a8, 2000)
  ffffff0010e66af0 unconfig_immediate_children+0x1b5(ffffff02ff338610, 0, 2000, ffffffff)
  ffffff0010e66b80 devi_unconfig_common+0x107(ffffff02ff338610, 0, 2000, ffffffff, 0)
  ffffff0010e66c20 mt_config_thread+0x259(ffffff03b1a9c680)
  ffffff0010e66c30 thread_start+8()

> ffffff030cfe8c48::print dls_devnet_t
{
    dd_linkid = 0x2
    dd_linkname = [ "e1000g1" ]
    dd_mac = [ "e1000g1" ]
    dd_ksp = 0xffffff14fa7e4d90
    dd_zone_ksp = 0
    dd_ref = 0
    dd_mutex = {
        _opaque = [ 0 ]
    }
    dd_cv = {
        _opaque = 0
    }
    dd_tref = 0
    dd_flags = 0x1
    dd_owner_zid = 0
    dd_zid = 0
    dd_prop_loaded = 0x1 (B_TRUE)
    dd_prop_taskid = 0
    dd_transient = 0 (0)
}

The dd_zid is zero and dd_linkid is non-zero; therefore, we know the second call site was executed:

        if (ddp->dd_linkid != DATALINK_INVALID_LINKID)
                dls_devnet_stat_destroy(ddp, ddp->dd_owner_zid);

Now let's look at Thread B (ffffff03e3804bc0) and see what it's waiting on. If it's in the first cv_wait() call, it should be waiting on smac_global_cv:

        cv_wait(&smac_global_cv, &smac_global_lock);

Is that where Thread B is waiting?

> ffffff03e3804bc0::findstack -v
stack pointer for thread ffffff03e3804bc0: ffffff00131d1710
[ ffffff00131d1710 _resume_from_idle+0x112() ]
  ffffff00131d1740 swtch+0x18a()
  ffffff00131d1770 cv_wait+0x89(ffffff030d350a90, ffffff030d350a88)
  ffffff00131d18f0 softmac_hold_device+0x203(ba00000002, ffffff00131d1958)
  ffffff00131d1940 stubs_common_code+0x59()
  ffffff00131d19b0 dls_devnet_hold_common+0xc5(2, ffffff00131d19f8, 1)
  ffffff00131d19d0 dls_devnet_hold_tmp+0x1a(2, ffffff00131d19f8)
  ffffff00131d1a20 dls_devnet_stat_update+0x2e(ffffff14fa7e4d90, 0)
  ffffff00131d1c20 read_kstat_data+0xd0(ffffff00131d1e48, 8133158, 100001)
  ffffff00131d1c60 kstat_ioctl+0x83(200000000, 4b02, 8133158, 100001, ffffff05ebc389a0, ffffff00131d1e48)
  ffffff00131d1ca0 cdev_ioctl+0x39(200000000, 4b02, 8133158, 100001, ffffff05ebc389a0, ffffff00131d1e48)
  ffffff00131d1cf0 spec_ioctl+0x60(ffffff0324586a00, 4b02, 8133158, 100001, ffffff05ebc389a0, ffffff00131d1e48, 0)
  ffffff00131d1d80 fop_ioctl+0x55(ffffff0324586a00, 4b02, 8133158, 100001, ffffff05ebc389a0, ffffff00131d1e48, 0)
  ffffff00131d1ea0 ioctl+0x9b(5, 4b02, 8133158)
  ffffff00131d1f00 _sys_sysenter_post_swapgs+0x241()

> ffffff030d350a90/P
0xffffff030d350a90:             0x30000012d

No: mdb did not resolve the address to a symbol, so this is a kmem address rather than the static smac_global_cv. Thread B must be blocked at the second cv_wait(), in softmac_hold_device():

    while (softmac->smac_state != SOFTMAC_ATTACH_DONE)
        cv_wait(&softmac->smac_cv, &softmac->smac_mutex);

Thread B is waiting for smac_state to become SOFTMAC_ATTACH_DONE. Let's print the softmac_t; subtracting 0x118, the offset of smac_cv within softmac_t, from the CV address recovers the base of the structure:

> ffffff030d350a90 - 118::print softmac_t
{
    smac_devname = [ "e1000g1" ]
    smac_umajor = 0xba
    smac_uppa = 0x1
    smac_cnt = 0x1
    smac_mutex = {
        _opaque = [ 0 ]
    }
    smac_cv = {
        _opaque = 0x12d
    }
    smac_state = 3 (SOFTMAC_DETACH_INPROG)
    smac_hold_cnt = 0x12d
    smac_flags = 0x1
    smac_attacherr = 0
    smac_mh = 0xffffff030f5b29c8
    smac_softmac = [ 0xffffff030d50a560, 0 ]
    smac_attachok_cnt = 0x1
    smac_attached_left = 0
    smac_notify_thread = 0
    smac_notify_head = 0
    smac_notify_tail = 0
    smac_media = 0x4
    smac_style = 0
    smac_dev = 0
    smac_saplen = 0
    smac_addrlen = 0
    smac_unicst_addr = [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ]
    smac_min_sdu = 0
    smac_max_sdu = 0
    smac_margin = 0
    smac_notifications = 0
    smac_capab_flags = 0
    smac_hcksum_txflags = 0
    smac_no_capability_req = 0 (0)
    smac_mdt_capab = {
        mdt_version = 0
        mdt_flags = 0
        mdt_hdr_head = 0
        mdt_hdr_tail = 0
        mdt_max_pld = 0
        mdt_span_limit = 0
        mdt_mid = {
            mid = [ 0, 0, 0, 0 ]
        }
    }
    smac_mdt = 0 (0)
    smac_lower = 0
    smac_active_mutex = {
        _opaque = [ 0 ]
    }
    smac_active = 0 (0)
    smac_nactive = 0
    smac_fp_mutex = {
        _opaque = [ 0 ]
    }
    smac_fp_cv = {
        _opaque = 0
    }
    smac_fp_disable_clients = 0
    smac_fastpath_admin_disabled = 0 (0)
    smac_sup_list = {
        list_size = 0xe8
        list_offset = 0x30
        list_head = {
            list_next = 0xffffff030d350ba0
            list_prev = 0xffffff030d350ba0
        }
    }
}

The state is SOFTMAC_DETACH_INPROG, but smac_hold_cnt is non-zero (0x12d). This device is not going to finish detaching until there are no more holds.

Okay, so why are we actually deadlocked? It has to do with kstats.

Thread A has marked the device as SOFTMAC_DETACH_INPROG and then called dls_devnet_unset() to destroy the device -- but it's stuck waiting to hold the kstat so it can delete it.

Thread B holds the kstat but is waiting to hold the device (softmac_hold_device()) before it reaches into DLS and grabs a reference to the dls_devnet_t. That call will never return: it waits for the softmac to reach SOFTMAC_ATTACH_DONE, but Thread A has marked it SOFTMAC_DETACH_INPROG and is itself stuck waiting for the kstat -- deadlock!
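
Laying the two dependency edges side by side makes the cycle explicit (a summary annotation of the stacks above, not source):

    /*
     * Thread A (detach)                Thread B (kstat read)
     * -----------------                ---------------------
     * smac_state =                     kstat_hold(ksp)      [holds kstat]
     *   SOFTMAC_DETACH_INPROG          dls_devnet_stat_update()
     * dls_devnet_unset()                 dls_devnet_hold_tmp()
     *   dls_devnet_stat_destroy()          softmac_hold_device()
     *     kstat_delete()                     [blocks: waits for
     *       kstat_hold_bykid()               SOFTMAC_ATTACH_DONE, which
     *         [blocks: Thread B              Thread A will never set]
     *          holds the kstat]
     *
     * A waits on a resource held by B; B waits on a state transition
     * that only A can make.  Neither can proceed.
     */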

This deadlock was introduced by 8149 because it added the call to dls_devnet_hold_tmp() in dls_devnet_stat_update(). I still think 8149 made the correct change; it seems I'm simply the first one to test it enough on a DEBUG kernel to hit this deadlock.
