Bug #11489

DLS stat delete and aggr kstat can deadlock

Added by Robert Mustacchi 4 months ago.

Start date:
Due date:
% Done:


Estimated time:


A process running kstat against the link module (e.g., kstat -p link:::) can deadlock with a process performing a dladm rename-link operation (or any process calling dls_stat_delete() from inside the MAC perim). These two operations perform the kstat_hold() and MAC perim enter in the opposite order.

Thread A is a node process reading the link module kstats. It's waiting on the MAC perim so it can read one of the MAC stats:

> 0t28505::pid2proc|::walk thread| ::findstack -v
stack pointer for thread ffffd0c8c4ded140: ffffd003d575a750
[ ffffd003d575a750 _resume_from_idle+0x112() ]
  ffffd003d575a780 swtch+0x141()
  ffffd003d575a7c0 cv_wait+0x70(ffffd0c572dbfc04, ffffd0c572dbfbf0)
  ffffd003d575a800 i_mac_perim_enter+0x63(ffffd0c572dbfb18)
  ffffd003d575a830 mac_perim_enter_by_mh+0x23(ffffd0c572dbfb18, ffffd003d575a848
  ffffd003d575a890 aggr_m_stat+0x35(ffffd0c5884201b8, 3e8, ffffd003d575a8a8)
  ffffd003d575a8d0 mac_stat_get+0x41(ffffd0c572dbfb18, 3e8)
  ffffd003d575a900 mac_client_ifspeed+0x24(ffffd0c8b8717660)
  ffffd003d575a920 mac_client_stat_get+0x29d(ffffd0c8b8717660, 3e8)
  ffffd003d575a960 vnic_m_stat+0x75(ffffd0db6359a2d8, 3e8, ffffd003d575a978)
  ffffd003d575a9a0 mac_stat_get+0x41(ffffd0c902e555b0, 3e8)
  ffffd003d575a9e0 dls_stat_update+0x49(ffffd35f205b2260, ffffd10e2c18d058, 0)
  ffffd003d575aa40 dls_devnet_stat_update+0xaf(ffffd35f205b2260, 0)
  ffffd003d575ac30 read_kstat_data+0xd0(ffffd003d575ae58, 97e59a8, 100001)
  ffffd003d575ac70 kstat_ioctl+0x83(200000000, 4b02, 97e59a8, 100001,
  ffffd0c5855fee18, ffffd003d575ae58)
  ffffd003d575acb0 cdev_ioctl+0x39(200000000, 4b02, 97e59a8, 100001,
  ffffd0c5855fee18, ffffd003d575ae58)
  ffffd003d575ad00 spec_ioctl+0x60(ffffd0c579709780, 4b02, 97e59a8, 100001,
  ffffd0c5855fee18, ffffd003d575ae58, 0)
  ffffd003d575ad90 fop_ioctl+0x55(ffffd0c579709780, 4b02, 97e59a8, 100001,
  ffffd0c5855fee18, ffffd003d575ae58, 0)
  ffffd003d575aeb0 ioctl+0x9b(b, 4b02, 97e59a8)
  ffffd003d575af10 _sys_sysenter_post_swapgs+0x153()

Thread A is part of the cmon-agent process:

> 0xffffd0c8c4ded140::print kthread_t t_procp | ::print proc_t p_user.u_psargs
p_user.u_psargs = [ " 
/opt/smartdc/agents/lib/node_modules/cmon-agent/node/bin/node --abort_on_uncaug" ]

The owner of the MAC perim, Thread B, is waiting for a kstat hold:

> ffffd0c572dbfb18::print mac_impl_t mi_perim_owner | ::findstack -v
stack pointer for thread ffffd0c8ba899000: ffffd003d5a7b810
[ ffffd003d5a7b810 _resume_from_idle+0x112() ]
  ffffd003d5a7b840 swtch+0x141()
  ffffd003d5a7b880 cv_wait+0x70(ffffd35f205b2328, fffffffffbc1bfa0)
  ffffd003d5a7b8c0 kstat_hold+0x4d(fffffffffbc2bfe0, ffffd003d5a7b8d0)
  ffffd003d5a7b9f0 kstat_hold_bykid+0x37(1da254, 0)
  ffffd003d5a7ba30 kstat_delete+0x4e(ffffd35f205b2260)
  ffffd003d5a7ba60 dls_stat_delete+0x1e(ffffd35f205b2260)
  ffffd003d5a7ba90 dls_devnet_stat_rename+0x2f(ffffd0c8a3e88d58, 1)
  ffffd003d5a7bb60 dls_devnet_rename+0x364(f1e, 0, ffffd105c40ee2c0, 1)
  ffffd003d5a7bbd0 drv_ioc_rename+0xbb(ffffd105c40ee2b8, 8047784, 100003,
  ffffd0c8bb8ebaf0, ffffd003d5a7be58)
  ffffd003d5a7bc70 drv_ioctl+0x1e4(1200000000, d1d0011, 8047784, 100003,
  ffffd0c8bb8ebaf0, ffffd003d5a7be58)
  ffffd003d5a7bcb0 cdev_ioctl+0x39(1200000000, d1d0011, 8047784, 100003,
  ffffd0c8bb8ebaf0, ffffd003d5a7be58)
  ffffd003d5a7bd00 spec_ioctl+0x60(ffffd0c50ea2d700, d1d0011, 8047784, 100003,
  ffffd0c8bb8ebaf0, ffffd003d5a7be58, 0)
  ffffd003d5a7bd90 fop_ioctl+0x55(ffffd0c50ea2d700, d1d0011, 8047784, 100003,
  ffffd0c8bb8ebaf0, ffffd003d5a7be58, 0)
  ffffd003d5a7beb0 ioctl+0x9b(4, d1d0011, 8047784)
  ffffd003d5a7bf10 _sys_sysenter_post_swapgs+0x153()

Thread B is part of dladm rename-link process. This is performed as part of zone startup on SmartOS.

> ffffd0c572dbfb18::print mac_impl_t mi_perim_owner | ::print kthread_t t_procp | ::print struct proc p_user.u_psargs
p_user.u_psargs = [ " 
dladm rename-link -z d3c62c61-83ad-45fb-b13a-1d673c5d752b tmp937960 net0" ]

Thread B is waiting to get a hold on this kstat:

> ffffd35f205b2328 - c8::print ekstat_t
    e_ks = {
        ks_crtime = 0x260baba253d0fc
        ks_next = 0
        ks_kid = 0x1da254
        ks_module = [ "link" ]
        ks_resv = 0
        ks_instance = 0
        ks_name = [ "tmp937960" ]
        ks_type = 0x1
        ks_class = [ "net" ]
        ks_flags = 0x41
        ks_data = 0xffffd0c8ae0d5280
        ks_ndata = 0x16
        ks_data_size = 0x427
        ks_snaptime = 0x260baba253d0fc
        ks_update = dls_devnet_stat_update
        ks_private = 0xffffd0c8a3e88d58
        ks_snapshot = default_kstat_snapshot
        ks_lock = 0
    e_size = 0x110
    e_owner = 0xffffd0c8c4ded140
    e_cv = {
        _opaque = 0x1
    e_avl_bykid = {
        avl_child = [ 0, 0 ]
        avl_pcb = 0xffffd3f4e523e0d5
    e_avl_byname = {
        avl_child = [ 0, 0 ]
        avl_pcb = 0xffffd0d541f42d09
    e_zone = {
        zoneid = 0
        next = 0

But the e_owner of this kstat is Thread A – which is the same thread waiting for the MAC perim. And the MAC perim is owned by Thread B which is waiting on the kstat. Boom, we have a deadlock.

The kstat library first gets a hold on the kstat and then performs a module callback: dls_devnet_stat_update(). This callback will eventually call into MAC to read the stats: thus it has kstat hold -> MAC perim ordering.

The dls_devnet_rename() function does the opposite. It calls dls_devnet_stat_rename() from inside the MAC perim which eventually calls dls_stat_delete() which tries to grab a hold on the kstat. So it performs MAC perim -> kstat hold ordering.

Also available in: Atom PDF