
Bug #11039

All zfs/nfs/smb threads in door calls to idle idmap

Added by Gordon Ross over 1 year ago. Updated 10 months ago.

Status: Closed
Priority: Normal
Assignee:
Category: -
Start date: 2019-05-15
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:

Description

Customer crash dump with MANY threads stuck in door calls to idmap.

Most of the NFS threads looked like this:

> ::stacks -m nfs
THREAD           STATE    SOBJ                COUNT
ffffd0637778bba0 SLEEP    MUTEX                 378
                 swtch+0x141
                 turnstile_block+0x21a
                 mutex_vector_enter+0x3a3
                 zfs_zaccess_aces_check+0x62
                 zfs_zaccess_common+0xd8
                 zfs_zaccess+0xeb
                 zfs_zaccess_rwx+0x45
                 zfs_access+0xde
                 fop_access+0x54
                 rfs3_access+0x3d4
                 common_dispatch+0x77a
                 rfs_dispatch+0x2d
                 svc_getreq+0x1c1
                 svc_run+0xe0
                 svc_do_run+0x8e
                 nfssys+0x111
                 _sys_sysenter_post_swapgs+0x149

ffffd063c1e687c0 SLEEP    MUTEX                 229
                 swtch+0x141
                 turnstile_block+0x21a
                 mutex_vector_enter+0x3a3
                 zfs_zaccess_aces_check+0x62
                 zfs_zaccess_common+0xd8
                 zfs_zaccess+0xeb
                 zfs_getattr+0x166
                 fop_getattr+0xa8
                 rfs3_access+0xd1
                 common_dispatch+0x77a
                 rfs_dispatch+0x2d
                 svc_getreq+0x1c1
                 svc_run+0xe0
                 svc_do_run+0x8e
                 nfssys+0x111
                 _sys_sysenter_post_swapgs+0x149

And they had been stuck for a long time:

> ffffd0637778bba0::thread -d
            ADDR         DISPTIME BOUND PR SWITCH
ffffd0637778bba0          6cab617    -1  0 t-8413902

Overall, they're all stuck on 5 mutexes; here's the waiter count for each:

> ::stacks -m nfs |::findstack -v !grep mutex_vector_enter | awk '{print $2}' | sort | uniq -c
198 mutex_vector_enter+0x3a3(ffffd063c40c35c8)
174 mutex_vector_enter+0x3a3(ffffd063c40ff0e8)
426 mutex_vector_enter+0x3a3(ffffd063c48953d0)
145 mutex_vector_enter+0x3a3(ffffd063e6b9e0e8)
  76 mutex_vector_enter+0x3a3(ffffd0ea2e076d70)

All the mutexes are held by threads waiting in door calls, but what's interesting is that those owners were running fairly recently:

> ffffd063c40ff0e8::mutex
            ADDR  TYPE             HELD MINSPL OLDSPL WAITERS
ffffd063c40ff0e8 adapt ffffd063777c93c0      -      -     yes
> ffffd063777c93c0::findstack -v
stack pointer for thread ffffd063777c93c0: ffffd001ef5b4ea0
[ ffffd001ef5b4ea0 _resume_from_idle+0x112() ]
  ffffd001ef5b4ed0 swtch+0x141()
  ffffd001ef5b4f10 cv_wait+0x70(ffffd063777c95ae, ffffd063777c95b0)
  ffffd001ef5b4f70 delay_common+0xb8(64)
  ffffd001ef5b4fb0 delay+0x30(64)
  ffffd001ef5b5010 kidmap_call_door+0x1ea(ffffd063758819c0, ffffd001ef5b5100)
  ffffd001ef5b51c0 kidmap_rpc_call+0x213(ffffd063758819c0, 1, fffffffff7940ff0, ffffd001ef5b52a0,
  fffffffff7940eb0, ffffd001ef5b5280)
  ffffd001ef5b5300 kidmap_getgidbysid+0x102(fffffffffbcf3f00, ffffd063be112c58, 220, ffffd001ef5b532c)
  ffffd001ef5b5360 zfs_fuid_map_id+0x99(ffffd063ad333000, 100000220, ffffd064612a9340, 1)
  ffffd001ef5b53e0 zfs_groupmember+0x12c(ffffd063ad333000, 100000220, ffffd064612a9340)
  ffffd001ef5b54a0 zfs_zaccess_aces_check+0x267(ffffd063c40ff020, ffffd001ef5b556c, 0, ffffd064612a9340)
  ffffd001ef5b5520 zfs_zaccess_common+0xd8(ffffd063c40ff020, 80, ffffd001ef5b556c, ffffd001ef5b5568, 0,
  ffffd064612a9340)
  ffffd001ef5b55d0 zfs_zaccess+0xeb(ffffd063c40ff020, 80, 0, 0, ffffd064612a9340)
  ffffd001ef5b56e0 zfs_getattr+0x166(ffffd063c40ea980, ffffd001ef5b57b0, 0, ffffd064612a9340, 0)
  ffffd001ef5b5760 fop_getattr+0xa8(ffffd063c40ea980, ffffd001ef5b57b0, 0, ffffd064612a9340, 0)
  ffffd001ef5b58a0 rfs3_access+0xd1(ffffd001ef5b5ac0, ffffd001ef5b5980, ffffd0639ec390c0, ffffd001ef5b5ca0,
  ffffd064612a9340, 0)
  ffffd001ef5b5c20 common_dispatch+0x77a(ffffd001ef5b5ca0, ffffd06461433a00, 2, 4, fffffffff87495c6,
  ffffffffc0314060)
  ffffd001ef5b5c40 rfs_dispatch+0x2d(ffffd001ef5b5ca0, ffffd06461433a00)
  ffffd001ef5b5d20 svc_getreq+0x1c1(ffffd06461433a00, ffffd0647a086bc0)
  ffffd001ef5b5d90 svc_run+0xe0(ffffd063837613d0)
  ffffd001ef5b5dd0 svc_do_run+0x8e(1)
  ffffd001ef5b5ec0 nfssys+0x111(e, fe7f0fbc)
  ffffd001ef5b5f10 _sys_sysenter_post_swapgs+0x149()
> ffffd063777c93c0::thread -d

            ADDR         DISPTIME BOUND PR SWITCH
ffffd063777c93c0          74b18cb    -1  0 t-26 <--- less than a second ago

But if we look at the zfs threads only, there are more mutexes being held, and some of the owners have ultimately been stuck for much longer:

> ::stacks -m zfs |::findstack -v !grep mutex_vector_enter | awk '{print $2}' | sort | uniq -c
  20 mutex_vector_enter+0x3a3(ffffd063c0915c68)
  21 mutex_vector_enter+0x3a3(ffffd063c092fd90)
  12 mutex_vector_enter+0x3a3(ffffd063c0b907a0)
  40 mutex_vector_enter+0x3a3(ffffd063c40c30f0)
200 mutex_vector_enter+0x3a3(ffffd063c40c35c8) <— all nfs threads + 2 other
174 mutex_vector_enter+0x3a3(ffffd063c40ff0e8) <— all nfs
  14 mutex_vector_enter+0x3a3(ffffd063c40ffe78)
139 mutex_vector_enter+0x3a3(ffffd063c48613c8)
426 mutex_vector_enter+0x3a3(ffffd063c48953d0) <— all are nfs
   2 mutex_vector_enter+0x3a3(ffffd063c4895a98)
   1 mutex_vector_enter+0x3a3(ffffd063c497d2e0)
   3 mutex_vector_enter+0x3a3(ffffd063c497daa0)
160 mutex_vector_enter+0x3a3(ffffd063c4a846b0)
168 mutex_vector_enter+0x3a3(ffffd063c4a84e70)
  88 mutex_vector_enter+0x3a3(ffffd063d357a7a0)
   4 mutex_vector_enter+0x3a3(ffffd063e026de68)
  68 mutex_vector_enter+0x3a3(ffffd063e49802e0) <— this one is held by a thread that last switched very long ago
145 mutex_vector_enter+0x3a3(ffffd063e6b9e0e8) <— all are nfs
   1 mutex_vector_enter+0x3a3(ffffd0e916e8dc80)
  76 mutex_vector_enter+0x3a3(ffffd0ea2e076d70) <— all are nfs

> ffffd063e49802e0::mutex
            ADDR  TYPE             HELD MINSPL OLDSPL WAITERS
ffffd063e49802e0 adapt ffffd001eebf5c40      -      -     yes
> ffffd001eebf5c40::thread -d
            ADDR         DISPTIME BOUND PR SWITCH
ffffd001eebf5c40          6c8bca8    -1  0 t-8543293
> ffffd001eebf5c40::findstack -v
stack pointer for thread ffffd001eebf5c40: ffffd001eebf46a0
[ ffffd001eebf46a0 _resume_from_idle+0x112() ]
  ffffd001eebf46e0 swtch_to+0xb6(ffffd063c4dec760)
  ffffd001eebf4730 shuttle_resume+0x2af(ffffd063c4dec760, fffffffffbd17130)
  ffffd001eebf47e0 door_upcall+0x212(ffffd063e9e23980, ffffd001eebf49b0, 0, ffffffffffffffff, 0)
  ffffd001eebf4860 door_ki_upcall_limited+0x67(ffffd0637fb31f60, ffffd001eebf49b0, 0, ffffffffffffffff, 0)
  ffffd001eebf48c0 kidmap_call_door+0xb3(ffffd063758819c0, ffffd001eebf49b0)
  ffffd001eebf4a70 kidmap_rpc_call+0x213(ffffd063758819c0, 1, fffffffff7940ff0, ffffd001eebf4b50,
  fffffffff7940eb0, ffffd001eebf4b30)
  ffffd001eebf4bb0 kidmap_getuidbysid+0x102(fffffffffbcf3f00, ffffd063bd2e9718, 8401, ffffd001eebf4bdc)
  ffffd001eebf4c10 zfs_fuid_map_id+0x70(ffffd0639c314800, 200008401, ffffd062ddd36db0, 2)
  ffffd001eebf4c90 zfs_user_in_cred+0x5c(ffffd0639c314800, 200008401, ffffd062ddd36db0)
  ffffd001eebf4d50 zfs_zaccess_aces_check+0x133(ffffd063e4980218, ffffd001eebf4e1c, 0, ffffd062ddd36db0)
  ffffd001eebf4dd0 zfs_zaccess_common+0xd8(ffffd063e4980218, 80, ffffd001eebf4e1c, ffffd001eebf4e18, 0,
  ffffd062ddd36db0)
  ffffd001eebf4e80 zfs_zaccess+0xeb(ffffd063e4980218, 80, 0, 0, ffffd062ddd36db0)
  ffffd001eebf4f90 zfs_getattr+0x166(ffffd09b95f66340, ffffd001eebf50e0, 0, ffffd062ddd36db0, ffffffffc0301fb0
  )
  ffffd001eebf5010 fop_getattr+0xa8(ffffd09b95f66340, ffffd001eebf50e0, 0, ffffd062ddd36db0, ffffffffc0301fb0
  )
  ffffd001eebf5250 smb_vop_getattr+0xc2(ffffd09b95f66340, 0, ffffd001eebf5490, 0, ffffd062ddd36db0)
  ffffd001eebf5440 smb_vop_lookup+0x19b(ffffd063c40d3a00, ffffd071dc20f3c0, ffffd001eebf5540, ffffd06f3ac0e780
  , 18, ffffd001eebf555c, ffffd063c40d3a00, ffffd001eebf5490, ffffd06460f42118)
  ffffd001eebf55c0 smb_fsop_lookup+0x15c(ffffd09b5143e2e0, ffffd06460f42118, 10, ffffd063c2d1d688,
  ffffd063c2d1d688, ffffd071dc20f3c0, ffffd001eebf5698)
  ffffd001eebf56f0 smb_odir_wildcard_fileinfo+0x70(ffffd09b5143e2e0, ffffd063d200b2a8, ffffd071dc20f3c0,
  ffffd001eebf58e8)
  ffffd001eebf5780 smb_odir_read_fileinfo+0x13e(ffffd09b5143e2e0, ffffd063d200b2a8, ffffd001eebf58e8,
  ffffd001eebf58d8)
  ffffd001eebf5830 smb2_find_entries+0x11a(ffffd09b5143e2e0, ffffd063d200b2a8, ffffd001eebf58d0)
  ffffd001eebf5ac0 smb2_query_dir+0x3f5(ffffd09b5143e2e0)
  ffffd001eebf5b50 smb2sr_work+0x352(ffffd09b5143e2e0)
  ffffd001eebf5b90 smb2_tq_work+0x73(ffffd09b5143e2e0)
  ffffd001eebf5c20 taskq_d_thread+0xb7(ffffd0ea3ad38d28)
  ffffd001eebf5c30 thread_start+8()

This one is particularly interesting: the owner is stuck in door_get_server():

> ffffd063c0915c68::mutex
            ADDR  TYPE             HELD MINSPL OLDSPL WAITERS
ffffd063c0915c68 adapt ffffd001f10d6c40      -      -     yes
> ffffd001f10d6c40::findstack -v
stack pointer for thread ffffd001f10d6c40: ffffd001f10d5760
[ ffffd001f10d5760 _resume_from_idle+0x112() ]
  ffffd001f10d5790 swtch+0x141()
  ffffd001f10d57d0 cv_wait+0x70(ffffd0632f8354d0, fffffffffbd17130)
  ffffd001f10d5860 cv_wait_sig_swap_core+0x8a(ffffd0632f8354d0, fffffffffbd17130, ffffd001f10d588c)
  ffffd001f10d58d0 door_get_server+0xc1(ffffd070f9a86f38)
  ffffd001f10d5980 door_upcall+0x166(ffffd063e9e23980, ffffd001f10d5b50, 0, ffffffffffffffff, 0)
  ffffd001f10d5a00 door_ki_upcall_limited+0x67(ffffd0637fb31f60, ffffd001f10d5b50, 0, ffffffffffffffff, 0)
  ffffd001f10d5a60 kidmap_call_door+0xb3(ffffd063758819c0, ffffd001f10d5b50)
  ffffd001f10d5c10 kidmap_rpc_call+0x213(ffffd063758819c0, 1, fffffffff7940ff0, ffffd001f10d5cf0,
  fffffffff7940eb0, ffffd001f10d5cd0)
  ffffd001f10d5d50 kidmap_getgidbysid+0x102(fffffffffbcf3f00, ffffd063be112c58, 220, ffffd001f10d5d7c)
  ffffd001f10d5db0 zfs_fuid_map_id+0x99(ffffd063bdba7800, 100000220, ffffd062ddd36db0, 1)
  ffffd001f10d5e30 zfs_groupmember+0x12c(ffffd063bdba7800, 100000220, ffffd062ddd36db0)
  ffffd001f10d5ef0 zfs_zaccess_aces_check+0x267(ffffd063c0915ba0, ffffd001f10d5fbc, 0, ffffd062ddd36db0)
  ffffd001f10d5f70 zfs_zaccess_common+0xd8(ffffd063c0915ba0, 20, ffffd001f10d5fbc, ffffd001f10d5fb8, 0,
  ffffd062ddd36db0)
  ffffd001f10d6020 zfs_zaccess+0xeb(ffffd063c0915ba0, 20, 0, 0, ffffd062ddd36db0)
  ffffd001f10d60c0 zfs_lookup+0x354(ffffd063c091c940, ffffd06ef7740a38, ffffd001f10d6460, 0, 80000, 0,
  ffffd062ddd36db0, ffffffffc0301fb0, ffffd001f10d647c, ffffd001f10d61d0)
  ffffd001f10d6170 fop_lookup+0xa2(ffffd063c091c940, ffffd06ef7740a38, ffffd001f10d6460, 0, 80000, 0,
  ffffd062ddd36db0, ffffffffc0301fb0, ffffd001f10d647c, ffffd001f10d61d0)
  ffffd001f10d6360 smb_vop_lookup+0x110(ffffd063c091c940, ffffd06ef7740a38, ffffd001f10d6460, ffffd0701c0c1380
  , 3, ffffd001f10d647c, ffffd063c091c940, ffffd001f10d63b0, ffffd062ddd36db0)
  ffffd001f10d64e0 smb_fsop_lookup+0x15c(ffffd06ef77406f0, ffffd062ddd36db0, 3, ffffd063cb2e5690,
  ffffd063cb2e5690, ffffd06ef7740a38, ffffd06ef7740980)
  ffffd001f10d65a0 smb_fsop_lookup_name+0x274(ffffd06ef77406f0, ffffd062ddd36db0, 1, ffffd063cb2e5690,
  ffffd063cb2e5690, ffffd06ef7740a38, ffffd06ef7740980)
  ffffd001f10d66f0 smb_open_subr+0x238(ffffd06ef77406f0)
  ffffd001f10d6750 smb_common_open+0x59(ffffd06ef77406f0)
  ffffd001f10d6ac0 smb2_create+0x422(ffffd06ef77406f0)
  ffffd001f10d6b50 smb2sr_work+0x352(ffffd06ef77406f0)
  ffffd001f10d6b90 smb2_tq_work+0x73(ffffd06ef77406f0)
  ffffd001f10d6c20 taskq_d_thread+0xb7(ffffd06583239530)
  ffffd001f10d6c30 thread_start+8()
> ffffd001f10d6c40::thread -d
            ADDR         DISPTIME BOUND PR SWITCH
ffffd001f10d6c40          6cab1e1    -1  0 t-8414980

The door leads to idmapd:

> ffffd063e9e23980::print vnode_t v_data |::print door_node_t door_target
door_target = 0xffffd0632f835068

> ::ps -f !grep ffffd0632f835068
R  27179      1  27179  27179      1 0x42000000 ffffd0632f835068 /usr/lib/idmapd

But idmapd is doing nothing:

> 0xffffd0632f835068::walk thread |::stacks
THREAD           STATE    SOBJ                COUNT
ffffd063777b4500 SLEEP    CV                     42
                 swtch+0x141
                 cv_wait_sig_swap_core+0x1b9
                 cv_wait_sig_swap+0x17
                 cv_waituntil_sig+0xbd
                 lwp_park+0x15e
                 syslwp_park+0x63
                 _sys_sysenter_post_swapgs+0x149

ffffd063cced98a0 SLEEP    CV                      1
                 swtch+0x141
                 cv_timedwait_sig_hires+0x39d
                 cv_waituntil_sig+0xfa
                 lwp_park+0x15e
                 syslwp_park+0x63
                 _sys_sysenter_post_swapgs+0x149

ffffd0726810d880 SLEEP    CV                      1
                 swtch+0x141
                 cv_wait_sig_swap_core+0x1b9
                 cv_wait_sig_swap+0x17
                 cv_timedwait_sig_hrtime+0x35
                 poll_common+0x554
                 pollsys+0xe7
                 _sys_sysenter_post_swapgs+0x149

But despite the fact that all its threads are idle, the door server thread pool is empty:

> 0xffffd0632f835068::print proc_t p_server_threads
p_server_threads = {
    p_server_threads.dp_threads = 0
    p_server_threads.dp_cv = {
        _opaque = 0x4f
    }
}

p_server_threads.dp_threads is empty, so we have no servers available, and nothing appears to be stuck releasing them in door_release_server():

> ::stacks -c door_release_server
>

This problem is also affecting smb. Here's an example thread:

> 0xffffd001efd41c40::findstack -v
stack pointer for thread ffffd001efd41c40: ffffd001efd41640
[ ffffd001efd41640 _resume_from_idle+0x112() ]
  ffffd001efd41670 swtch+0x141()
  ffffd001efd41710 turnstile_block+0x21a(0, 0, ffffd063c2e51d60, fffffffffbc09ca0, 0, 0)
  ffffd001efd41780 mutex_vector_enter+0x3a3(ffffd063c2e51d60)
  ffffd001efd417b0 smb_odir_close+0x23(ffffd063c2e51d48)
  ffffd001efd41800 smb_ofile_close+0x136(ffffd06eb40fb080, 0)
  ffffd001efd41840 smb_ofile_close_and_next+0xa5(ffffd06eb40fb080)
  ffffd001efd41870 smb_ofile_close_all+0x40(ffffd06557b8b090)
  ffffd001efd41910 smb_tree_disconnect+0xf8(ffffd06557b8b090, 1)
  ffffd001efd41940 smb_session_tree_dtor+0x1d(ffffd06557b8b090)
  ffffd001efd41980 smb_llist_flush+0x5e(ffffd06566a78138)
  ffffd001efd419b0 smb_llist_exit+0x20(ffffd06566a78138)
  ffffd001efd419f0 smb_session_disconnect_owned_trees+0x79(ffffd06566a77be8, ffffd06e74a9e318)
  ffffd001efd41a30 smb_user_logoff+0x9c(ffffd06e74a9e318)
  ffffd001efd41ac0 smb2_session_setup+0x29c(ffffd06ede695cd0)
  ffffd001efd41b50 smb2sr_work+0x352(ffffd06ede695cd0)
  ffffd001efd41b90 smb2_tq_work+0x73(ffffd06ede695cd0)
  ffffd001efd41c20 taskq_d_thread+0xb7(ffffd070860aee00)
  ffffd001efd41c30 thread_start+8()
> ffffd063c2e51d60::mutex
            ADDR  TYPE             HELD MINSPL OLDSPL WAITERS
ffffd063c2e51d60 adapt ffffd001efbbbc40      -      -     yes
> ffffd001efbbbc40::findstack -v
stack pointer for thread ffffd001efbbbc40: ffffd001efbbb080
[ ffffd001efbbb080 _resume_from_idle+0x112() ]
  ffffd001efbbb0b0 swtch+0x141()
  ffffd001efbbb150 turnstile_block+0x21a(ffffd0722d60e6b0, 0, ffffd063e49802e0, fffffffffbc09ca0, 0, 0)
  ffffd001efbbb1c0 mutex_vector_enter+0x3a3(ffffd063e49802e0)
  ffffd001efbbb280 zfs_zaccess_aces_check+0x62(ffffd063e4980218, ffffd001efbbb29c, 1, ffffd063c1e56788)
  ffffd001efbbb2c0 zfs_has_access+0x36(ffffd063e4980218, ffffd063c1e56788)
  ffffd001efbbb540 zfs_readdir+0x614(ffffd063c40d3a00, ffffd001efbbb5f0, ffffd063c1e56788, ffffd001efbbb6bc,
  ffffffffc0301fb0, 3)
  ffffd001efbbb5d0 fop_readdir+0x6b(ffffd063c40d3a00, ffffd001efbbb5f0, ffffd063c1e56788, ffffd001efbbb6bc,
  ffffffffc0301fb0, 3)
  ffffd001efbbb690 smb_vop_readdir+0x138(ffffd063c40d3a00, 0, ffffd063c2e51eec, ffffd063c2e51dac,
  ffffd001efbbb6bc, 8, ffffd063c1e56788)
  ffffd001efbbb6f0 smb_odir_next_odirent+0x114(ffffd063c2e51d48, ffffd064bc9a51c0)
  ffffd001efbbb780 smb_odir_read_fileinfo+0xf8(ffffd063d1063930, ffffd063c2e51d48, ffffd001efbbb8e8,
  ffffd001efbbb8d8)
  ffffd001efbbb830 smb2_find_entries+0x11a(ffffd063d1063930, ffffd063c2e51d48, ffffd001efbbb8d0)
  ffffd001efbbbac0 smb2_query_dir+0x3f5(ffffd063d1063930)
  ffffd001efbbbb50 smb2sr_work+0x352(ffffd063d1063930)
  ffffd001efbbbb90 smb2_tq_work+0x73(ffffd063d1063930)
  ffffd001efbbbc20 taskq_d_thread+0xb7(ffffd09b9a105d30)
  ffffd001efbbbc30 thread_start+8()
> ffffd063e49802e0::mutex
            ADDR  TYPE             HELD MINSPL OLDSPL WAITERS
ffffd063e49802e0 adapt ffffd001eebf5c40      -      -     yes
> ffffd001eebf5c40::findstack -v
stack pointer for thread ffffd001eebf5c40: ffffd001eebf46a0
[ ffffd001eebf46a0 _resume_from_idle+0x112() ]
  ffffd001eebf46e0 swtch_to+0xb6(ffffd063c4dec760)
  ffffd001eebf4730 shuttle_resume+0x2af(ffffd063c4dec760, fffffffffbd17130)
  ffffd001eebf47e0 door_upcall+0x212(ffffd063e9e23980, ffffd001eebf49b0, 0, ffffffffffffffff, 0)
  ffffd001eebf4860 door_ki_upcall_limited+0x67(ffffd0637fb31f60, ffffd001eebf49b0, 0, ffffffffffffffff, 0)
  ffffd001eebf48c0 kidmap_call_door+0xb3(ffffd063758819c0, ffffd001eebf49b0)
  ffffd001eebf4a70 kidmap_rpc_call+0x213(ffffd063758819c0, 1, fffffffff7940ff0, ffffd001eebf4b50,
  fffffffff7940eb0, ffffd001eebf4b30)
  ffffd001eebf4bb0 kidmap_getuidbysid+0x102(fffffffffbcf3f00, ffffd063bd2e9718, 8401, ffffd001eebf4bdc)
  ffffd001eebf4c10 zfs_fuid_map_id+0x70(ffffd0639c314800, 200008401, ffffd062ddd36db0, 2)
  ffffd001eebf4c90 zfs_user_in_cred+0x5c(ffffd0639c314800, 200008401, ffffd062ddd36db0)
  ffffd001eebf4d50 zfs_zaccess_aces_check+0x133(ffffd063e4980218, ffffd001eebf4e1c, 0, ffffd062ddd36db0)
  ffffd001eebf4dd0 zfs_zaccess_common+0xd8(ffffd063e4980218, 80, ffffd001eebf4e1c, ffffd001eebf4e18, 0,
  ffffd062ddd36db0)
  ffffd001eebf4e80 zfs_zaccess+0xeb(ffffd063e4980218, 80, 0, 0, ffffd062ddd36db0)
  ffffd001eebf4f90 zfs_getattr+0x166(ffffd09b95f66340, ffffd001eebf50e0, 0, ffffd062ddd36db0, ffffffffc0301fb0)
  ffffd001eebf5010 fop_getattr+0xa8(ffffd09b95f66340, ffffd001eebf50e0, 0, ffffd062ddd36db0, ffffffffc0301fb0)
  ffffd001eebf5250 smb_vop_getattr+0xc2(ffffd09b95f66340, 0, ffffd001eebf5490, 0, ffffd062ddd36db0)
  ffffd001eebf5440 smb_vop_lookup+0x19b(ffffd063c40d3a00, ffffd071dc20f3c0, ffffd001eebf5540, ffffd06f3ac0e780,
  18, ffffd001eebf555c, ffffd063c40d3a00, ffffd001eebf5490, ffffd06460f42118)
  ffffd001eebf55c0 smb_fsop_lookup+0x15c(ffffd09b5143e2e0, ffffd06460f42118, 10, ffffd063c2d1d688,
  ffffd063c2d1d688, ffffd071dc20f3c0, ffffd001eebf5698)
  ffffd001eebf56f0 smb_odir_wildcard_fileinfo+0x70(ffffd09b5143e2e0, ffffd063d200b2a8, ffffd071dc20f3c0,
  ffffd001eebf58e8)
  ffffd001eebf5780 smb_odir_read_fileinfo+0x13e(ffffd09b5143e2e0, ffffd063d200b2a8, ffffd001eebf58e8,
  ffffd001eebf58d8)
  ffffd001eebf5830 smb2_find_entries+0x11a(ffffd09b5143e2e0, ffffd063d200b2a8, ffffd001eebf58d0)
  ffffd001eebf5ac0 smb2_query_dir+0x3f5(ffffd09b5143e2e0)
  ffffd001eebf5b50 smb2sr_work+0x352(ffffd09b5143e2e0)
  ffffd001eebf5b90 smb2_tq_work+0x73(ffffd09b5143e2e0)
  ffffd001eebf5c20 taskq_d_thread+0xb7(ffffd0ea3ad38d28)
  ffffd001eebf5c30 thread_start+8()
> ffffd001eebf5c40::thread -d
            ADDR         DISPTIME BOUND PR SWITCH
ffffd001eebf5c40          6c8bca8    -1  0 t-8543293

So, somehow everything ends up stuck in calls that ultimately block talking to idmap via door calls, while idmapd itself is idle but its door server pool is empty.

History

#1

Updated by Gordon Ross 11 months ago

  • Description updated (diff)
  • Status changed from New to In Progress
#2

Updated by Gordon Ross 11 months ago

Analysis from J B:

(lots of) threads starting in SMB and blocked in ZFS waiting on a mutex.
What's the bet we're going after some form of idmap lookup?

Let's start with one of the threads that is most commonly seen:


> ffffd001f1487c40::findstack -v
stack pointer for thread ffffd001f1487c40: ffffd001f1487a40
[ ffffd001f1487a40 _resume_from_idle+0x112() ]
  ffffd001f1487a70 swtch+0x141()
  ffffd001f1487ab0 cv_wait+0x70(ffffd11461920b18, ffffd11461920b10)
  ffffd001f1487ae0 smb_slist_wait_for_empty+0x42(ffffd11461920b10)
  ffffd001f1487b20 smb_session_cancel+0x3b(ffffd114619206c8)
  ffffd001f1487b60 smb_session_receiver+0x9f(ffffd114619206c8)
  ffffd001f1487b90 smb_server_receiver+0x18(ffffd114619206c8)
  ffffd001f1487c20 taskq_d_thread+0xb7(ffffd113bc72dd38)
  ffffd001f1487c30 thread_start+8()

> smb_slist_wait_for_empty::nm -f ctype
C Type
void (*)(smb_slist_t *)

> ffffd11461920b10::print smb_slist_t
{
    sl_mutex = {
        _opaque = [ 0 ]
    }
    sl_cv = {
        _opaque = 0x1
    }
    sl_list = {
        list_size = 0x640
        list_offset = 0
        list_head = {
            list_next = 0xffffd11420b7c308
            list_prev = 0xffffd11469f376a8
        }
    }
    sl_count = 0x7
    sl_waiting = 0x1 (B_TRUE)
}

OK, we can see that sl_count > 0, so that'll be why we're waiting.
We only decrement this via smb_slist_remove():

> ::stacks -c smb_slist_remove

No calls to it are present in the dump.
That function is called from:

  File          Function                   Line
0 smb_kshare.c  smb_kshare_fini             344 smb_slist_remove(&sv->sv_export.e_unexport_list, ux);
1 smb_kshare.c  smb_kshare_unexport_thread 1398 smb_slist_remove(&sv->sv_export.e_unexport_list, ux);
2 smb_session.c smb_request_free           1533 smb_slist_remove(&sr->session->s_req_list, sr);

> ::stacks -c smb_kshare_fini
> ::stacks -c smb_request_free
> ::stacks -c smb_kshare_unexport_thread
THREAD           STATE    SOBJ                COUNT
ffffd001f20c3c40 SLEEP    CV                      1
                 swtch+0x141
                 cv_wait+0x70
                 smb_thread_continue_timedwait_locked+0x5d
                 smb_thread_continue+0x2d
                 smb_kshare_unexport_thread+0x40
                 smb_thread_entry_point+0x53
                 thread_start+8

Of those calling functions we only see one, and it's idle rather than actively removing anything.

OK, so 139 threads started in SMB and made door calls like this:

> ffffd001f35c3c40::findstack -v
stack pointer for thread ffffd001f35c3c40: ffffd001f35c3600
[ ffffd001f35c3600 _resume_from_idle+0x112() ]
  ffffd001f35c3630 swtch+0x141()
  ffffd001f35c3670 cv_wait+0x70(ffffd0638ec574d8, fffffffffbd17130)
  ffffd001f35c3700 cv_wait_sig_swap_core+0x8a(ffffd0638ec574d8, fffffffffbd17130, ffffd001f35c372c)
  ffffd001f35c3770 door_get_server+0xc1(ffffd11b9b2e0018)
  ffffd001f35c3820 door_upcall+0x166(ffffd11483883880, ffffd001f35c38b0, 0, ffffffffffffffff, 0)
  ffffd001f35c38a0 door_ki_upcall_limited+0x67(ffffd11300827de8, ffffd001f35c38b0, 0, ffffffffffffffff, 0)
  ffffd001f35c3910 smb_kdoor_upcall_private+0x65(ffffd1134e610000, ffffd001f35c3930)
  ffffd001f35c3a30 smb_kdoor_upcall+0x24f(ffffd1134e610000, 4, ffffd001f35c3a4c, fffffffffbb69960, 0, 0)
  ffffd001f35c3a60 smb_user_auth_logoff+0x34(ffffd1140dc5d000)
  ffffd001f35c3aa0 smb_user_logoff+0xa4(ffffd1140dc5d000)
  ffffd001f35c3ae0 smb_session_logoff+0x67(ffffd1146a9f8e28)
  ffffd001f35c3b20 smb_session_cancel+0x80(ffffd1146a9f8e28)
  ffffd001f35c3b60 smb_session_receiver+0x9f(ffffd1146a9f8e28)
  ffffd001f35c3b90 smb_server_receiver+0x18(ffffd1146a9f8e28)
  ffffd001f35c3c20 taskq_d_thread+0xb7(ffffd6855f7134a8)
  ffffd001f35c3c30 thread_start+8()

> door_get_server::nm -f ctype
C Type
kthread_t * (*)(door_node_t *)

> ffffd11b9b2e0018::print door_node_t
{
    door_vnode = 0xffffd11483883880
    door_target = 0xffffd0638ec57070
    door_list = 0
    door_ulist = 0
    door_pc = 0x8059f28
    door_data = 0x806095c
    door_index = 0xdcf6d
    door_flags = 0x1
    door_active = 0x21
    door_servers = {
        dp_threads = 0
        dp_cv = {
            _opaque = 0
        }
    }
    door_data_max = 0xffffffff
    door_data_min = 0
    door_desc_max = 0x7fffffff
    door_bound_threads = 0
}

> ffffd11b9b2e0018::print door_node_t door_target | ::ps -f
S    PID   PPID   PGID    SID    UID      FLAGS             ADDR NAME
R  27306      1  27306  27306      0 0x42000000 ffffd0638ec57070 /usr/lib/smbsrv/smbd start

OK, it's trying to talk to smbd.

> ffffd0638ec57070::walk thread ! wc -l
     346

And there are a lot of smbd threads in door_get_server() calls, e.g.:

stack pointer for thread ffffd06370941b60: ffffd001ebcedc60
[ ffffd001ebcedc60 _resume_from_idle+0x112() ]
  ffffd001ebcedc90 swtch+0x141()
  ffffd001ebcedd20 cv_wait_sig_swap_core+0x1b9(ffffd0638509d4a0, fffffffffbd17130, ffffd001ebcedd4c)
  ffffd001ebcedd90 door_get_server+0xc1(ffffd11b9fee5498)
  ffffd001ebcede50 door_call+0x25f(a2, e6b329d0)
  ffffd001ebcedec0 doorfs32+0xa7(a2, e6b329d0, 0, 0, 0, 3)
  ffffd001ebcedf10 _sys_sysenter_post_swapgs+0x149()
stack pointer for thread ffffd06371894b60: ffffd001f1e98c60
[ ffffd001f1e98c60 _resume_from_idle+0x112() ]
  ffffd001f1e98c90 swtch+0x141()
  ffffd001f1e98d20 cv_wait_sig_swap_core+0x1b9(ffffd0638509d4a0, fffffffffbd17130, ffffd001f1e98d4c)
  ffffd001f1e98d90 door_get_server+0xc1(ffffd11b9fee5498)
  ffffd001f1e98e50 door_call+0x25f(a2, e6a339d0)
  ffffd001f1e98ec0 doorfs32+0xa7(a2, e6a339d0, 0, 0, 0, 3)
  ffffd001f1e98f10 _sys_sysenter_post_swapgs+0x149()
stack pointer for thread ffffd063718947c0: ffffd001f548fc60
[ ffffd001f548fc60 _resume_from_idle+0x112() ]
  ffffd001f548fc90 swtch+0x141()
  ffffd001f548fd20 cv_wait_sig_swap_core+0x1b9(ffffd0638509d4a0, fffffffffbd17130, ffffd001f548fd4c)
  ffffd001f548fd90 door_get_server+0xc1(ffffd11b9fee5498)
  ffffd001f548fe50 door_call+0x25f(a2, e69349d0)
  ffffd001f548fec0 doorfs32+0xa7(a2, e69349d0, 0, 0, 0, 3)
  ffffd001f548ff10 _sys_sysenter_post_swapgs+0x149()
stack pointer for thread ffffd06371a44860: ffffd001f5495c60
[ ffffd001f5495c60 _resume_from_idle+0x112() ]
  ffffd001f5495c90 swtch+0x141()
  ffffd001f5495d20 cv_wait_sig_swap_core+0x1b9(ffffd0638509d4a0, fffffffffbd17130, ffffd001f5495d4c)
  ffffd001f5495d90 door_get_server+0xc1(ffffd11b9fee5498)
  ffffd001f5495e50 door_call+0x25f(a2, e68359d0)
  ffffd001f5495ec0 doorfs32+0xa7(a2, e68359d0, 0, 0, 0, 3)
  ffffd001f5495f10 _sys_sysenter_post_swapgs+0x149()

... which is idmapd:
> ffffd11b9fee5498::print door_node_t door_target | ::ps -f
S    PID   PPID   PGID    SID    UID      FLAGS             ADDR NAME
R  27275      1  27275  27275      1 0x42000000 ffffd0638509d038 /usr/lib/idmapd

> ffffd0638509d038::walk thread | ::findstack -v
mdb: thread ffffd11352162c00 isn't in memory
mdb: thread ffffd0fb82d3e180 isn't in memory
mdb: thread ffffd0fb82d46b80 isn't in memory
mdb: thread ffffd063715fb820 isn't in memory
mdb: thread ffffd113521ecb60 isn't in memory
mdb: thread ffffd0fb82d2f8c0 isn't in memory
mdb: thread ffffd06372f35b80 isn't in memory
mdb: thread ffffd11352161880 isn't in memory
mdb: thread ffffd11360f5ab60 isn't in memory
mdb: thread ffffd113435c14e0 isn't in memory
mdb: thread ffffd06371d5bc00 isn't in memory
mdb: thread ffffd06372f428c0 isn't in memory
mdb: thread ffffd0fb82d40140 isn't in memory
mdb: thread ffffd11361041080 isn't in memory
mdb: thread ffffd06372ef4b20 isn't in memory
mdb: thread ffffd0fb82d03c40 isn't in memory
mdb: thread ffffd0fb82d62160 isn't in memory
mdb: thread ffffd0fb82d48b40 isn't in memory
mdb: thread ffffd0fb82d3c760 isn't in memory
mdb: thread ffffd0fb82d40880 isn't in memory
mdb: thread ffffd06390033c60 isn't in memory
mdb: thread ffffd0fb82d46440 isn't in memory
mdb: thread ffffd06372f3c400 isn't in memory
mdb: thread ffffd063ad1d5420 isn't in memory
mdb: thread ffffd0fb82cff040 isn't in memory
mdb: thread ffffd0fb82d028c0 isn't in memory
mdb: thread ffffd06372d51160 isn't in memory
mdb: thread ffffd0fb82d60c60 isn't in memory
mdb: thread ffffd06371b8f120 isn't in memory
mdb: thread ffffd0632ec500a0 isn't in memory
mdb: thread ffffd06371d5e100 isn't in memory
mdb: thread ffffd113520b9b00 isn't in memory
mdb: thread ffffd0fb82d2c840 isn't in memory
mdb: thread ffffd0fb82d16b40 isn't in memory
mdb: thread ffffd0fb82d40c20 isn't in memory
mdb: thread ffffd063ad1d67a0 isn't in memory
mdb: thread ffffd06372cd3140 isn't in memory
mdb: thread ffffd0fb82cf6160 isn't in memory
mdb: thread ffffd0fb82d44460 isn't in memory
mdb: thread ffffd0fb82d00b00 isn't in memory
mdb: thread ffffd0fb82d29160 isn't in memory
mdb: thread ffffd0fb82d003c0 isn't in memory
mdb: thread ffffd06371d5b860 isn't in memory
mdb: thread ffffd1180a642020 isn't in memory

Also, looking at the system logs, we've apparently had trouble contacting AD:

@ Thu Feb 21 03:14:23 2019
AD ldap_search_ext error (Can't contact LDAP server) - 1 queued requests
Using server vidc8.rdg-home.ad.rdg.ac.uk:389
@ Thu Feb 21 03:27:44 2019
thread creation refused - 40 threads currently active
thread creation refused - 40 threads currently active
thread creation refused - 40 threads currently active
thread creation refused - 40 threads currently active
thread creation refused - 40 threads currently active
thread creation refused - 40 threads currently active
thread creation refused - 40 threads currently active
thread creation refused - 40 threads currently active
...

#3

Updated by Gordon Ross 11 months ago

We were able to get gcores from idmapd and smbd, which allowed tracking down a deadlock that can happen when a call from idmapd to smbd (e.g. lsa_lookup_sid) is unlucky enough to need the DC locator functionality in idmap. That call sequence (idmapd calls smbd, which calls back into idmapd) can put the two processes into a deadlock.

Until we can remove those "reverse direction" calls, we need to make sure they're safe for idmap to call.
That means they should not block for a long time, and should avoid calling back into idmap again.

The calls in question are all conditional on the "use_lsa" idmap configuration option. We've been working around this problem by setting that false. That's OK as a work-around, but can cause some differences in idmap mapping results.

The main reason for the "use_lsa" option is to allow idmap to lookup local SMB user and group accounts (and find out what type they are). Currently, those calls also lookup account info via the SMB server's connection out to an AD server.
Those calls out to AD may require getting updated information from idmap about who is the AD server.
Getting the AD server info can block, and the calls out to the AD server can also block.

The idmap code doesn't actually need the "use_lsa" calls to go out to AD because idmap can do lookup calls to AD on its own, and will do so after the "use_lsa" calls are done. The near-term fix I'm using is to let idmap use a variant of these "use_lsa" calls to smbd that perform only the local lookup operations, skipping the blocking calls related to making a request out to AD.

Just to clarify: NFS and ZFS are not involved in this problem other than as victims.

#4

Updated by Gordon Ross 11 months ago

Fix is out for review as part of this PR:
https://github.com/illumos/illumos-gate/pull/68

#5

Updated by Gordon Ross 11 months ago

  • Description updated (diff)

Added more complete copy of the crash dump analysis.

#6

Updated by Electric Monk 10 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

git commit dafb549fce563c2f4123e8957dc8a1d921bf0fca

commit  dafb549fce563c2f4123e8957dc8a1d921bf0fca
Author: Gordon Ross <gwr@nexenta.com>
Date:   2019-11-14T14:23:07.000Z

    11039 All zfs/nfs/smb threads in door calls to idle idmap
    Reviewed by: Matt Barden <matt.barden@nexenta.com>
    Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
    Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
    Reviewed by: Andrew Stormont <astormont@racktopsystems.com>
    Approved by: Garrett D'Amore <garrett@damore.org>
