Bug #8040 (Closed, 100% done)
NFSv4 client: 3-way deadlock between nfs4_bio(), nfs4_do_delegreturn(), and nfs4_flush_pages()
Description
A 3-way deadlock between nfs4_readahead(), nfs4_do_delegreturn(), and nfs4_purge_caches() threads was detected.
The nfs4_readahead() thread is waiting for the nfs4_do_delegreturn() thread, the nfs4_do_delegreturn() thread is waiting for the nfs4_purge_caches() thread, and finally the nfs4_purge_caches() thread is waiting for the nfs4_readahead() thread:
> ffffff0049061c40::findstack -v
stack pointer for thread ffffff0049061c40: ffffff00490614b0
[ ffffff00490614b0 _resume_from_idle+0xf4() ]
  ffffff00490614e0 swtch+0x141()
  ffffff0049061520 cv_wait+0x70(ffffff0aa1daa750, ffffff0aa1daa748)
  ffffff0049061590 cv_wait_sig+0x82(ffffff0aa1daa750, ffffff0aa1daa748)
  ffffff00490615f0 nfs_rw_enter_sig+0x175(ffffff0aa1daa738, 1, 4)
  ffffff0049061640 wait_for_recall1+0x5e(ffffff0aaef08500, 1, ffffff0049061920, 8)
  ffffff00490616a0 wait_for_recall+0x48(ffffff0aaef08500, 0, 1, ffffff0049061920)
  ffffff0049061760 nfs4_start_fop+0xac(ffffff0aa6275000, ffffff0aaef08500, 0, 1, ffffff0049061920, 0)
  ffffff00490619b0 nfs4read+0x101(ffffff0aaef08500, ffffff0af2c29000, 14f8f8000, 8000, ffffff4c0c6bde50, ffffffce6a67e7c8, ffffff0a00000001, 0)
  ffffff0049061a70 nfs4_bio+0xf1(ffffff4c0c6bde00, 0, ffffffce6a67e7c8, 1)
  ffffff0049061b30 nfs4_readahead+0x185(ffffff0aaef08500, 14f8f8000, e1ee000, fffffffffbc34880, ffffffce6a67e7c8)
  ffffff0049061c00 nfs4_async_common_start+0x34d(ffffffce9bdd6140, 0)
  ffffff0049061c20 nfs4_async_start+0x13(ffffffce9bdd6140)
  ffffff0049061c30 thread_start+8()
> 0xffffff0046ecdc40::findstack -v
stack pointer for thread ffffff0046ecdc40: ffffff0046ecd5d0
[ ffffff0046ecd5d0 _resume_from_idle+0xf4() ]
  ffffff0046ecd600 swtch+0x141()
  ffffff0046ecd640 cv_wait+0x70(ffffff0aa1daa5a4, ffffff0aa1daa480)
  ffffff0046ecd6b0 cv_wait_sig+0x82(ffffff0aa1daa5a4, ffffff0aa1daa480)
  ffffff0046ecd760 nfs4_attr_cache+0x120(ffffff0aaef08500, ffffffc73ed05158, 2db841414e80ff, ffffff231f730400, 1, 0)
  ffffff0046ecd980 nfs4delegreturn_otw+0x153(ffffff0aa1daa3e8, ffffff231f730400, ffffff0046ecda90)
  ffffff0046ecdaf0 nfs4_do_delegreturn+0xea(ffffff0aa1daa3e8, 30, ffffff231f730400, ffffff0a8ec26b80)
  ffffff0046ecdb70 nfs4delegreturn_impl+0x1c9(ffffff0aa1daa3e8, 30, ffffff0a8ec26b80)
  ffffff0046ecdc20 nfs4delegreturn_thread+0x220(ffffff0af3540200)
  ffffff0046ecdc30 thread_start+8()
> ffffff0a9efa27c0::findstack -v
stack pointer for thread ffffff0a9efa27c0: ffffff004b79f070
[ ffffff004b79f070 _resume_from_idle+0xf4() ]
  ffffff004b79f0a0 swtch+0x141()
  ffffff004b79f0e0 cv_wait+0x70(ffffff0022af9f04, ffffff003cf040c0)
  ffffff004b79f160 page_lock_es+0x139(ffffff0022af9ec0, 1, fffffffffbc86590, 1, 0)
  ffffff004b79f190 page_lock+0x20(ffffff0022af9ec0, 1, fffffffffbc86590, 1)
  ffffff004b79f260 pvn_vplist_dirty+0x21c(ffffff0aaef08500, 0, fffffffff8157f20, 10000, ffffff231f730400)
  ffffff004b79f310 nfs4_putpages+0x298(ffffff0aaef08500, 0, 0, 10000, ffffff231f730400)
  ffffff004b79f3a0 nfs4_putpage+0xa9(ffffff0aaef08500, 0, 0, 10000, ffffff231f730400, 0)
  ffffff004b79f420 fop_putpage+0x4c(ffffff0aaef08500, 0, 0, 10000, ffffff231f730400, 0)
  ffffff004b79f460 nfs4_flush_pages+0x30(ffffff0aaef08500, ffffff231f730400)
  ffffff004b79f4f0 nfs4_purge_caches+0x1cb(ffffff0aaef08500, 0, ffffff231f730400, 0)
  ffffff004b79f5a0 nfs4_attr_cache+0x2b9(ffffff0aaef08500, ffffff938abe2080, 2db84105dd7ba7, ffffff231f730400, 0, 0)
  ffffff004b79f5f0 nfs4_update_attrcache+0x41(0, ffffff938abe2080, 2db84105dd7ba7, ffffff0aaef08500, ffffff231f730400)
  ffffff004b79fac0 nfs4setattr+0x70a(ffffff0aaef08500, ffffff004b79fae0, 0, ffffff231f730400, 0)
  ffffff004b79fbd0 nfs4_space+0xd7(ffffff0aaef08500, b, ffffff004b79fd30, 2002, 0, ffffff231f730400, 0)
  ffffff004b79fc60 fop_space+0x59(ffffff0aaef08500, b, ffffff004b79fd30, 2002, 0, ffffff231f730400, 0)
  ffffff004b79fec0 fcntl+0x474(f, 1b, 8047514)
  ffffff004b79ff10 _sys_sysenter_post_swapgs+0x149()
>
All three threads have been in this state for a very long time now:
> ffffff0049061c40::thread -d
            ADDR  DISPTIME BOUND PR SWITCH
ffffff0049061c40  4cb450ac    -1  0 t-20115988
> 0xffffff0046ecdc40::thread -d
            ADDR  DISPTIME BOUND PR SWITCH
ffffff0046ecdc40  4cb450ac    -1  0 t-20115988
> ffffff0a9efa27c0::thread -d
            ADDR  DISPTIME BOUND PR SWITCH
ffffff0a9efa27c0  4cb45048    -1  0 t-20116088
>
Updated by Marcel Telka over 6 years ago
Crash dump analysis
We detected a stuck nfs4_ephemeral_harvester() thread:
> ::stacks -c nfs4_ephemeral_harvester|::thread -d
            ADDR  DISPTIME BOUND PR SWITCH
ffffff003e7f7c40  4cb49c5a    -1  0 t-20096614
> ::stacks -c nfs4_ephemeral_harvester|::findstack -v
stack pointer for thread ffffff003e7f7c40: ffffff003e7f7890
[ ffffff003e7f7890 _resume_from_idle+0xf4() ]
  ffffff003e7f78c0 swtch+0x141()
  ffffff003e7f7900 cv_wait+0x70(ffffff0aa627514e, ffffff0aa6275150)
  ffffff003e7f7970 cv_wait_sig+0x82(ffffff0aa627514e, ffffff0aa6275150)
  ffffff003e7f79c0 nfs4_async_stop_sig+0x84(ffffffce9bdd6140)
  ffffff003e7f7a30 nfs4_unmount+0x64(ffffffce9bdd6140, 8, ffffff0a871bfdb0)
  ffffff003e7f7a60 fsop_unmount+0x1b(ffffffce9bdd6140, 8, ffffff0a871bfdb0)
  ffffff003e7f7ab0 dounmount+0x57(ffffffce9bdd6140, 8, ffffff0a871bfdb0)
  ffffff003e7f7b00 umount2_engine+0x96(ffffffce9bdd6140, 8, ffffff0a871bfdb0, 0)
  ffffff003e7f7b40 nfs4_ephemeral_record_umount+0x39(ffffffce9bdd6140, 8, ffffff0a908a15c8, ffffff0be6aa9070)
  ffffff003e7f7bf0 nfs4_ephemeral_harvest_forest+0x1db(ffffff0a9dc4b888, 0, 1)
  ffffff003e7f7c20 nfs4_ephemeral_harvester+0x77(ffffff0a9dc4b888)
  ffffff003e7f7c30 thread_start+8()
>
The nfs4_ephemeral_harvester() thread is waiting for mi->mi_async_cv at address ffffff0aa627514e.
> ::offsetof mntinfo4_t mi_async_cv
offsetof (mntinfo4_t, mi_async_cv) = 0x14e, sizeof (...->mi_async_cv) = 2
> ffffff0aa627514e-0x14e=K
                ffffff0aa6275000
>
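The arithmetic above is the usual container-of pattern: given the address of a member, subtract the member's offset to recover the containing structure. A minimal C sketch of the same computation, using a made-up stand-in struct rather than the real mntinfo4_t (whose mi_async_cv really sits at offset 0x14e):

```c
#include <stddef.h>

/*
 * demo_mntinfo is a hypothetical stand-in for mntinfo4_t; only the
 * shape matters: a member buried at some nonzero offset.
 */
struct demo_mntinfo {
	int	mi_flags;
	char	mi_pad[0x146];	/* filler so the member has a nonzero offset */
	short	mi_async_cv;	/* the member whose address the dump gave us */
};

/* The C equivalent of the mdb expression "cv_addr - offset = base". */
static struct demo_mntinfo *
demo_mi_from_cv(short *cvp)
{
	return ((struct demo_mntinfo *)
	    ((char *)cvp - offsetof(struct demo_mntinfo, mi_async_cv)));
}
```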
The mntinfo4 is ffffff0aa6275000.
> ffffff0aa6275000::print -a mntinfo4_t mi_async_reqs mi_async_tail
ffffff0aa62750a0 mi_async_reqs = [ 0, 0, 0, 0, 0, 0 ]
ffffff0aa62750d0 mi_async_tail = [ 0, 0, 0, 0xffffff0ab6a60400, 0, 0 ]
>
The async requests queue is empty, but the last request we removed from there is still referenced in mi_async_tail. It is 0xffffff0ab6a60400.
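An empty queue with a stale tail is what you get from a head/tail singly-linked queue whose dequeue path never clears the tail. The sketch below models that behavior; demo_req and demo_queue are made-up names standing in for the mi_async_reqs/mi_async_tail pair, not the real kernel structures:

```c
#include <stddef.h>

struct demo_req {
	struct demo_req	*a_next;
};

struct demo_queue {
	struct demo_req	*head;	/* analogue of mi_async_reqs[i] */
	struct demo_req	*tail;	/* analogue of mi_async_tail[i] */
};

static void
demo_enqueue(struct demo_queue *q, struct demo_req *req)
{
	req->a_next = NULL;
	if (q->head == NULL)
		q->head = req;
	else
		q->tail->a_next = req;
	q->tail = req;
}

static struct demo_req *
demo_dequeue(struct demo_queue *q)
{
	struct demo_req *req = q->head;

	if (req != NULL)
		q->head = req->a_next;
	/* tail is left untouched: once the queue drains, it goes stale */
	return (req);
}
```

Under this model the stale tail is harmless (enqueue checks head, not tail), which is why the dump can legitimately show an empty queue alongside a non-NULL mi_async_tail.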
> 0xffffff0ab6a60400::print
struct nfs4_async_reqs {
    a_next = 0
    a_vp = 0xffffff0aaef08500
    a_cred = 0xffffffce6a67e7c8
    a_io = 3 (NFS4_READ_AHEAD)
    a_args = {
        a_read_args = {
            readahead = nfs4_readahead
            blkoff = 0x14f910000
            seg = kmapseg
            addr = 0x1097000
        }
        a_pageio_args = {
            pageio = nfs4_readahead
            pp = 0x14f910000
            io_off = 0xfffffffffbc34880
            io_len = 0x1097000
            flags = 0
        }
        a_readdir_args = {
            readdir = nfs4_readahead
            rdc = 0x14f910000
        }
        a_commit_args = {
            commit = nfs4_readahead
            plist = 0x14f910000
            offset = 0xfffffffffbc34880
            count = 0x1097000
        }
    }
}
>
The request I/O type is NFS4_READ_AHEAD for vnode 0xffffff0aaef08500. The request should be handled by the nfs4_readahead() function (a_args.a_read_args.readahead).
> ::stacks -c nfs4_readahead|::findstack -v!grep nfs4_readahead
  ffffff0049061b30 nfs4_readahead+0x185(ffffff0aaef08500, 14f8f8000, e1ee000, fffffffffbc34880, ffffffce6a67e7c8)
  ffffff004cf81b30 nfs4_readahead+0x185(ffffff0aaef08500, 14f908000, 1093000, fffffffffbc34880, ffffffce6a67e7c8)
  ffffff004593cb30 nfs4_readahead+0x185(ffffff0aaef08500, 14f900000, 108f000, fffffffffbc34880, ffffffce6a67e7c8)
  ffffff00434c5b30 nfs4_readahead+0x185(ffffff0aaef08500, 14f910000, 1097000, fffffffffbc34880, ffffffce6a67e7c8)
>
We have 4 nfs4_readahead() threads. All of them are working on our vnode (ffffff0aaef08500) at different offsets.
> ::stacks -c nfs4_readahead
THREAD           STATE    SOBJ                COUNT
ffffff0049061c40 SLEEP    CV                      4
                 swtch+0x141
                 cv_wait+0x70
                 cv_wait_sig+0x82
                 nfs_rw_enter_sig+0x175
                 wait_for_recall1+0x5e
                 wait_for_recall+0x48
                 nfs4_start_fop+0xac
                 nfs4read+0x101
                 nfs4_bio+0xf1
                 nfs4_readahead+0x185
                 nfs4_async_common_start+0x34d
                 nfs4_async_start+0x13
                 thread_start+8
> ::stacks -c nfs4_readahead|::findstack -v!grep wait_for_recall\\\>
  ffffff00490616a0 wait_for_recall+0x48(ffffff0aaef08500, 0, 1, ffffff0049061920)
  ffffff004cf816a0 wait_for_recall+0x48(ffffff0aaef08500, 0, 1, ffffff004cf81920)
  ffffff004593c6a0 wait_for_recall+0x48(ffffff0aaef08500, 0, 1, ffffff004593c920)
  ffffff00434c56a0 wait_for_recall+0x48(ffffff0aaef08500, 0, 1, ffffff00434c5920)
>
All 4 threads are waiting for the delegation recall for our vnode, and they have been waiting for quite a long time:
> ::stacks -c nfs4_readahead|::thread -d
            ADDR  DISPTIME BOUND PR SWITCH
ffffff0049061c40  4cb450ac    -1  0 t-20115988
ffffff004cf81c40  4cb450ac    -1  0 t-20115988
ffffff004593cc40  4cb450ac    -1  0 t-20115988
ffffff00434c5c40  4cb450ac    -1  0 t-20115988
>
The threads are waiting to acquire the nfs_rwlock at ffffff0aa1daa738 as READER:
> ffffff0049061c40::findstack -v
stack pointer for thread ffffff0049061c40: ffffff00490614b0
[ ffffff00490614b0 _resume_from_idle+0xf4() ]
  ffffff00490614e0 swtch+0x141()
  ffffff0049061520 cv_wait+0x70(ffffff0aa1daa750, ffffff0aa1daa748)
  ffffff0049061590 cv_wait_sig+0x82(ffffff0aa1daa750, ffffff0aa1daa748)
  ffffff00490615f0 nfs_rw_enter_sig+0x175(ffffff0aa1daa738, 1, 4)
  ffffff0049061640 wait_for_recall1+0x5e(ffffff0aaef08500, 1, ffffff0049061920, 8)
  ffffff00490616a0 wait_for_recall+0x48(ffffff0aaef08500, 0, 1, ffffff0049061920)
  ffffff0049061760 nfs4_start_fop+0xac(ffffff0aa6275000, ffffff0aaef08500, 0, 1, ffffff0049061920, 0)
  ffffff00490619b0 nfs4read+0x101(ffffff0aaef08500, ffffff0af2c29000, 14f8f8000, 8000, ffffff4c0c6bde50, ffffffce6a67e7c8, ffffff0a00000001, 0)
  ffffff0049061a70 nfs4_bio+0xf1(ffffff4c0c6bde00, 0, ffffffce6a67e7c8, 1)
  ffffff0049061b30 nfs4_readahead+0x185(ffffff0aaef08500, 14f8f8000, e1ee000, fffffffffbc34880, ffffffce6a67e7c8)
  ffffff0049061c00 nfs4_async_common_start+0x34d(ffffffce9bdd6140, 0)
  ffffff0049061c20 nfs4_async_start+0x13(ffffffce9bdd6140)
  ffffff0049061c30 thread_start+8()
>
But the lock is held as WRITER by thread 0xffffff0046ecdc40:
> ffffff0aaef08500::print vnode_t v_data|::print -a rnode4_t r_deleg_recall_lock
ffffff0aa1daa738 r_deleg_recall_lock = {
    ffffff0aa1daa738 r_deleg_recall_lock.count = 0xfffffffe
    ffffff0aa1daa73c r_deleg_recall_lock.waiters = 0
    ffffff0aa1daa740 r_deleg_recall_lock.owner = 0xffffff0046ecdc40
    ffffff0aa1daa748 r_deleg_recall_lock.lock = {
        ffffff0aa1daa748 _opaque = [ 0 ]
    }
    ffffff0aa1daa750 r_deleg_recall_lock.cv = {
        ffffff0aa1daa750 _opaque = 0x4
    }
}
>
The 0xffffff0046ecdc40 thread is blocked on a condition variable:
> 0xffffff0046ecdc40::findstack -v
stack pointer for thread ffffff0046ecdc40: ffffff0046ecd5d0
[ ffffff0046ecd5d0 _resume_from_idle+0xf4() ]
  ffffff0046ecd600 swtch+0x141()
  ffffff0046ecd640 cv_wait+0x70(ffffff0aa1daa5a4, ffffff0aa1daa480)
  ffffff0046ecd6b0 cv_wait_sig+0x82(ffffff0aa1daa5a4, ffffff0aa1daa480)
  ffffff0046ecd760 nfs4_attr_cache+0x120(ffffff0aaef08500, ffffffc73ed05158, 2db841414e80ff, ffffff231f730400, 1, 0)
  ffffff0046ecd980 nfs4delegreturn_otw+0x153(ffffff0aa1daa3e8, ffffff231f730400, ffffff0046ecda90)
  ffffff0046ecdaf0 nfs4_do_delegreturn+0xea(ffffff0aa1daa3e8, 30, ffffff231f730400, ffffff0a8ec26b80)
  ffffff0046ecdb70 nfs4delegreturn_impl+0x1c9(ffffff0aa1daa3e8, 30, ffffff0a8ec26b80)
  ffffff0046ecdc20 nfs4delegreturn_thread+0x220(ffffff0af3540200)
  ffffff0046ecdc30 thread_start+8()
> 0xffffff0046ecdc40::thread -d
            ADDR  DISPTIME BOUND PR SWITCH
ffffff0046ecdc40  4cb450ac    -1  0 t-20115988
>
We are waiting there because r_serial is non-NULL:
    483 	while (rp->r_serial != NULL) {
    484 		if (!cv_wait_sig(&rp->r_cv, &rp->r_statelock)) {
    485 			mutex_exit(&rp->r_statelock);
    486 			if (lwp != NULL)
    487 				lwp->lwp_nostop--;
    488 			return;
    489 		}
    490 	}
The r_serial is a pointer to another nfs4_attr_cache() thread we are waiting for:
> ffffff0aaef08500::print vnode_t v_data|::print rnode4_t r_serial|::findstack -v
stack pointer for thread ffffff0a9efa27c0: ffffff004b79f070
[ ffffff004b79f070 _resume_from_idle+0xf4() ]
  ffffff004b79f0a0 swtch+0x141()
  ffffff004b79f0e0 cv_wait+0x70(ffffff0022af9f04, ffffff003cf040c0)
  ffffff004b79f160 page_lock_es+0x139(ffffff0022af9ec0, 1, fffffffffbc86590, 1, 0)
  ffffff004b79f190 page_lock+0x20(ffffff0022af9ec0, 1, fffffffffbc86590, 1)
  ffffff004b79f260 pvn_vplist_dirty+0x21c(ffffff0aaef08500, 0, fffffffff8157f20, 10000, ffffff231f730400)
  ffffff004b79f310 nfs4_putpages+0x298(ffffff0aaef08500, 0, 0, 10000, ffffff231f730400)
  ffffff004b79f3a0 nfs4_putpage+0xa9(ffffff0aaef08500, 0, 0, 10000, ffffff231f730400, 0)
  ffffff004b79f420 fop_putpage+0x4c(ffffff0aaef08500, 0, 0, 10000, ffffff231f730400, 0)
  ffffff004b79f460 nfs4_flush_pages+0x30(ffffff0aaef08500, ffffff231f730400)
  ffffff004b79f4f0 nfs4_purge_caches+0x1cb(ffffff0aaef08500, 0, ffffff231f730400, 0)
  ffffff004b79f5a0 nfs4_attr_cache+0x2b9(ffffff0aaef08500, ffffff938abe2080, 2db84105dd7ba7, ffffff231f730400, 0, 0)
  ffffff004b79f5f0 nfs4_update_attrcache+0x41(0, ffffff938abe2080, 2db84105dd7ba7, ffffff0aaef08500, ffffff231f730400)
  ffffff004b79fac0 nfs4setattr+0x70a(ffffff0aaef08500, ffffff004b79fae0, 0, ffffff231f730400, 0)
  ffffff004b79fbd0 nfs4_space+0xd7(ffffff0aaef08500, b, ffffff004b79fd30, 2002, 0, ffffff231f730400, 0)
  ffffff004b79fc60 fop_space+0x59(ffffff0aaef08500, b, ffffff004b79fd30, 2002, 0, ffffff231f730400, 0)
  ffffff004b79fec0 fcntl+0x474(f, 1b, 8047514)
  ffffff004b79ff10 _sys_sysenter_post_swapgs+0x149()
> ffffff0a9efa27c0::thread -d
            ADDR  DISPTIME BOUND PR SWITCH
ffffff0a9efa27c0  4cb45048    -1  0 t-20116088
>
The thread is operating on page ffffff0022af9ec0 and is waiting until the page is unlocked. Who is holding the lock on the page?
> ffffff0022af9ec0::print page_t p_offset p_vnode p_selock
p_offset = 0x14f8f8000
p_vnode = 0xffffff0aaef08500
p_selock = 0x89061c40
>
The page belongs to vnode 0xffffff0aaef08500 at offset 0x14f8f8000, and the lock holder is recorded in p_selock. But the pointer to the holder is mangled there: only the low 30 bits of its address are stored in p_selock. The highest bit of p_selock is always set to 1, and the 2nd-highest bit is always cleared (see the SE_WRITER definition). There are two possible threads holding the page locked:
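The encoding described above can be sketched in a few lines of C. This is an illustrative model of the scheme as stated (low 30 bits of the owning kthread's address, top bit forced on, next bit off), not the actual kernel macro; DEMO_* names are made up:

```c
#include <stdint.h>

#define DEMO_SE_WRITER_BIT	0x80000000u
#define DEMO_SE_OWNER_MASK	0x3fffffffu

/* Encode a writer-held p_selock value from the owner's thread address. */
static uint32_t
demo_selock_writer(uint64_t thread_addr)
{
	return (DEMO_SE_WRITER_BIT |
	    ((uint32_t)thread_addr & DEMO_SE_OWNER_MASK));
}
```

Under this assumed encoding, thread ffffff0049061c40 encodes exactly to the observed p_selock value 0x89061c40, while ffffff0079061c40 would encode to 0xb9061c40, which already points at the first candidate below.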
> ::thread!grep 9061c40
ffffff0049061c40 sleep    8  2000  3  60  0  0 n/a
ffffff0079061c40 sleep    8  2000  3  60  0  0 n/a
> ffffff0049061c40::findstack -v
stack pointer for thread ffffff0049061c40: ffffff00490614b0
[ ffffff00490614b0 _resume_from_idle+0xf4() ]
  ffffff00490614e0 swtch+0x141()
  ffffff0049061520 cv_wait+0x70(ffffff0aa1daa750, ffffff0aa1daa748)
  ffffff0049061590 cv_wait_sig+0x82(ffffff0aa1daa750, ffffff0aa1daa748)
  ffffff00490615f0 nfs_rw_enter_sig+0x175(ffffff0aa1daa738, 1, 4)
  ffffff0049061640 wait_for_recall1+0x5e(ffffff0aaef08500, 1, ffffff0049061920, 8)
  ffffff00490616a0 wait_for_recall+0x48(ffffff0aaef08500, 0, 1, ffffff0049061920)
  ffffff0049061760 nfs4_start_fop+0xac(ffffff0aa6275000, ffffff0aaef08500, 0, 1, ffffff0049061920, 0)
  ffffff00490619b0 nfs4read+0x101(ffffff0aaef08500, ffffff0af2c29000, 14f8f8000, 8000, ffffff4c0c6bde50, ffffffce6a67e7c8, ffffff0a00000001, 0)
  ffffff0049061a70 nfs4_bio+0xf1(ffffff4c0c6bde00, 0, ffffffce6a67e7c8, 1)
  ffffff0049061b30 nfs4_readahead+0x185(ffffff0aaef08500, 14f8f8000, e1ee000, fffffffffbc34880, ffffffce6a67e7c8)
  ffffff0049061c00 nfs4_async_common_start+0x34d(ffffffce9bdd6140, 0)
  ffffff0049061c20 nfs4_async_start+0x13(ffffffce9bdd6140)
  ffffff0049061c30 thread_start+8()
> ffffff0079061c40::findstack -v
stack pointer for thread ffffff0079061c40: ffffff0079061b20
[ ffffff0079061b20 _resume_from_idle+0xf4() ]
  ffffff0079061b50 swtch+0x141()
  ffffff0079061b90 cv_wait+0x70(ffffff43fef3a130, ffffff43fef3a150)
  ffffff0079061c20 nfs4_async_manager+0xf4(ffffffbb2b7ca440)
  ffffff0079061c30 thread_start+8()
>
Obviously, the first one is more suspicious. It is also working on the same vnode ffffff0aaef08500 with the same offset 14f8f8000. And what does nfs4_readahead() do before it calls nfs4_bio()? It allocates and locks the page(s) for the area it wants to read.
Updated by Marcel Telka over 6 years ago
- File instr-8040.patch instr-8040.patch added
- File test-8040.c test-8040.c added
Steps to reproduce the hang
To reproduce the problem you need an NFS server and one NFS client.
As a prerequisite, instrument the NFS client using the attached instr-8040.patch file to enlarge the race windows so the hang reproduces reliably.
For the reproduction we will use the test application attached as test-8040.c, which performs the sequence of operations needed to trigger the bug.
# Share a dir and create a file we will work with
SERVER# share /export; mkfile 100m /export/test; chmod 777 /export/test

# Mount the shared directory
CLIENT# mount -o vers=4 SERVER:/export /mnt

# Enable the delay loops to enlarge the race windows
CLIENT# mdb -kwe 'repro_8040/W 1'

# Create the readahead and nfs4_space threads
# Note: the following command will hang
CLIENT# ./test-8040 /mnt/test

# Make sure the grace period is over (if there is any)
CLIENT# sleep 100

# Remove the file at the server so the NFS server asks for deleg return
SERVER# rm /export/test

# Unblock the readahead thread so it can start to wait for the deleg return thread
CLIENT# mdb -kwe 'repro_8040_readahead_thread/W 0'

# Wait for a while to make sure the readahead thread is safely waiting for the deleg return
CLIENT# sleep 2

# Unblock the nfs4_space thread so it can deadlock with the other two threads
CLIENT# mdb -kwe 'repro_8040_space_thread/W 0'
Now you will find the NFS client hung:
# mdb -ke '::stacks -c page_lock_es|::thread -d'
            ADDR  DISPTIME BOUND PR SWITCH
ffffff01f7b90ba0      8a78    -1  0 t-62930
#
Updated by Marcel Telka over 6 years ago
With slightly modified instrumentation I'm able to reproduce the deadlock using an nfs4_read() thread too:
> ::stacks -m nfs|::stacks -C nfs4_inactive_thread|::stacks -C nfs4_async_manager|::stacks -C nfssys|::stacks -C nfs4_renew_lease_thread
THREAD           STATE    SOBJ                COUNT
ffffff0006b17c40 SLEEP    CV                      1
                 swtch+0x18a
                 cv_wait+0x89
                 cv_wait_sig+0x2cb
                 nfs4_attr_cache+0x148
                 nfs4delegreturn_otw+0x153
                 nfs4_do_delegreturn+0x111
                 nfs4delegreturn_impl+0x219
                 nfs4delegreturn_thread+0x22f
                 thread_start+8

ffffff020f5283c0 SLEEP    CV                      1
                 swtch+0x18a
                 cv_wait+0x89
                 page_lock_es+0x1c7
                 page_lock+0x20
                 pvn_vplist_dirty+0x244
                 nfs4_putpages+0x2dc
                 nfs4_putpage+0xbc
                 fop_putpage+0x4c
                 nfs4_flush_pages+0x30
                 nfs4_purge_caches+0x1cb
                 nfs4_attr_cache+0x32f
                 nfs4_update_attrcache+0x8e
                 nfs4setattr+0x78d
                 nfs4_space+0x118
                 fop_space+0x59
                 fcntl+0x50b
                 _sys_sysenter_post_swapgs+0x237

ffffff01f8131b60 SLEEP    CV                      1
                 swtch+0x18a
                 cv_wait_sig+0x1b5
                 nfs_rw_enter_sig+0x17f
                 wait_for_recall1+0x5e
                 wait_for_recall+0x57
                 nfs4_start_fop+0x13a
                 nfs4read+0x166
                 nfs4_bio+0x129
                 nfs4_getapage+0x30b
                 pvn_getpages+0x181
                 nfs4_getpage+0x1ae
                 fop_getpage+0x7e
                 vpm_map_pages+0x303
                 vpm_data_copy+0x8a
                 nfs4_read+0x315
                 fop_read+0x5b
                 read+0x2b8
                 read32+0x1e
                 _sys_sysenter_post_swapgs+0x237

> ::stacks -m nfs|::stacks -C nfs4_inactive_thread|::stacks -C nfs4_async_manager|::stacks -C nfssys|::stacks -C nfs4_renew_lease_thread|::thread -d
            ADDR  DISPTIME BOUND PR SWITCH
ffffff0006b17c40      4f73    -1  0 t-375593
ffffff020f5283c0      63fe    -1  0 t-370334
ffffff01f8131b60      5652    -1  0 t-373834
>
Updated by Marcel Telka over 6 years ago
- Subject changed from NFSv4 client: 3-way deadlock between nfs4_readahead(), nfs4_do_delegreturn(), and nfs4_purge_caches() to NFSv4 client: 3-way deadlock between nfs4_bio(), nfs4_do_delegreturn(), and nfs4_flush_pages()
Updated by Marcel Telka over 6 years ago
Updated by Vitaliy Gusev about 6 years ago
In short, this is a deadlock between three resources:

A: page->p_selock
B: rnode4->r_deleg_recall_lock
C: rnode4->r_serial (via rnode4->r_cv)

The threads T1 (nfs4_bio), T2 (nfs4_do_delegreturn), and T3 (nfs4_flush_pages) acquire the resources in this order:

T1: A, B
T2: B, C
T3: C, A
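The acquisition orders above form a classic circular wait. The sketch below models it (T1/T2/T3 and A/B/C as in the comment; this is an illustrative model, not kernel code): each thread holds one resource and blocks on the next, and following the "who owns what I want" edges comes back to the start after three hops:

```c
enum { RES_A, RES_B, RES_C };

/* Returns the length of the wait-for cycle starting at T1. */
static int
demo_wait_cycle_len(void)
{
	/* T1 holds A, wants B; T2 holds B, wants C; T3 holds C, wants A. */
	const int holds[3] = { RES_A, RES_B, RES_C };
	const int wants[3] = { RES_B, RES_C, RES_A };
	int owner[3];
	int t, steps;

	for (t = 0; t < 3; t++)
		owner[holds[t]] = t;

	/* Walk the wait-for chain starting at T1 (index 0). */
	t = 0;
	steps = 0;
	do {
		t = owner[wants[t]];	/* the thread owning what t wants */
		steps++;
	} while (t != 0 && steps < 3);

	/* steps == 3 means the chain loops back to T1: circular wait. */
	return (steps);
}
```

Breaking any one of the three edges (which is what the eventual fix does for the C-to-A edge, per the fix note below in spirit only; this model makes no claim about the actual patch) dissolves the cycle.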
Updated by Dan McDonald over 3 years ago
Is the following gist another example of such a deadlock? NOTE: The client is being tested by having packets get corrupted, but that shouldn't be a reason for it to lock up:
https://gist.github.com/danmcd/c0e2fdeb6ec0a1b4909277ac0b755de7
If this is a different bug, I'm happy to file it.
Updated by Marcel Telka over 3 years ago
Dan McDonald wrote:
Is the following gist another example of such a deadlock? NOTE: The client is being tested by having packets get corrupted, but that shouldn't be a reason for it to lock up:
https://gist.github.com/danmcd/c0e2fdeb6ec0a1b4909277ac0b755de7
If this is a different bug, I'm happy to file it.
Sorry, the gist does not contain enough data to confirm that. The clear sign of this bug is one hanging thread like this:
THREAD           STATE    SOBJ                COUNT
ffffff0006b17c40 SLEEP    CV                      1
                 swtch+0x18a
                 cv_wait+0x89
                 cv_wait_sig+0x2cb
                 nfs4_attr_cache+0x148
                 nfs4delegreturn_otw+0x153
                 nfs4_do_delegreturn+0x111
                 nfs4delegreturn_impl+0x219
                 nfs4delegreturn_thread+0x22f
                 thread_start+8
IOW, there must be cv_wait_sig() called from nfs4_attr_cache(), which is usually (to cause this deadlock) called asynchronously (via nfs4delegreturn_thread() in our case, and probably in all other possible cases too).
Updated by Dan McDonald over 3 years ago
So it sounds like I should file a distinct NFSv4 client bug then?
Updated by Marcel Telka over 3 years ago
Dan McDonald wrote:
So it sounds like I should file a distinct NFSv4 client bug then?
Not necessarily (yet). First check whether you have such a thread (mdb -ke "::stacks -c nfs4_attr_cache"). If you do, then it is very likely this bug; if you don't, then it is definitely not this bug.
Updated by Dan McDonald over 3 years ago
Marcel Telka wrote:
Not necessarily (yet). First check whether you have such a thread (mdb -ke "::stacks -c nfs4_attr_cache"). If you do, then it is very likely this bug; if you don't, then it is definitely not this bug.
Not it! Opening a new bug, but will link it to this for bug-chasers.
Updated by Dan McDonald over 3 years ago
- Related to Bug #12827: NFSv4 client gets hard-hung in the face of high packet corruption added
Updated by Marcel Telka over 2 years ago
- Status changed from In Progress to Pending RTI
Updated by Marcel Telka over 2 years ago
Testing
I tested the fix using the steps at #note-2 and confirmed the deadlock is no longer reproducible.
The fix has been in production for about 3 years now with no regression found.
Updated by Marcel Telka over 2 years ago
Fix
The fix basically removes the cv_wait_sig() call in nfs4_attr_cache() that causes the hang and replaces it with a cache purge.
Updated by Electric Monk over 2 years ago
- Status changed from Pending RTI to Closed
- % Done changed from 0 to 100
git commit 6dc7d05754d992040097e8ba8f85e77512125c60
commit 6dc7d05754d992040097e8ba8f85e77512125c60
Author: Marcel Telka <marcel@telka.sk>
Date:   2021-01-08T14:50:43.000Z

    8040 NFSv4 client: 3-way deadlock between nfs4_bio(), nfs4_do_delegreturn(), and nfs4_flush_pages()
    Reviewed by: Arne Jansen <arne@die-jansens.de>
    Reviewed by: Vitaliy Gusev <gusev.vitaliy@icloud.com>
    Approved by: Gordon Ross <gordon.w.ross@gmail.com>