Bug #12681
Excess service times for NFS(v3) locking calls
Added by Gordon Ross 9 months ago. Updated 8 months ago.
Description
A customer reports slow access from Linux clients running Apache services.
Examination of network captures on both sides shows that "call back" attempts all fail
(where the server tries to do an RPC bind and open a connection back to the client).
It turns out there's a firewall preventing calls back to the client. However:
The customer insists this all works with (other vendor's) server.
Updated by Gordon Ross 9 months ago
Additional analysis of this problem (where we have a firewall blocking call-backs):
The NLM server gets into an unfortunate state with this configuration, apparently hung.
We forced a crash dump and found that we have 511 threads all blocked like this:
> ::stacks -m rpcmod
THREAD           STATE    SOBJ                COUNT
ffffd09c24848820 SLEEP    CV                    511
                 swtch+0x141
                 cv_timedwait_sig_hires+0x39d
                 cv_timedwait_sig+0x49
                 wait_for_lock+0x10c
                 flk_wait_execute_request+0x230
                 flk_process_request+0x4ed
                 reclock+0x5dd
                 fs_frlock+0x187
                 zfs_frlock+0xf8
                 fop_frlock+0x65
                 nlm_vop_frlock+0x51
                 nlm_block+0xeb
                 nlm_do_lock+0x2f5
                 nlm4_lock_4_svc+0x2a
                 nlm_dispatch+0xfe
                 nlm_prog_4+0x34
                 svc_getreq+0x1c1
                 svc_run+0x198
                 svc_do_run+0x8e
                 nfssys+0x111
                 _sys_sysenter_post_swapgs+0x149
Let's have a look at maxthreads...
> ffffd09c24848820 ::findstack -v
stack pointer for thread ffffd09c24848820: ffffd002e3781260
[ ffffd002e3781260 _resume_from_idle+0x112() ]
  ffffd002e3781290 swtch+0x141()
  ffffd002e3781330 cv_timedwait_sig_hires+0x39d(ffffd0c732c343d4, ffffd0940feccc80, 34630b8a000, 989680, 0)
  ffffd002e3781380 cv_timedwait_sig+0x49(ffffd0c732c343d4, ffffd0940feccc80, 11e88ee)
  ffffd002e37813c0 wait_for_lock+0x10c(ffffd0c732c34300)
  ffffd002e3781420 flk_wait_execute_request+0x230(ffffd0c732c34300)
  ffffd002e37814a0 flk_process_request+0x4ed(ffffd0c732c34300)
  ffffd002e3781630 reclock+0x5dd(ffffd09de5360580, ffffd002e3781a00, e, 10003, 0, ffffd002e3781650)
  ffffd002e37816d0 fs_frlock+0x187(ffffd09de5360580, 7, ffffd002e3781a00, 10003, 0, ffffd002e37818f0, ffffd09cfa33c760, 0)
  ffffd002e3781770 zfs_frlock+0xf8(ffffd09de5360580, 7, ffffd002e3781a00, 10003, 0, ffffd002e37818f0, ffffd09cfa33c760, 0)
  ffffd002e3781810 fop_frlock+0x65(ffffd09de5360580, 7, ffffd002e3781a00, 10003, 0, ffffd002e37818f0, ffffd09cfa33c760, 0)
  ffffd002e3781860 nlm_vop_frlock+0x51(ffffd09de5360580, 7, ffffd002e3781a00, 10003, 0, ffffd002e37818f0, ffffd09cfa33c760, 0)
  ffffd002e37819a0 nlm_block+0xeb(ffffd002e3781b60, ffffd09cb1d006b8, ffffd0a192ea4b10, ffffd09c39f5ddc8, ffffd002e3781a00, fffffffff868c330)
  ffffd002e3781ac0 nlm_do_lock+0x2f5(ffffd002e3781b60, ffffd002e3781b10, ffffd002e3781ca0, fffffffff868c300, 0, fffffffff868c330)
  ffffd002e3781af0 nlm4_lock_4_svc+0x2a(ffffd002e3781b60, ffffd002e3781b10, ffffd002e3781ca0)
  ffffd002e3781c20 nlm_dispatch+0xfe(ffffd002e3781ca0, ffffd0a146edea00, fffffffff8693610)
  ffffd002e3781c40 nlm_prog_4+0x34(ffffd002e3781ca0, ffffd0a146edea00)
  ffffd002e3781d20 svc_getreq+0x1c1(ffffd0a146edea00, ffffd0aaa5f29f00)
  ffffd002e3781d90 svc_run+0x198(ffffd0a02e2b0750)
  ffffd002e3781dd0 svc_do_run+0x8e(2)
  ffffd002e3781ec0 nfssys+0x111(e, febf0fbc)
  ffffd002e3781f10 _sys_sysenter_post_swapgs+0x149()

> ffffd0a02e2b0750 ::print struct __svcpool
{
    p_next = 0
    p_prev = 0xffffd09c425573f8
    p_id = 0x2
    p_threads = 0x2
    p_detached_threads = 0x1ff
    p_maxthreads = 0x200
    p_redline = 0x1
    p_reserved_threads = 0
    p_thread_lock = { _opaque = [ 0 ] }
    p_asleep = 0
    p_drowsy = 0
    p_req_cv = { _opaque = 0 }
    p_timeout = 0x1f4
    p_req_lock = { _opaque = [ 0 ] }
    p_reqs = 0x300e97
    p_walkers = 0
    p_max_same_xprt = 0x8
    p_stksize = 0
    p_closing = 0
    p_creator_signaled = 0
    p_creator_exit = 0
    p_creator_cv = { _opaque = 0x1 }
    p_creator_lock = { _opaque = [ 0 ] }
    p_lcount = 0x1a6
    p_lhead = 0xffffd0a4d8f831f0
    p_lrwlock = {
Yup: with p_maxthreads = 0x200 and those 511 threads blocked, we're simply out of RPC threads in the lockd service!
We're also blocking against locks held by clients that we've been unable to notify (via the lock call-back) that they were granted the lock they asked for.
Updated by Gordon Ross 9 months ago
It turns out the dump shown above was misleading: it shows the state of things after the call-back attempts had already failed. That activity will eventually all time out, but that takes a while.
We had some other dumps (sorry, don't have the stacks handy now) where we can see that all the NLM RPC service threads are blocked in rpcbind_getaddr calls. Looking into that some more, I see that we attempt the rpcbind_getaddr call earlier than we need to, and do so even when we will not use the handle. This is part of what's causing us to run out of threads: at the point where we block trying to set up this handle, the RPC service thread has not yet "detached" itself from the RPC service thread pool. RPC threads are not supposed to block (for a long time) until they have detached, which is why we run out of threads.
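To make that concrete, here is a toy userland model of the starvation. This is not illumos code; the pool size, helper names, and sleep are made up for illustration. Every request takes a pool slot and blocks in its "call-back setup" before giving the slot back, so the pool empties out.

/*
 * Toy model: a fixed-size "service thread pool" where each request
 * blocks on a slow call-back setup while still holding its slot.
 * (Hypothetical names; nothing here is the real svcpool code.)
 */
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <unistd.h>

#define	POOL_MAX	4		/* stands in for p_maxthreads */

static sem_t pool_slots;		/* free "service threads" */

/* Stand-in for a call-back handle setup that a firewall blackholes. */
static void
slow_callback_setup(void)
{
	sleep(5);			/* pretend rpcbind lookup is timing out */
}

/* Old behavior: block while still occupying a pool slot. */
static void *
request_old(void *arg)
{
	(void) arg;
	sem_wait(&pool_slots);		/* take a service thread */
	slow_callback_setup();		/* block while still "attached" */
	sem_post(&pool_slots);		/* only now is the slot free again */
	return (NULL);
}

int
main(void)
{
	pthread_t t[POOL_MAX];
	int free_slots;

	sem_init(&pool_slots, 0, POOL_MAX);

	/* POOL_MAX stuck requests leave no slot for the next caller. */
	for (int i = 0; i < POOL_MAX; i++)
		pthread_create(&t[i], NULL, request_old, NULL);
	sleep(1);

	sem_getvalue(&pool_slots, &free_slots);
	printf("free service threads: %d (pool exhausted)\n", free_slots);

	for (int i = 0; i < POOL_MAX; i++)
		pthread_join(t[i], NULL);
	sem_destroy(&pool_slots);
	return (0);
}

Run it and the free-slot count drops to 0, which is the small-scale version of the lockd pool (0x200 threads) filling up with threads stuck behind the firewall.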
I'll move that RPC client handle setup (which does the rpcbind_getaddr call) later, doing it only when we know we need to call back the client (i.e. for a lock call that must block).
This should allow non-contested lock calls to proceed without attempting to setup the callback handle.
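Roughly, the reordering looks like this. It is a standalone sketch, not the actual nlm_do_lock()/klm code; lock_req_t, local_lock_try(), svcpool_detach_self(), and callback_handle_get() are hypothetical stand-ins.

/*
 * Standalone sketch of the reordering described above.  Everything here
 * is a hypothetical stand-in, not the real illumos NLM/RPC interfaces.
 */
#include <stdbool.h>
#include <stdio.h>

typedef struct lock_req {
	bool		block;		/* client asked for a blocking lock */
	const char	*client;	/* client to call back when granted */
} lock_req_t;

/* Stubs standing in for the real work. */
static bool local_lock_try(lock_req_t *r) { (void) r; return (true); }
static void svcpool_detach_self(void) { }	/* leave the RPC thread pool */
static int callback_handle_get(const char *c) { (void) c; return (0); }

/*
 * Old ordering (simplified): the call-back handle was created before we
 * knew whether it would ever be used, and before detaching, so a
 * firewalled rpcbind lookup pins a pool thread for a long time.
 */
static void
do_lock_old(lock_req_t *r)
{
	int h = callback_handle_get(r->client);	/* may stall right here */

	(void) h;
	if (local_lock_try(r))
		return;				/* handle was never needed */
	svcpool_detach_self();
	/* wait for the lock, then call the client back ... */
}

/*
 * New ordering (simplified): try the lock first; only a contested,
 * blocking request detaches from the pool and then creates the
 * call-back handle it actually needs.
 */
static void
do_lock_new(lock_req_t *r)
{
	if (local_lock_try(r))
		return;				/* fast path: no call-back setup */
	if (!r->block)
		return;				/* non-blocking: just report denial */
	svcpool_detach_self();			/* safe to block from here on */
	int h = callback_handle_get(r->client);	/* may stall, but we're detached */

	(void) h;
	/* wait for the lock, then call the client back ... */
}

int
main(void)
{
	lock_req_t r = { .block = true, .client = "linux-client" };

	do_lock_old(&r);
	do_lock_new(&r);
	printf("sketch only; see the review below for the real change\n");
	return (0);
}

The point is that the potentially slow rpcbind lookup now happens only on the contested, blocking path, and only after the thread has left the service pool, so uncontested lock calls never pay for it.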
There were also problems with reporting errors when a call-back fails, and with cleaning up state afterward.
This also turns out to dramatically improve locking performance over NFS when the lock request is non-contested.
Out for review:
https://illumos.org/rb/r/2545/
Updated by Gordon Ross 9 months ago
Testing: Ran a Linux NFS client from behind a firewall that prevents any call-back RPC.
Verified uncontested locks can be taken without call-back RPC (these used to fail).
Verified the call-back failures are reported, and lock state is cleaned up when that happens.
General regression tests via the Connectathon NFS lock tests.
Updated by Dan McDonald 9 months ago
At Joyent we had been seeing unusual NFS problems with an LX NFS client (using the SmartOS native-assistant lx_lockd, but the actual Linux rpcbind and/or rpc.statd). Both problems seem to dissipate with the combination of a SmartOS lx-brand fix and this fix on the server end.
Updated by Electric Monk 8 months ago
- Status changed from Pending RTI to Closed
- % Done changed from 0 to 100
git commit 54b3584e66419f0f60689044f13d702660b4233f
commit 54b3584e66419f0f60689044f13d702660b4233f
Author: Gordon Ross <gwr@nexenta.com>
Date:   2020-05-09T23:08:44.000Z

    12681 Excess service times for NFS(v3) locking calls
    Reviewed by: Evan Layton <evan.layton@nexenta.com>
    Reviewed by: Dan McDonald <danmcd@joyent.com>
    Approved by: Richard Lowe <richlowe@richlowe.net>