Bug #6476
Taking exported_lock RW_READER lock in common_dispatch() can cause deadlock in nfssrv
Status: Closed
Description
RWLOCK(9F) says:
"NOTE: It is a programming error for any thread to acquire an rwlock as
RW_READER that it already holds. Doing so can deadlock the system: if
thread R acquires the lock as RW_READER, then thread W tries to acquire
the lock as a writer, W will set write-wanted and block. When R tries
to get its second read hold on the lock, it will honor the write-wanted
bit and block waiting for W; but W cannot run until R drops the lock.
Thus threads R and W deadlock. To opt out of this behavior -- that is,
to safely allow a lock to be grabbed recursively as a reader -- the
lock should be acquired as RW_READER_STARVEWRITER, which will allow R
to get its second read hold without regard for the write-wanted bit set
by W. Note that the RW_READER_STARVEWRITER behavior will starve
writers in the presence of infinite readers; it should be used with
care, and only where the default RW_READER behavior is unacceptable."
Here is an example:
> ::stacks -v -m nfssrv
mdb: stacks: processing kernel threads
mdb: stacks: 128 unique stacks / 3449 threads
mdb: stacks: done
THREAD           STATE    SOBJ                COUNT
ffffff13925ad3c0 SLEEP    RWLOCK                296
                 swtch+0x141
                 turnstile_block+0x21a
                 rw_enter_sleep+0x236
                 checkexport+0x3b
                 common_dispatch+0x2b2
                 rfs_dispatch+0x2d
                 svc_getreq+0x1c1
                 svc_run+0xe0
                 svc_do_run+0x8e
                 nfssys+0x111
                 _sys_sysenter_post_swapgs+0x153

ffffff13df1a93c0 SLEEP    RWLOCK                 16
                 swtch+0x141
                 turnstile_block+0x21a
                 rw_enter_sleep+0x236
                 checkexport+0x3b
                 nfs_vptoexi+0x168
                 nfs_getfh+0x10c
                 stubs_common_code+0x59
                 nfssys+0x3c5
                 _sys_sysenter_post_swapgs+0x153

ffffff008b9cac40 SLEEP    CV                      1
                 swtch+0x141
                 cv_wait+0x70
                 nfsauth_refresh_thread+0x21d
                 thread_start+8

ffffff13da590be0 SLEEP    RWLOCK                  1
                 swtch+0x141
                 turnstile_block+0x21a
                 rw_enter_sleep+0x19b
                 exportfs+0xedb
                 stubs_common_code+0x59
                 nfs_export+0x9e
                 zfs_ioc_share+0x43
                 zfsdev_ioctl+0x4a7
                 cdev_ioctl+0x39
                 spec_ioctl+0x60
                 fop_ioctl+0x55
                 ioctl+0x9b
                 _sys_sysenter_post_swapgs+0x153

ffffff13df08d860 SLEEP    RWLOCK                  1
                 swtch+0x141
                 turnstile_block+0x21a
                 rw_enter_sleep+0x236
                 checkexport+0x3b
                 rfs3_rename+0x117
                 common_dispatch+0x5a0
                 rfs_dispatch+0x2d
                 svc_getreq+0x1c1
                 svc_run+0xe0
                 svc_do_run+0x8e
                 nfssys+0x111
                 _sys_sysenter_post_swapgs+0x153

> exported_lock::rwlock
            ADDR      OWNER/COUNT FLAGS          WAITERS
ffffffffc01218a8        READERS=1  B011 ffffff13c4d02040 (R)
                                     || ffffff13c8bcaba0 (R)
                 WRITE_WANTED -------+| ffffff13dc737080 (R)
                  HAS_WAITERS --------+ ffffff13e19f54e0 (R)
ffffff13dd516520 (R) ffffff13c95a2b00 (R) ffffff13c925a860 (R) ffffff13d62ec080 (R) ffffff13d6260120 (R) ffffff1364d3a020 (R) ffffff13925a4100 (R) ffffff13d4a8a480 (R) ffffff13df0b0740 (R) ffffff13a9e33ae0 (R) ffffff13d468d0c0 (R) ffffff13c8e3f3a0 (R) ffffff13c8c0ec20 (R) ffffff13d4b30020 (R) ffffff13c94aa440 (R) ffffff13df13e7e0 (R) ffffff13d62e80a0 (R) ffffff13db1ed480 (R) ffffff1408d07c20 (R) ffffff13c4d48100 (R) ffffff13d62483a0 (R) ffffff13e68637c0 (R) ffffff13e199a4a0 (R) ffffff13df0503e0 (R) ffffff13da660b00 (R) ffffff13d4b37520 (R) ffffff13c93363c0 (R)
ffffff13d62f3b40 (R) ffffff13da665740 (R) ffffff1408cce840 (R) ffffff13e6863420 (R) ffffff13df086140 (R) ffffff13c9435060 (R) ffffff1408cdeb40 (R) ffffff13d4a8c460 (R) ffffff13d5eab860 (R) ffffff1391f9cc20 (R) ffffff13d485c740 (R) ffffff13c8c21bc0 (R) ffffff13df1b8c40 (R) ffffff13e684c800 (R) ffffff13e687d120 (R) ffffff13dbb60160 (R) ffffff13d61c3020 (R) ffffff13d4876860 (R) ffffff13c925c100 (R) ffffff13c8c0e880 (R) ffffff13dc7a6480 (R) ffffff13db1d17a0 (R) ffffff13df098820 (R) ffffff13c59194a0 (R) ffffff13d46c7440 (R) ffffff1408ca44e0 (R) ffffff13d62e8440 (R) ffffff13c0d26ae0 (R) ffffff13df09c080 (R) ffffff13dc724c60 (R) ffffff13dd518c40 (R) ffffff13d62ec420 (R) ffffff13dbcdeb00 (R) ffffff13da668180 (R) ffffff13df0b7160 (R) ffffff13c9327b80 (R) ffffff13e68d7b00 (R) ffffff13e5b3a840 (R) ffffff13d4a8db80 (R) ffffff1364d1dc00 (R) ffffff13e26d9520 (R) ffffff13d48b3440 (R) ffffff13c8c210e0 (R) ffffff13e26bbbc0 (R) ffffff13c58ca060 (R) ffffff13df1ec880 (R) ffffff13d4b7eb80 (R) ffffff1408ccebe0 (R) ffffff13df13e0a0 (R) ffffff139246ac40 (R) ffffff13df091100 (R) ffffff13d4b78460 (R) ffffff13d491e120 (R) ffffff13dc7a9460 (R) ffffff137dda3520 (R) ffffff13e26a1c40 (R) ffffff13c9445760 (R) ffffff13e100f000 (R) ffffff13c4d48be0 (R) ffffff13c8e77b20 (R) ffffff13d6246020 (R) ffffff13df09cb60 (R) ffffff13a9de64c0 (R) ffffff13c9336760 (R) ffffff13d490c160 (R) ffffff13c4d023e0 (R) ffffff13c0d5c0e0 (R) ffffff13df0a9b20 (R) ffffff13c4d02b20 (R) ffffff13c8c0b500 (R) ffffff13e26a5860 (R) ffffff13c9312880 (R) ffffff137d65f460 (R) ffffff13d62e8b80 (R) ffffff13d4910c20 (R) ffffff13c9319100 (R) ffffff13a9e2b760 (R) ffffff13d47ed8c0 (R) ffffff13dc671880 (R) ffffff13c8e77040 (R) ffffff13df0e04c0 (R) ffffff13da6294c0 (R) ffffff13d62f9780 (R) ffffff13c9d8f440 (R) ffffff13d4b2a7a0 (R) ffffff13dc7a60e0 (R) ffffff13c94473a0 (R) ffffff13e5b2d500 (R) ffffff13c577b3a0 (R) ffffff13801d74a0 (R) ffffff13df14c040 (R) ffffff13c4d61160 (R) ffffff13df1bbc20 (R) ffffff13dd507020 (R) ffffff13df1fc0e0 (R) 
ffffff13d6248ae0 (R) ffffff13df0b0ae0 (R) ffffff13e569e840 (R) ffffff13c4b66480 (R) ffffff13925b4140 (R) ffffff13dd533420 (R) ffffff13db1c7440 (R) ffffff13e69ac100 (R) ffffff13dbcde760 (R) ffffff13c4d56020 (R) ffffff1391fa47c0 (R) ffffff13e101ac00 (R) ffffff13c4cfe7c0 (R) ffffff13dd50f3a0 (R) ffffff13d622c100 (R) ffffff13d48754e0 (R) ffffff13a9df0c40 (R) ffffff13e687d4c0 (R) ffffff13db1ccb60 (R) ffffff13df212060 (R) ffffff13c95c0440 (R) ffffff13d61ff0c0 (R) ffffff13d4b303c0 (R) ffffff13c4d68140 (R) ffffff1391f91b40 (R) ffffff1364cee480 (R) ffffff13d48b9b40 (R) ffffff13dc713020 (R) ffffff13db1edbc0 (R) ffffff13e5b2c180 (R) ffffff13df09b0c0 (R) ffffff13e19e6400 (R) ffffff13d61ff800 (R) ffffff13dd53cb40 (R) ffffff13c4b634c0 (R) ffffff13c4b61c20 (R) ffffff13c4cfbb80 (R) ffffff13c58ca400 (R) ffffff13c9315120 (R) ffffff13d62117c0 (R) ffffff1391f8ab80 (R) ffffff13c9c11400 (R) ffffff13e19b3040 (R) ffffff13d62f37a0 (R) ffffff13c574bb00 (R) ffffff13df0ee480 (R) ffffff13ab04d000 (R) ffffff13dbb76120 (R) ffffff13e19e4b60 (R) ffffff13c58d9780 (R) ffffff13df1df3a0 (R) ffffff13c577b740 (R) ffffff13df14c780 (R) ffffff1391fa3b80 (R) ffffff13a9db0100 (R) ffffff13e101a4c0 (R) ffffff13e68627e0 (R) ffffff13c4d5a8c0 (R) ffffff13c4d48840 (R) ffffff13dd50f740 (R) ffffff13df0e0860 (R) ffffff13c0d2ac20 (R) ffffff1408cdcb60 (R) ffffff1408cce100 (R) ffffff13e26c77e0 (R) ffffff13d622c4a0 (R) ffffff13ab040020 (R) ffffff137d87d0e0 (R) ffffff13d4910880 (R) ffffff13d47cebe0 (R) ffffff13c0d29160 (R) ffffff13c9336020 (R) ffffff13c8bd8b60 (R) ffffff13d487bbc0 (R) ffffff13d62010a0 (R) ffffff137d65f800 (R) ffffff13dc713b00 (R) ffffff13e100f3a0 (R) ffffff13df199ba0 (R) ffffff13c4b61880 (R) ffffff13c4b64be0 (R) ffffff13aad443e0 (R) ffffff13df1990c0 (R) ffffff13801de7c0 (R) ffffff137d87d820 (R) ffffff13df00c520 (R) ffffff13d47c7c20 (R) ffffff13e68734e0 (R) ffffff13c9547140 (R) ffffff13d4853b00 (R) ffffff13da671140 (R) ffffff13e19a6b40 (R) ffffff1391fa6400 (R) ffffff13d484b400 (R) ffffff13d47ce840 (R) 
ffffff13c4b63120 (R) ffffff13dc600400 (R) ffffff13d6256c40 (R) ffffff13c4b63860 (R) ffffff13df2133e0 (R) ffffff13c58f60a0 (R) ffffff13c4b608a0 (R) ffffff13c4b6a7e0 (R) ffffff13c58e3820 (R) ffffff13e1017c20 (R) ffffff13df09bba0 (R) ffffff13dc51fb20 (R) ffffff137d69ebe0 (R) ffffff13c9599b60 (R) ffffff13d6246b00 (R) ffffff13e26d4000 (R) ffffff13df13eb80 (R) ffffff13c0d5a120 (R) ffffff13c4bf67e0 (R) ffffff13e19eb020 (R) ffffff13c4b64100 (R) ffffff13924847c0 (R) ffffff13c4bbb740 (R) ffffff13dc7a6820 (R) ffffff13dbb60500 (R) ffffff13e0ffb760 (R) ffffff13dc7133c0 (R) ffffff13c4b6a0a0 (R) ffffff13d95fa7e0 (R) ffffff13c4b5f180 (R) ffffff13d48c7c60 (R) ffffff13e687d860 (R) ffffff13d4888460 (R) ffffff13c4b61140 (R) ffffff13e6873c20 (R) ffffff13df086880 (R) ffffff13c4b69ba0 (R) ffffff13c4b6a440 (R) ffffff13a9de6860 (R) ffffff13e68cc3e0 (R) ffffff1391f8a7e0 (R) ffffff13c4b66bc0 (R) ffffff13c4b60500 (R) ffffff13e4a5c160 (R) ffffff13e4a60c20 (R) ffffff13c47f8760 (R) ffffff13e5b467e0 (R) ffffff13df08d860 (R) ffffff13d490c500 (R) ffffff1408d0a120 (R) ffffff13c8c0bc40 (R) ffffff13c0d5bbe0 (R) ffffff13c4cfb7e0 (R) ffffff13db1f2460 (R) ffffff13925a4be0 (R) ffffff13c932fb20 (R) ffffff13df17e740 (R) ffffff13d4b3e4e0 (R) ffffff13c9312c20 (R) ffffff13df08d4c0 (R) ffffff13e19b33e0 (R) ffffff13c58e30e0 (R) ffffff13c9329080 (R) ffffff13c926cba0 (R) ffffff13c9c117a0 (R) ffffff13925044a0 (R) ffffff13925ae000 (R) ffffff13925a7ba0 (R) ffffff13925acb20 (R) ffffff137ff4d140 (R) ffffff1364ce3be0 (R) ffffff1391fa37e0 (R) ffffff137d1a8ae0 (R) ffffff134e0cc4c0 (R) ffffff13925ac040 (R) ffffff13925ad3c0 (R) ffffff13c4ba1420 (R) ffffff13da590840 (R) ffffff13e19eb3c0 (R) ffffff13e69b80c0 (R) ffffff13e5b20760 (R) ffffff13df1f2100 (R) ffffff13df0a93e0 (R) ffffff13d4910140 (R) ffffff13925004c0 (R) ffffff13c58e90c0 (R) ffffff137519b0e0 (R) ffffff13d4cbec60 (R) ffffff1392504100 (R) ffffff13df0e0120 (R) ffffff13df0ea4a0 (R) ffffff13df1a93c0 (R) ffffff13da590be0 (W)
The following thread takes exported_lock as a reader in common_dispatch() (http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/nfs/nfs_server.c#1668), and then checkexport() tries to take the same lock as a reader again. That second rw_enter() blocks because thread ffffff13da590be0 is already waiting in exportfs() to take the lock as a writer.
ffffff13df08d860 SLEEP    RWLOCK                  1
                 swtch+0x141
                 turnstile_block+0x21a
                 rw_enter_sleep+0x236
                 checkexport+0x3b
                 rfs3_rename+0x117
                 common_dispatch+0x5a0
                 rfs_dispatch+0x2d
                 svc_getreq+0x1c1
                 svc_run+0xe0
                 svc_do_run+0x8e
                 nfssys+0x111
                 _sys_sysenter_post_swapgs+0x153
It seems this issue was introduced by https://www.illumos.org/issues/6090.
Related issues
Updated by Marcel Telka about 8 years ago
- Related to Bug #6090: IOPS, bandwidth, and latency kstats for NFS server added
Updated by Youzhong Yang almost 8 years ago
Hi Marcel,
Did you get a chance to look at it? We hit this issue frequently, so I tried the following fix and it seems to work.
diff --git a/usr/src/uts/common/fs/nfs/nfs_server.c b/usr/src/uts/common/fs/nfs/nfs_server.c
index 5d2efc7..781a7a6 100644
--- a/usr/src/uts/common/fs/nfs/nfs_server.c
+++ b/usr/src/uts/common/fs/nfs/nfs_server.c
@@ -1688,6 +1688,7 @@ common_dispatch(struct svc_req *req, SVCXPRT *xprt, rpcvers_t min_vers,
 			mutex_enter(exi_ksp->ks_lock);
 			kstat_runq_enter(KSTAT_IO_PTR(exi_ksp));
 			mutex_exit(exi_ksp->ks_lock);
+			rw_exit(&exported_lock);
 		} else {
 			rw_exit(&exported_lock);
 		}
@@ -1896,6 +1897,7 @@ done:
 	}
 
 	if (exi_ksp != NULL) {
+		rw_enter(&exported_lock, RW_READER);
 		mutex_enter(exi_ksp->ks_lock);
 		KSTAT_IO_PTR(exi_ksp)->nwritten += pos;
 		KSTAT_IO_PTR(exi_ksp)->writes++;
Does it look good to you?
Thanks,
Updated by Marcel Telka almost 8 years ago
With the current design, holding exported_lock is essential to make sure the kstats (in exi_ksp) are not altered in the middle. I'm working on a significant enhancement and a complete redesign of the kstats implementation I introduced in #6090, so that exported_lock won't be needed to guarantee consistency.
Updated by Marcel Telka almost 8 years ago
- Status changed from New to In Progress
Updated by Marcel Telka almost 8 years ago
- Related to Bug #6696: Per-client NFS server IOPS, bandwidth, and latency kstats added
Updated by Marcel Telka about 7 years ago
- Status changed from In Progress to Feedback
This bug was introduced by #6090, which was recently backed out.
Updated by Marcel Telka about 7 years ago
- Status changed from Feedback to Closed