Bug #6225


NFSv4: setlock() can spin forever

Added by Marcel Telka about 7 years ago. Updated almost 7 years ago.

nfs - NFS server and client
Start date:
Due date:
% Done:


Estimated time:
Gerrit CR:
External Bug:


Here is the simplified setlock() implementation:

        int error;

        for (...) {
                error = VOP_FRLOCK(F_SETLK, ...);

        if (error) {
                VOP_FRLOCK(F_GETLK, ...);
                if (F_UNLCK)
                        goto retry;

        return (error);

The setlock() just tries to get the lock in the for loop. If it is not successful it tries to find who is preventing us to get the lock. If it finds that there is no pending lock, it retries (goto retry). If we are extremely unlucky, some other process might always hold a lock when we do VOP_FRLOCK(F_SETLK), and it might always release such a lock just before we do VOP_FRLOCK(F_GETLK). This would cause the setlock() to spin forever. We should make sure this extremely unlucky scenario cannot happen.

In addition to the extremely unlucky (but possible) scenario described above the setlock() spin could be reproduced far easier:

The current core file locking implementation is write-preferring. This means that if there is a read lock active (process 1) and one other write lock is waiting (process 2) for the read lock, the additional read lock request (process 3) will fail (or block). In a case the process 3 will try to get the information about the lock preventing us from getting the read lock (using F_GETLK), the F_UNLCK is returned. This will cause that setlock() to spin forever too. Please see Bug #5957 for discussion about write-preferring vs. read-preferring locking and for an example about the described behavior.

There are several problems caused by the spinning setlock():

  1. The setlock() thread holds the exported_lock as reader - see rfs4_compound(). This means that during the setlock() spin all attempts to acquire the exported_lock as writer will block and wait until the setlock() thread stops to spin. For example every attempt to share or unshare a filesystem will hang.
  2. The NFSv4 client won't see the reply to its NFSv4 lock call (since the particular thread responsible for sending the reply is spinning in setlock()).

Steps to reproduce

You need one NFS server with a shared filesystem (let say /export) and one NFS client with the mounted share from the NFS server using the NFSv4 protocol (mount -o vers=4 SERVER:/export /mnt).

You also need to complie the attached lock.c source file in the /export directory:

# gcc -Wall -o lock lock.c

Do this locally on the NFS server in the /export directory:

# ./lock r file &
[1] 718
# sleep 2
# ./lock W file

Do this on the NFS client in the /mnt directory:

# ./lock R file 

Now we can find a thread on the NFS server cycling in setlock():

> ::stacks -c setlock
THREAD           STATE    SOBJ                COUNT
ffffff00d0fb38a0 SLEEP    CV                      1


The thread calls in a cycle VOP_FRLOCK() with F_SETLK (or F_SETLK_NBMAND) and F_GETLK. The VOP_FRLOCK(F_SETLK) will always fail, but the VOP_FRLOCK(F_GETLK) will always succeed and return F_UNLCK. This will cause the infinite spin in setlock().

Now, when you'll try to unshare any exported filesystem, the unshare command will hang with this stack:

> ffffff00c8dd97a0::findstack -v
stack pointer for thread ffffff00c8dd97a0: ffffff0002ebcc80
[ ffffff0002ebcc80 _resume_from_idle+0xf4() ]
  ffffff0002ebccb0 swtch+0x141()
  ffffff0002ebcd50 turnstile_block+0x21a(0, 0, ffffffffc01afc88, fffffffffbc08bc0, 0, 0)
  ffffff0002ebcdc0 rw_enter_sleep+0x19b(ffffffffc01afc88, 0)
  ffffff0002ebd270 unexport+0x28(ffffff00d2943340)
  ffffff0002ebdd40 exportfs+0x25f(ffffff0002ebdda0, 100000, ffffff00d0d72470)
  ffffff0002ebdd80 stubs_common_code+0x51()
  ffffff0002ebddd0 nfs_export+0x9e(8046048)
  ffffff0002ebdec0 nfssys+0x388(2, 8046048)
  ffffff0002ebdf10 _sys_sysenter_post_swapgs+0x149()


lock.c (1.41 KB) lock.c Marcel Telka, 2015-09-12 08:48 AM

Related issues

Related to illumos gate - Bug #5957: File locking should be read-preferringNew2015-05-26

Related to illumos gate - Bug #6253: F_GETLK doesn't always return lock ownerClosedArne Jansen2015-09-21

Actions #1

Updated by Marcel Telka about 7 years ago

  • Related to Bug #5957: File locking should be read-preferring added
Actions #3

Updated by Marcel Telka about 7 years ago

The solution

The NFSv4.0 LOCK operation does not support the blocking lock (at the NFSv4.0
protocol level) so the client needs to resend the LOCK request in a case the
lock is denied by the NFSv4 server. NFSv4.0 clients are prepared for that
(obviously); they are sending the LOCK requests with some delays between the
attempts. See nfs4frlock() and nfs4_block_and_wait() for the locking and delay
implementation at the client side.

To make the life of the clients easier, the illumos NFSv4 server tries to do
some fast retries on its own (the for loop in setlock()) in a hope the lock will be
available soon. And if not, the client won't need to resend the LOCK requests
so fast to check the lock availability. This basically saves some network
traffic and tries to make sure the client gets the lock ASAP.

So the for loop in setlock() makes perfect sense and it is good to have, but it
is completely wrong to allow the setlock() spin forever (goto retry).

The solution is easy: Having the above in mind, make sure the setlock() won't
spin forever, so the "goto retry" is going to be removed.

Actions #4

Updated by Marcel Telka about 7 years ago

  • Status changed from In Progress to Pending RTI
Actions #5

Updated by Arne Jansen about 7 years ago

  • Related to Bug #6253: F_GETLK doesn't always return lock owner added
Actions #6

Updated by Marcel Telka about 7 years ago

Arne Jansen and Gordon Ross requested to preserve the retry loop and limit it to 10 instead. This is the explanation (written by Gordon):

There's a race inherent in the current VOP_FRLOCK design where:
a: "other guy" takes a lock that conflicts with a lock we want
b: we attempt to take our lock (non-blocking) and the attempt fails.
c: "other guy" releases the conflicting lock
d: we ask what lock conflicts with the lock we want, getting F_UNLCK
(no lock blocks us)

If we retry the non-blocking lock attempt in this case (restart at
step 'b') there's some possibility that many such attempts might fail.
However a test designed to actually provoke this race shows that the
vast majority of cases require no retry, and only a few took as many
as three retries.  Here's the test outcome:

        number of retries    how many times we needed that many retries
        0                    79461
        1                      862
        2                       49
        3                        5

Given those empirical results, we arbitrarily limit the retry count to ten.

If we actually make to ten retries and give up, nothing catastrophic
happens, but we're unable to return the information about the
conflicting lock to the NFS client.  That's an acceptable trade off
vs. letting this retry loop run forever.
Actions #7

Updated by Electric Monk almost 7 years ago

  • Status changed from Pending RTI to Closed
  • % Done changed from 0 to 100

git commit 55a4551ddde8feed8f8e011458f992ea74257bd9

commit  55a4551ddde8feed8f8e011458f992ea74257bd9
Author: Marcel Telka <>
Date:   2015-10-07T18:30:42.000Z

    6225 NFSv4: setlock() can spin forever
    Reviewed by: Josef 'Jeff' Sipek <>
    Reviewed by: Gordon Ross <>
    Reviewed by: Garrett D'Amore <>
    Reviewed by: Robert Mustacchi <>
    Reviewed by: Arne Jansen <>
    Approved by: Dan McDonald <>


Also available in: Atom PDF