Bug #4965

NLM: Mac OS X client can get stuck locking an NFS file on an illumos server (SmartOS, Nexenta, OmniOS, etc.) running open source lockd

Added by Youzhong Yang over 6 years ago. Updated almost 6 years ago.

nfs - NFS server and client

Here is how to reproduce the issue:

Suppose a Mac host has write access to a file /x/y/z located on an illumos NFS server.

- Run the following command to lock and unlock the file; it succeeds:

  env FILE=/x/y/z perl -e 'use Fcntl qw(:DEFAULT :flock O_WRONLY LOCK_EX LOCK_UN); my ($fh, $start, $elapsed); if(sysopen ($fh, $ENV{"FILE"}, O_WRONLY)){$start = time; flock($fh, LOCK_EX); flock($fh, LOCK_UN);close $fh; $elapsed = time - $start; print "$elapsed seconds\n";} else {print "ERROR: Could not open file: $!\n";}'

- Reboot the Mac host
- Run the above command again; it gets stuck and keeps printing 'lockd not responding' messages:
  nfs server nfs_server_name:nfs_share_path: lockd not responding
  nfs server nfs_server_name:nfs_share_path: lockd not responding

It never ends and you cannot even kill the perl command.


Updated by Youzhong Yang over 6 years ago

Don't know what happened to the bug description. Here is the perl command:

env FILE=/x/y/z perl -e 'use Fcntl qw(:DEFAULT :flock O_WRONLY LOCK_EX LOCK_UN); my ($fh, $start, $elapsed); if(sysopen ($fh, $ENV{"FILE"}, O_WRONLY)){$start = time; flock($fh, LOCK_EX); flock($fh, LOCK_UN);close $fh; $elapsed = time - $start; print "$elapsed seconds\n";} else {print "ERROR: Could not open file: $!\n";}'


Updated by Youzhong Yang over 6 years ago

Issue analysis:

On Mac OS X, lockd uses UDP by default and its UDP port is not fixed, so the OS may assign a different available port each time the machine reboots.

In usr/src/uts/common/klm/nlm_rpc_handle.c, refresh_nlm_rpc() uses nlm_null_rpc() to check whether the client host's RPC binding is still fresh:
static int
refresh_nlm_rpc(struct nlm_host *hostp, nlm_rpc_t *rpcp)
{
        ...
        enum clnt_stat stat;
        ...
        stat = nlm_null_rpc(rpcp->nr_handle, rpcp->nr_vers);
        if (NLM_STALE_CLNT(stat)) {
                ret = ESTALE;
        }
        ...
}

The issue occurs after a Mac host reboots and its lockd gets a different UDP port: NLM on the server still tries to send the NULL RPC to the old port.

nlm_null_rpc() returns RPC_TIMEDOUT over UDP when the port is not reachable, but NLM_STALE_CLNT, defined as follows in usr/src/uts/common/klm/nlm_rpc_handle.c, does not include RPC_TIMEDOUT:
#define    NLM_STALE_CLNT(_status)            \
    ((_status) == RPC_PROGUNAVAIL ||    \
    (_status) == RPC_PROGVERSMISMATCH ||    \
    (_status) == RPC_PROCUNAVAIL ||        \
    (_status) == RPC_CANTCONNECT ||        \
    (_status) == RPC_XPRTFAILED)

So refresh_nlm_rpc() never gets a chance to return ESTALE to its caller nlm_host_get_rpc().
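
The effect of that status list can be reproduced in user space. The sketch below is not the kernel code: the enum demo_stat and the function demo_is_stale() are hypothetical stand-ins, and the numeric values are illustrative rather than those of the real enum clnt_stat. It only shows that a timeout falls through the unpatched list.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the RPC status codes named above.
 * The numeric values are illustrative, not the real clnt_stat values.
 */
enum demo_stat {
	DEMO_SUCCESS,
	DEMO_TIMEDOUT,
	DEMO_PROGUNAVAIL,
	DEMO_PROGVERSMISMATCH,
	DEMO_PROCUNAVAIL,
	DEMO_CANTCONNECT,
	DEMO_XPRTFAILED
};

/* Same membership test as the unpatched NLM_STALE_CLNT list. */
bool
demo_is_stale(enum demo_stat s)
{
	return (s == DEMO_PROGUNAVAIL ||
	    s == DEMO_PROGVERSMISMATCH ||
	    s == DEMO_PROCUNAVAIL ||
	    s == DEMO_CANTCONNECT ||
	    s == DEMO_XPRTFAILED);
}
```

Here demo_is_stale(DEMO_TIMEDOUT) is false while demo_is_stale(DEMO_CANTCONNECT) is true, so a caller modeled on refresh_nlm_rpc() would not report ESTALE for a UDP timeout.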

The following DTrace one-liner shows that nlm4_null_4() returns code 5 (RPC_TIMEDOUT) after 150 seconds:
dtrace -n 'fbt::nlm4_null_4:entry {self->t=timestamp;} fbt::nlm4_null_4:return /arg1 != 0 && self->t != 0/ { printf("ret = %d, elapsed = %d, now = %Y\n", arg1, timestamp - self->t, walltimestamp); self->t = 0;}'

The stack looks like:

I made the following two changes and built a new image; with them, the issue goes away.

--- a/usr/src/uts/common/klm/nlm_impl.c
+++ b/usr/src/uts/common/klm/nlm_impl.c
@@ -525,6 +525,12 @@ nlm_clnt_call(CLIENT *clnt, rpcproc_t procnum, xdrproc_t xdr_args,
        if (procnum >= NLM_TEST_RES && procnum <= NLM_GRANTED_RES)
                wait = nlm_rpctv_zero;
+       if (procnum == NLM_NULL) {
+               wait.tv_sec = 0;
+               wait.tv_usec = 25000;
+       }

--- a/usr/src/uts/common/klm/nlm_rpc_handle.c
+++ b/usr/src/uts/common/klm/nlm_rpc_handle.c
@@ -55,6 +55,7 @@
        (_status) == RPC_PROGVERSMISMATCH ||    \
        (_status) == RPC_PROCUNAVAIL ||         \
        (_status) == RPC_CANTCONNECT ||         \
+       (_status) == RPC_TIMEDOUT ||            \
        (_status) == RPC_XPRTFAILED)

Setting the timeout of the NULL RPC to 25 milliseconds instead of the default 25 seconds makes nlm_null_rpc() return RPC_TIMEDOUT after 1575 milliseconds when the UDP port is not reachable:
   1575 = 25 + 50 + 100 + 200 + 400 + 800 => initial attempt plus 5 retries, doubling the wait each time
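
The arithmetic above can be checked with a small C sketch. It assumes, as the sum suggests, that each retransmission simply doubles the previous wait with no cap. total_wait_ms() is a hypothetical helper, not a function from the illumos RPC code, and the real retransmission logic may differ.

```c
/*
 * Total time in milliseconds spent waiting when the first wait is
 * initial_ms and each of `retries` retransmissions doubles the
 * previous wait (assumption: plain doubling, no upper bound).
 */
long
total_wait_ms(long initial_ms, int retries)
{
	long total = 0;
	long wait = initial_ms;

	/* One pass for the initial attempt, then one per retry. */
	for (int i = 0; i <= retries; i++) {
		total += wait;
		wait *= 2;
	}
	return (total);
}
```

With these assumptions, total_wait_ms(25, 5) evaluates to 1575, matching the figure quoted for the 25 millisecond timeout.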

I would appreciate any comments or advice leading to a better, more carefully crafted fix for this issue.

Updated by Michel Dionne about 6 years ago

One workaround is to set the OS X client option
nfs.lockd.send_using_tcp = 1
in the nfs.conf file.

With this option, when the OS X client reboots, the TCP connections are properly closed, and once it is back online, locking works normally again.


Updated by Youzhong Yang about 6 years ago

We've tried send_using_tcp on the Mac; lock/unlock performance is awful.

Our workaround is to set nfs.lockd.port in /etc/nfs.conf so that the port is fixed.


Updated by Michel Dionne almost 6 years ago

This seems to help with NFSv3 between OS X and Nexenta.

A) Optimize the NFS client's performance on OS X
Create the file /etc/nfs.conf with the following contents:
nfs.client.nfsiod_thread_max = 16
nfs.client.mount.options = rw,noatime,bg,tcp,resvport,intr,vers=3,rwsize=65536
nfs.lockd.port = 777
