Bug #4271


NFSv4 server: delegations should be recalled before the nbl_conflict() call

Added by Marcel Telka over 8 years ago.

nfs - NFS server and client


Delegation support in NFSv4 allows the NFS server to delegate some responsibilities to NFS clients. If such a delegation is granted, the NFS client may handle some OPEN/CLOSE (and other) operations locally without notifying the NFS server.

This can lead to a situation where the NFS server does not know whether a particular file is open, or how; only the NFS client knows. With a delegation granted, the NFS client may keep a file open at the NFS server even when no process at the client has it open any longer. Similarly, the NFS client is not required to send an OPEN operation to the NFS server when a local process (at the NFS client) opens the file.

When some potentially conflicting operation is about to be performed at the NFS server and the NFS server needs to know the real state of the file, the NFS server asks the NFS client holding the delegation to update all information about the file at the NFS server (it recalls the delegation).

After the delegation is recalled, the NFS server may continue with the potentially conflicting operation. The operation then fails or succeeds based on the real status now known at the NFS server. RENAME and REMOVE are examples of such operations.
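To illustrate the recall described above, here is a minimal toy model in C. It is only a sketch: every type and function name in it (deleg_client, deleg_server, deleg_recall(), deleg_view_is_current()) is hypothetical and none of it is illumos code. It only shows that the server's view of the file can be stale while a delegation is outstanding and becomes current once the delegation is recalled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names are hypothetical, none is illumos code. */

/* What the client really knows about the delegated file. */
struct deleg_client {
	bool has_delegation;
	int local_opens;	/* opens by processes on the client */
	bool deny_remove;	/* a local open holds a share reservation */
};

/* What the server believes about the same file. */
struct deleg_server {
	bool delegated;
	int known_opens;
	bool known_deny;
};

/*
 * Recall the delegation: the client reports its real open state and
 * gives the delegation back, so the server's view becomes current.
 */
static void
deleg_recall(struct deleg_server *srv, struct deleg_client *cl)
{
	assert(srv != NULL && cl != NULL);
	if (!srv->delegated)
		return;
	srv->known_opens = cl->local_opens;
	srv->known_deny = cl->local_opens > 0 && cl->deny_remove;
	cl->has_delegation = false;
	srv->delegated = false;
}

/* Does the server's view agree with the client's real state? */
static bool
deleg_view_is_current(const struct deleg_server *srv,
    const struct deleg_client *cl)
{
	bool real_deny = cl->local_opens > 0 && cl->deny_remove;

	return (srv->known_opens == cl->local_opens &&
	    srv->known_deny == real_deny);
}
```

With a client that has one local open holding a share reservation and a server that knows about no opens, deleg_view_is_current() is false before deleg_recall() and true after it.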

On ZFS (used here only as an example; other filesystems are affected too) these operations are implemented in zfs_rename() and zfs_remove(), respectively.

The unlink(2) syscall calls vn_removeat(), which does some basic checking, for example a test for a conflicting share reservation (only when nbmand=on, which is not the default):

    if (nbl_need_check(vp)) {
        nbl_start_crit(vp, RW_READER);
        in_crit = 1;
        if (nbl_conflict(vp, NBL_REMOVE, 0, 0, 0, NULL)) {
            error = EACCES;
            goto out;
        }
    } else {
        VN_RELE(vp);
        vp = NULL;
    }

If there is no outstanding share reservation blocking this particular operation (NBL_REMOVE in our case), the flow continues and (in general) VOP_REMOVE()/zfs_remove() is called.

After its initial sanity checks, zfs_remove() announces via the vnevent_remove() call that the remove operation is in progress. The NFS server is hooked into these notifications (when a delegation for this file is granted to some NFS client).

Based on this notification, the NFS server recalls the delegation to synchronize the client's view of the file status (opened? how opened?) with the information on the NFS server. Once the delegation is returned, the NFS server's handler returns and zfs_remove() can continue.
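The notification flow can be pictured with a small toy model in C. This is only a sketch under stated assumptions: toy_vnode, toy_deleg, toy_notify_remove(), toy_recall_on_remove() and toy_remove_file() are hypothetical names and do not mirror the real vnevent or NFS server interfaces. The point is only that the remove path calls the hook synchronously and proceeds after the recall has finished.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: all names are hypothetical, none is illumos code. */
typedef void (*toy_remove_hook_t)(void *arg);

struct toy_vnode {
	toy_remove_hook_t hook;	/* set while a delegation is outstanding */
	void *hook_arg;
	int removed;
};

struct toy_deleg {
	int outstanding;	/* 1 while the client holds the delegation */
	int recalls;		/* number of recalls performed */
	struct toy_vnode *vp;
};

/* Stand-in for the notification sent from the remove path. */
static void
toy_notify_remove(struct toy_vnode *vp)
{
	if (vp->hook != NULL)
		vp->hook(vp->hook_arg);	/* returns only after the recall */
}

/* Stand-in for the NFS server's handler: recall, then drop the hook. */
static void
toy_recall_on_remove(void *arg)
{
	struct toy_deleg *dp = arg;

	assert(dp != NULL && dp->vp != NULL);
	dp->recalls++;
	dp->outstanding = 0;
	dp->vp->hook = NULL;
}

/* Stand-in for the filesystem remove: notify first, then remove. */
static void
toy_remove_file(struct toy_vnode *vp)
{
	toy_notify_remove(vp);
	vp->removed = 1;
}
```

Note that this toy remove path, like the flow described above, does not look at the share reservation state again after the hook returns.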

The problem is that the share reservation check for the remove operation has already been done (see above). Once the delegation has been returned, the share reservation status of the file might be different, and the file should not be deleted (because it is open with a share reservation). Unfortunately, zfs_remove() does not check for the share reservation again after the notification has been sent, so it continues and very likely succeeds in removing the file.

The above is one possible failure scenario (zfs deletes a file with an outstanding share reservation). Another possible case is when the NFS client did not send CLOSE to the NFS server after an application at the client closed the file (the NFS client is not required to send CLOSE, because the file is delegated to it).

In that case the NFS server sees the file as open with a share reservation (and a delegation granted to the client). When an unlink(2) request for the file arrives, vn_removeat() first checks for the share reservation (see above). The check finds a conflict, so unlink(2) fails with EACCES,
even though the file is in reality not open (only the NFS client knows that, and the NFS server did not try to recall the delegation). This leads to an undeletable file for no obvious reason (no application holds the file open).

The proper order of events is to do the share reservation check (the nbl_conflict() call) after the vnevent_remove() call. zfs_rename() (and maybe other operations) has the same issue as described for zfs_remove().
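The two failure scenarios and the proposed ordering can be compared in a toy model in C. This is a sketch only, not a patch: toy_file, toy_order, toy_recall(), toy_conflict() and toy_remove() are hypothetical names that stand in for the delegation recall, the nbl_conflict() check and the remove path. It assumes the recall makes the server's share reservation view equal to the client's real state.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names are hypothetical, none is illumos code. */
struct toy_file {
	bool delegated;		/* a client holds a delegation */
	bool server_sees_deny;	/* share reservation as known to the server */
	bool client_has_deny;	/* real state, known only to the client */
	bool removed;
};

enum toy_order { CHECK_THEN_RECALL, RECALL_THEN_CHECK };

/* Stand-in for the recall triggered by the remove notification. */
static void
toy_recall(struct toy_file *f)
{
	if (f->delegated) {
		f->server_sees_deny = f->client_has_deny;
		f->delegated = false;
	}
}

/* Stand-in for the share reservation check. */
static bool
toy_conflict(const struct toy_file *f)
{
	return (f->server_sees_deny);
}

/* Remove the file using the given ordering; returns 0 or EACCES. */
static int
toy_remove(struct toy_file *f, enum toy_order order)
{
	assert(f != NULL);
	if (order == CHECK_THEN_RECALL) {
		/* Order described in this report: check on a stale view. */
		if (toy_conflict(f))
			return (EACCES);
		toy_recall(f);	/* too late: nothing is re-checked */
	} else {
		/* Proposed order: recall first, then check. */
		toy_recall(f);
		if (toy_conflict(f))
			return (EACCES);
	}
	f->removed = true;
	return (0);
}
```

In this model, a file that the client really holds open with a share reservation is removed under CHECK_THEN_RECALL but refused with EACCES under RECALL_THEN_CHECK. A file the client has already closed, but which the server still sees as reserved, is refused under CHECK_THEN_RECALL and removed under RECALL_THEN_CHECK.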

AFAIK, this issue is not reproducible using the illumos NFS client, because our NFS client sends all OPEN and CLOSE operations to the server (even when a delegation is granted and the NFS client is not obliged to do so), so the NFS server is always aware of the current open status of the file. But there might be some NFSv4 client implementation, not so eager with OPENs and CLOSEs, where this might (and will) cause a problem.

Related issues

Related to illumos gate - Bug #13664: nbl_conflict() calls in vnode.c are racy (New)

Related to illumos gate - Bug #13665: rename with non-regular target over NFS generates .nfsXXXX files (New)

Actions #1

Updated by Marcel Telka over 1 year ago

  • Related to Bug #13664: nbl_conflict() calls in vnode.c are racy added
Actions #2

Updated by Marcel Telka over 1 year ago

  • Related to Bug #13665: rename with non-regular target over NFS generates .nfsXXXX files added
