Support for remote stale lock detection
Locking over NFSv2/3 is implemented using the sideband NLM protocol (with the support of the NSM protocol). Unfortunately, there are scenarios in which the NLM server and the NLM client fall out of sync in their knowledge of the locks. While most such cases are just bugs in either the NLM server or the NLM client implementation, there are some valid problematic scenarios that are not caused by any bug and are inherent flaws in the NFS/NLM/NSM protocol design.
When the NLM client and the NLM server are out of sync in their record of the locks, the NLM (NFS) server may think a file is locked by some client while that client thinks it does not hold such a lock. If another client then asks for a conflicting lock, it won't be able to get the lock and might wait (block) forever. For these scenarios the operating system provides the clear_locks(1m) tool to help the administrator clear such "stale" locks.
The problem with using clear_locks(1m) is that it is very hard for the administrator to find a "stale" lock candidate for cleanup. The operating system itself offers no help to make the admin's life easier. This change adds support for "stale" lock detection, so the admin will be able to easily find a "stale" lock candidate for clearing.
The implementation checks for remote stale locks only (NLM, NFSv4, SMB), with the focus on NLM, since the other two sharing protocols (NFSv4 and SMB) do not have known scenarios with stale locks. Local locks are not checked, but if needed the implementation could easily be modified to include the check for local stale locks too.
The design idea
The lock_descriptor_t structure will get a new member (l_blocker):
    hrtime_t l_blocker;    /* time when this lock */
                           /* started to prevent other */
                           /* locks from being set */
By default, it will contain 0.
Once there is a lock request (either blocking or non-blocking, that is, either F_SETLKW or F_SETLK) that cannot be fulfilled because some conflicting lock is active (let us call it act_lock), we will do:
act_lock->l_blocker = gethrtime();
This marks act_lock as a blocker of some other lock request. The timestamp also lets us tell how long the blocker has been blocking other lock requests.
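The marking step can be sketched in a few lines of userland C. This is only an illustration under stated assumptions: lock_sketch_t is a hypothetical, trimmed-down stand-in for lock_descriptor_t, mark_blocker() is a made-up helper name, and the current time is passed in as a parameter instead of calling gethrtime(). The sketch also assumes that only the first refused request sets the timestamp, so that l_blocker keeps meaning "when this lock started to block others"; the text above does not say whether later conflicts overwrite it.

```c
#include <assert.h>
#include <stddef.h>

typedef long long hrtime_t;	/* nanoseconds; stand-in for the kernel type */

/* Hypothetical, trimmed-down stand-in for lock_descriptor_t. */
typedef struct lock_sketch {
	hrtime_t l_blocker;	/* 0 = has not blocked any request yet */
} lock_sketch_t;

/*
 * Record that act_lock has just refused a conflicting request.
 * Assumption of this sketch: only the first refusal sets the
 * timestamp, later conflicts leave it untouched.
 */
static void
mark_blocker(lock_sketch_t *act_lock, hrtime_t now)
{
	assert(act_lock != NULL);
	if (act_lock->l_blocker == 0)
		act_lock->l_blocker = now;
}
```

In the kernel the caller would pass gethrtime() as the second argument; injecting the time here just keeps the sketch easy to exercise.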
Blocking lock request
We will change wait_for_lock() to time out after some configured time (let us call it stale_lock_timeout). After the timeout, wait_for_lock() will find the currently active lock that is blocking us (act_lock) and check whether act_lock->l_blocker is far enough in the past (at least stale_lock_timeout ago). If so, we will log the "stale lock" message (and mark the act_lock as "already reported"). If not, we will just wait again for the lock.
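The check performed after each expired wait might look roughly like the sketch below. It is not the kernel code: lock_sketch_t, the l_reported flag, blocker_is_stale() and simulate_wait() are hypothetical names, time is simulated with plain integers, and the sketch assumes the waiter goes back to waiting after logging (the text only states this for the "not stale yet" case).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef long long hrtime_t;	/* nanoseconds; stand-in for the kernel type */

/* Hypothetical stand-in for the relevant part of lock_descriptor_t. */
typedef struct lock_sketch {
	hrtime_t l_blocker;	/* when the lock started blocking, 0 = never */
	bool l_reported;	/* hypothetical "already reported" mark */
} lock_sketch_t;

/*
 * Called each time the timed wait expires without the lock being granted.
 * Returns true exactly once per blocker: when it has been blocking for at
 * least stale_lock_timeout and has not been reported yet.
 */
static bool
blocker_is_stale(lock_sketch_t *act_lock, hrtime_t now,
    hrtime_t stale_lock_timeout)
{
	assert(act_lock != NULL);
	if (act_lock->l_blocker == 0 || act_lock->l_reported)
		return (false);
	if (now - act_lock->l_blocker < stale_lock_timeout)
		return (false);
	act_lock->l_reported = true;	/* the caller logs "stale lock" */
	return (true);
}

/*
 * Simulated wait loop: every iteration stands for one wait that timed out
 * after stale_lock_timeout. Returns how many reports were produced.
 */
static int
simulate_wait(lock_sketch_t *act_lock, hrtime_t start,
    hrtime_t stale_lock_timeout, int rounds)
{
	int reports = 0;

	for (int i = 1; i <= rounds; i++) {
		hrtime_t now = start + (hrtime_t)i * stale_lock_timeout;
		if (blocker_is_stale(act_lock, now, stale_lock_timeout))
			reports++;
	}
	return (reports);
}
```

Because of the l_reported mark, a waiter that keeps timing out produces a single report per blocker rather than one per timeout.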
Non-blocking lock request
Once there is a new non-blocking lock request that cannot be fulfilled because there is some active blocker (act_lock) and gethrtime() - act_lock->l_blocker > stale_lock_timeout, we will log the "stale lock" message (and mark the act_lock as "already reported").
When an active lock blocker that has already been reported is released, we will log a "stale lock released" message, so the admin can pair it with the earlier "stale lock" warning and see that the "stale lock" situation is no longer pending.
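The non-blocking path and the pairing of the two messages can be sketched together. Again this is a hedged userland illustration, not the proposed kernel change: lock_sketch_t, l_reported, nonblocking_conflict() and release_lock() are hypothetical names, the messages go to stderr instead of the system log, and the time is passed in rather than read with gethrtime().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

typedef long long hrtime_t;	/* nanoseconds; stand-in for the kernel type */

/* Hypothetical stand-in for the relevant part of lock_descriptor_t. */
typedef struct lock_sketch {
	hrtime_t l_blocker;	/* when the lock started blocking, 0 = never */
	bool l_reported;	/* hypothetical "already reported" mark */
} lock_sketch_t;

/*
 * A non-blocking request was refused because of act_lock.
 * Returns true if the "stale lock" message was logged by this call.
 */
static bool
nonblocking_conflict(lock_sketch_t *act_lock, hrtime_t now,
    hrtime_t stale_lock_timeout)
{
	assert(act_lock != NULL);
	if (act_lock->l_blocker == 0) {
		act_lock->l_blocker = now;	/* first refusal: start the clock */
		return (false);
	}
	if (!act_lock->l_reported &&
	    now - act_lock->l_blocker > stale_lock_timeout) {
		act_lock->l_reported = true;
		(void) fprintf(stderr, "stale lock\n");
		return (true);
	}
	return (false);
}

/*
 * act_lock is being released.
 * Returns true if the "stale lock released" message was logged.
 */
static bool
release_lock(lock_sketch_t *act_lock)
{
	assert(act_lock != NULL);
	bool was_reported = act_lock->l_reported;

	if (was_reported)
		(void) fprintf(stderr, "stale lock released\n");
	act_lock->l_blocker = 0;
	act_lock->l_reported = false;
	return (was_reported);
}
```

Logging the release only for blockers that were reported keeps every "stale lock released" line matched to an earlier "stale lock" line.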
All of the above might be implemented for all locks (for example as a debug support for local application developers), or limited to remote locks (blockers) only.