Project

General

Profile

Bug #4768

Handling of rfs4_start_time is fragile

Added by Marcel Telka over 5 years ago. Updated over 5 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
nfs - NFS server and client
Start date:
2014-04-16
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
needs-triage

Description

There are basically two issues with handling of rfs4_start_time.

Issue 1

When the rfs4_start_time is set in rfs4_state_init(), there is a scenario where the rfs4_start_time could be set to a value which was used by one of the previous nfs server instances. The scenario requires three nfsd restarts in less than one second.

1183    /*
1184     * Set the boot time.  If the server
1185     * has been restarted quickly and has had the opportunity to
1186     * service clients, then the start_time needs to be bumped
1187     * regardless.  A small window but it exists...
1188     */
1189    if (rfs4_start_time != gethrestime_sec())
1190        rfs4_start_time = gethrestime_sec();
1191    else
1192        rfs4_start_time++;

During the first nfsd restart the rfs4_start_time is set to current time (let say T) in rfs4_state_init().

During the second restart of nfsd, the rfs4_start_time is bumped to T+1, because we restarted nfsd in the same second as the previous nfsd restart, so the condition at line 1189 (see above) is false.

During the third restart of nfsd, the condition at line 1189 is true (current time is T, rfs4_start_time is T+1), so we will set rfs4_start_time back to T, as we had after the first nfsd restart.

To fix this we need to change "not equal" to "less than" at line 1189.

Issue 2

The rfs4_start_time is used in what_stateid_error() to check whether the supplied stateid is from the current, or previous (or some other) nfs instance. In a case the supplied stateid is accidentally from some other (and newer) nfs instance, the following check will pass and NFS4ERR_STALE_STATEID won't be returned. NFS4ERR_BAD_STATEID will be returned instead.

3271    /* From a previous server instantiation, return STALE */
3272    if (id->bits.boottime < rfs4_start_time)
3273        return (NFS4ERR_STALE_STATEID);

...

3293
3294    return (NFS4ERR_BAD_STATEID);

In general, NFS4ERR_BAD_STATEID should be returned only when the stateid was generated by the current server instance which is not our case here (boottime does not match rfs4_start_time).

To fix this we need to change "less than" at line 3272 above to "not equal".

History

#1

Updated by Marcel Telka over 5 years ago

  • Status changed from New to In Progress
  • Assignee set to Marcel Telka
#2

Updated by Marcel Telka over 5 years ago

  • Status changed from In Progress to Pending RTI
#3

Updated by Electric Monk over 5 years ago

  • Status changed from Pending RTI to Closed
  • % Done changed from 0 to 100

git commit 25a1318c7b398ea2cc4a31ea6d3fa5c1ffea91b9

commit  25a1318c7b398ea2cc4a31ea6d3fa5c1ffea91b9
Author: Marcel Telka <marcel.telka@nexenta.com>
Date:   2014-05-10T22:46:36.000Z

    4768 Handling of rfs4_start_time is fragile
    Reviewed by: Dan McDonald <danmcd@omniti.com>
    Reviewed by: Jan Kryl <jan.kryl@nexenta.com>
    Approved by: Richard Lowe <richlowe@richlowe.net>

Also available in: Atom PDF