Bug #6770

nfsauth_retrieve() flood caused by NFS clients with personal identity problems

Added by Marcel Telka over 4 years ago. Updated over 4 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: nfs - NFS server and client
Start date: 2016-03-20
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:

Description

Some NFS clients have problems with their personal identity over AUTH_SYS. We encountered cases where VMware ESXi sent NFS requests with uid/gid 0/0 and a changing list of supplemental groups. Sometimes the list of supplemental groups was empty, and sometimes it contained one entry (group 0). In extreme cases such an "identity switch" happened many times per second.
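To make the two identity variants concrete, here is a minimal sketch of how an AUTH_SYS credential body is laid out on the wire, following the authsys_parms definition in RFC 5531 (stamp, machinename<255>, uid, gid, gids<16>). The helper names put_be32() and authsys_encode() are hypothetical and exist only for this illustration; nothing here is taken from gidschng.c or from ESXi.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value in XDR (big-endian) form; returns bytes written. */
static size_t
put_be32(uint8_t *buf, uint32_t v)
{
	buf[0] = (uint8_t)(v >> 24);
	buf[1] = (uint8_t)(v >> 16);
	buf[2] = (uint8_t)(v >> 8);
	buf[3] = (uint8_t)v;
	return (4);
}

/*
 * Encode an AUTH_SYS credential body (hypothetical helper).
 * Returns the encoded length, or 0 if the arguments exceed the
 * protocol limits or the buffer is too small.
 */
size_t
authsys_encode(uint8_t *buf, size_t buflen, uint32_t stamp,
    const char *machine, uint32_t uid, uint32_t gid,
    const uint32_t *gids, uint32_t ngids)
{
	size_t namelen = strlen(machine);
	size_t padded = (namelen + 3) & ~(size_t)3;	/* XDR pads to 4 bytes */
	size_t need = 4 + 4 + padded + 4 + 4 + 4 + 4 * (size_t)ngids;
	size_t off = 0;
	uint32_t i;

	if (namelen > 255 || ngids > 16 || need > buflen)
		return (0);

	off += put_be32(buf + off, stamp);
	off += put_be32(buf + off, (uint32_t)namelen);
	memset(buf + off, 0, padded);
	memcpy(buf + off, machine, namelen);
	off += padded;
	off += put_be32(buf + off, uid);
	off += put_be32(buf + off, gid);
	off += put_be32(buf + off, ngids);	/* supplemental group count */
	for (i = 0; i < ngids; i++)
		off += put_be32(buf + off, gids[i]);

	assert(off == need);
	return (off);
}
```

Encoding uid/gid 0/0 for an assumed machine name "esx", once with no supplemental groups and once with the single group 0, gives bodies of 24 and 28 bytes that are identical up to the group count. To a server that keys its cache on the whole credential, these are two different identities for the same user.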

The nfsauth_cache_get() implementation is not prepared for such clients. The current design assumes that credentials change rarely. Whenever a user's list of supplemental groups changes, the cached nfsauth information is flushed and the new nfsauth information is retrieved synchronously from mountd using nfsauth_retrieve(). When the identity switches many times per second, this can have a significant performance impact.

To fix this we should cache all variants of a user's identity.
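The idea can be sketched in user-space C as a cache keyed on the full identity (uid, gid and the list of supplemental groups), so that each variant gets its own entry instead of evicting the previous one. This is only an illustration under stated assumptions, not the code from the actual fix: every type and function name below (id_key_t, id_cache_get(), slow_retrieve() and so on) is hypothetical, the fixed-size table with round-robin replacement is an arbitrary choice, and slow_retrieve() merely stands in for the synchronous upcall to mountd.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	ID_MAXGROUPS	16	/* AUTH_SYS wire limit on supplemental groups */
#define	ID_CACHE_SLOTS	8	/* arbitrary size chosen for this sketch */

typedef struct id_key {
	uint32_t uid;
	uint32_t gid;
	uint32_t ngroups;
	uint32_t groups[ID_MAXGROUPS];
} id_key_t;

typedef struct id_entry {
	bool valid;
	id_key_t key;
	int access;		/* stand-in for the cached auth result */
} id_entry_t;

typedef struct id_cache {
	id_entry_t slots[ID_CACHE_SLOTS];
	unsigned next;		/* round-robin replacement cursor */
	unsigned retrievals;	/* number of slow-path lookups so far */
} id_cache_t;

/* Two keys match only if uid, gid and the whole group list match. */
static bool
id_key_equal(const id_key_t *a, const id_key_t *b)
{
	return (a->uid == b->uid && a->gid == b->gid &&
	    a->ngroups == b->ngroups &&
	    memcmp(a->groups, b->groups,
	    a->ngroups * sizeof (uint32_t)) == 0);
}

/* Hypothetical slow path; here it only counts how often it is called. */
static int
slow_retrieve(id_cache_t *c, const id_key_t *k)
{
	(void) k;
	c->retrievals++;
	return (1);
}

/* Return the cached result for this exact identity, filling on a miss. */
int
id_cache_get(id_cache_t *c, const id_key_t *k)
{
	id_entry_t *e;
	unsigned i;

	if (k->ngroups > ID_MAXGROUPS)
		return (-1);

	for (i = 0; i < ID_CACHE_SLOTS; i++) {
		e = &c->slots[i];
		if (e->valid && id_key_equal(&e->key, k))
			return (e->access);
	}

	/* Miss: take the next slot without touching the other variants. */
	e = &c->slots[c->next];
	c->next = (c->next + 1) % ID_CACHE_SLOTS;
	e->key = *k;
	e->access = slow_retrieve(c, k);
	e->valid = true;
	return (e->access);
}
```

With a zero-initialized id_cache_t, alternating 1000 lookups between the key for uid/gid 0/0 with no groups and the key for uid/gid 0/0 with the single group 0 takes the slow path only twice, once per variant. The comparison is order-sensitive, so a client that reorders its groups would still be treated as a new variant in this sketch.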

Steps to reproduce

1. Take usr/src/cmd/cmd-inet/usr.sbin/snoop/nfs4_xdr.c from the illumos gate and the attached gidschng.c. Compile them to get the gidschng binary:

gcc -Wall -Wno-switch -lnsl nfs4_xdr.c gidschng.c -o gidschng

The gidschng binary simulates a client with a changing identity (a changing list of supplemental groups).

2. Share root (/) with options that will force the NFS server to ask mountd for the nfsauth info:

share -o rw=foobar /

3. Run the following dtrace script to monitor the nfsauth_retrieve() calls:

dtrace -n 'nfsauth_retrieve:entry{}' &

4. Run gidschng:

./gidschng

You will see a lot of nfsauth_retrieve hits.


Files

gidschng.c (1.51 KB), Marcel Telka, 2016-03-20 01:35 PM

Related issues

Related to illumos gate - Bug #5509: nfsauth_cache_get() could spend a lot of time walking exi_cache (Closed, Marcel Telka, 2015-01-07)

Related to illumos gate - Feature #5296: Support for more than 16 groups with AUTH_SYS (Closed, Marcel Telka, 2014-11-07)

#1

Updated by Marcel Telka over 4 years ago

  • Related to Bug #5509: nfsauth_cache_get() could spend a lot of time walking exi_cache added
#2

Updated by Marcel Telka over 4 years ago

Thanks to Sebastien Roy for the initial report and problem analysis.

The problem was initially introduced by #5296.

#3

Updated by Marcel Telka over 4 years ago

  • Related to Feature #5296: Support for more than 16 groups with AUTH_SYS added
#4

Updated by Marcel Telka over 4 years ago

  • Description updated (diff)
#5

Updated by Marcel Telka over 4 years ago

  • Status changed from In Progress to Pending RTI
#6

Updated by Electric Monk over 4 years ago

  • Status changed from Pending RTI to Closed
  • % Done changed from 0 to 100

commit  12fb3699cf98503685902fe0309c546343340e61
Author: Marcel Telka <marcel.telka@nexenta.com>
Date:   2016-03-29T20:07:38.000Z

    6770 nfsauth_retrieve() flood caused by NFS clients with personal identity problems
    Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
    Approved by: Richard Lowe <richlowe@richlowe.net>
