Bug #13848

closed

nfssrv: excessive crdup/crfree cause bottleneck after fixing 6770

Added by Vitaliy Gusev 6 months ago. Updated 5 months ago.

Status: Closed
Priority: Normal
Assignee:
Category: -
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

crget/crdup/crfree are heavy because they call the zone_cred_hold/zone_cred_rele functions, which take the zone's mutex. The effect is the same as taking a global mutex.

Two-socket systems show a significant NFS server performance bottleneck when tested over a 10GbE network.

Both the NFSv3 and NFSv4 protocols are affected. To check, run the following dtrace script and look at the correlation between the number of dispatched requests and the number of calls made through the crget/crdup functions:

dtrace -n 'zone_cred_hold:entry,zone_cred_rele:entry{@a=count();} common_dispatch:entry{@r=count()} tick-10s{printa(@a,@r); trunc(@a); trunc(@r);}'

Commit

commit 12fb3699cf98503685902fe0309c546343340e61
Author: Marcel Telka <marcel.telka@nexenta.com>
Date:   Wed Mar 23 01:30:16 2016 +0100

6770 nfsauth_retrieve() flood caused by NFS clients with personal identity problems

calls crdup/crfree in the fast path, which should be optimised.


Related issues

Related to illumos gate - Bug #6770: nfsauth_retrieve() flood caused by NFS clients with personal identity problems (Closed, Marcel Telka, 2016-03-20)

Actions #1

Updated by Electric Monk 6 months ago

  • Gerrit CR set to 1528
Actions #2

Updated by Vitaliy Gusev 6 months ago

  • Related to Bug #6770: nfsauth_retrieve() flood caused by NFS clients with personal identity problems added
Actions #3

Updated by Vitaliy Gusev 6 months ago

For NFSv4 code.

The patched NFSv4.x code shows a ~10% increase in IOPS. For instance, on a 2-socket system (Xeon(r) CPU E5-2643 v4 @ 3.40GHz, 24 HT cores in total, 2x25GbE, 120GB RAM) with 1KB reads:

~ 464 000 iops w/o patch
~ 500 000 iops with patch

Note that the original NFSv4.x code calls crdup/crget about 3 times per request. The patch eliminates one call; 2 calls still remain in the NFSv4 code.
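
As a rough back-of-the-envelope check of these figures, the Python sketch below computes the relative IOPS gain and the implied rate of credential calls per second. It assumes one NFS request per measured I/O operation, which is not stated in this report, and the function names are illustrative only.

```python
def relative_gain(before, after):
    """Return the relative improvement of `after` over `before` as a fraction."""
    return (after - before) / before


def cred_calls_per_second(iops, calls_per_request):
    """Estimate credential calls per second.

    Assumption (not stated in the report): one NFS request per I/O operation.
    """
    return iops * calls_per_request


# Approximate measurements quoted in this comment.
IOPS_UNPATCHED = 464_000
IOPS_PATCHED = 500_000

gain = relative_gain(IOPS_UNPATCHED, IOPS_PATCHED)
calls_before = cred_calls_per_second(IOPS_UNPATCHED, 3)  # ~3 calls per request
calls_after = cred_calls_per_second(IOPS_PATCHED, 2)     # 2 calls remain

print(f"IOPS gain: {gain:.1%}")
print(f"cred calls/s before: {calls_before:,.0f}")
print(f"cred calls/s after:  {calls_after:,.0f}")
```

With the rounded figures quoted here the gain works out to about 7.8%, and under the stated assumption the server still issues on the order of a million credential calls per second after the patch.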

For NFSv3 code.

Before patch:

# dtrace -n '::common_dispatch:entry,::crget:entry,::crdup:entry{@[probefunc]=count();}'
^C
  crget                                                             5
  common_dispatch                                               17738
  crdup                                                         17738

After the patch, the same dtrace script for NFSv3 shows almost no crget/crdup calls:

  crget                                                             4
  common_dispatch                                               27044
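
To make the before/after comparison explicit, here is a small Python sketch that parses the two dtrace aggregation outputs shown in this comment and reports crget + crdup calls per dispatched request. The parser and function names are hypothetical helpers, not part of the patch or of dtrace.

```python
def parse_counts(output):
    """Parse `name  count` lines of dtrace aggregation output into a dict."""
    counts = {}
    for line in output.splitlines():
        parts = line.split()
        # Skip blank lines and anything that is not "<probefunc> <count>".
        if len(parts) == 2 and parts[1].isdigit():
            counts[parts[0]] = int(parts[1])
    return counts


def cred_calls_per_request(counts):
    """Return (crget + crdup) calls per common_dispatch call."""
    cred_calls = counts.get("crget", 0) + counts.get("crdup", 0)
    return cred_calls / counts["common_dispatch"]


BEFORE = """
  crget                                                             5
  common_dispatch                                               17738
  crdup                                                         17738
"""

AFTER = """
  crget                                                             4
  common_dispatch                                               27044
"""

before_ratio = cred_calls_per_request(parse_counts(BEFORE))
after_ratio = cred_calls_per_request(parse_counts(AFTER))
print(f"before: {before_ratio:.4f} cred calls per request")
print(f"after:  {after_ratio:.4f} cred calls per request")
```

The two runs were sampled over unspecified intervals, so only the per-request ratios are comparable, not the absolute counts.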
Actions #4

Updated by Vitaliy Gusev 5 months ago

A DEBUG build works without panics, and the kmem cache shows no growth in the number of creds:

::kmem_slabs!grep cred
cred_cache                      15        1       285         4   1.4%
Actions #5

Updated by Electric Monk 5 months ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit a547d3069fbb76f7603ab6fe082827b54e008a3e

commit  a547d3069fbb76f7603ab6fe082827b54e008a3e
Author: Vitaliy Gusev <gusev.vitaliy@gmail.com>
Date:   2021-06-29T15:01:40.000Z

    13848 nfssrv: excessive crdup/crfree cause bottleneck after fixing 6770
    Reviewed by: Jason King <jason.brian.king@gmail.com>
    Reviewed by: Patrick Mooney <pmooney@pfmooney.com>
    Reviewed by: Marcel Telka <marcel@telka.sk>
    Approved by: Dan McDonald <danmcd@joyent.com>
