New kstats for NFS server and xdrmblk_getpos fix

Review Request #60 — Created June 16, 2015 and submitted

marcel
illumos-gate
5907, 6090
general

webrev: http://cr.illumos.org/~webrev/marcel/il-nfs-kstat/

This is implementation of the detailed IOPS, bandwidth, and latency kstats for NFS server. The kstats are two level: one "global" for the NFS server and broken down per share.

Together with the kstats implementation there is a fix in the xdrmblk implementation which is a prerequisite for the kstat work.

I ran the debug bits on the NFS server for several hours with various tests to
make sure everything works as exepcted. I checked all of the kstats (for all
NFS and NFS_ACL protocol versions) are properly updated. I also make sure that
the re-share works as expected and the kstats are properly persisted.

As a sanity check I also ran the debug bits on NFS cleint just to make sure
there is no regression.

After every test run I generated the crash dump file and checked it for leaks
using the ::findleaks mdb command. This testing helped me during the
development to realize that the xdrmblk users usually does not call the
XDR_DESTROY() for the used XDR stream. I checkeck (and fixed) all of the
xdrmblk users to make sure the XDR stream is properly destroyed/freed.

Using the ::findleaks mdb command I confirmed that there are no new memory
leaks introduced by the change.

  • 0
  • 0
  • 5
  • 1
  • 6
Description From Last Updated
marcel
marcel
richlowe
  1. Am I correct in reading that these per-share stats aren't persistent across share destroy/recreation?

    1. Yes, correct. They are persistent on re-share only (share of already shared path).

  2. Would a size_t or an explicitly sized type (uint32_t or whatever), be better?

  3. usr/src/uts/common/fs/nfs/nfs4_srv.c (Diff revision 1)
     
     

    "we cannot associate it with any exportinfo", I think

  4. usr/src/uts/common/fs/nfs/nfs4_srv.c (Diff revision 1)
     
     

    "associate" again, I think? And in all similar comments.

  5. usr/src/uts/common/fs/nfs/nfs_export.c (Diff revision 1)
     
     

    is there a reason to spell "compare" without the 'e'? Prevailing style in NFS maybe?

    1. I think I borrowed it from usr/src/uts/common/sys/avl.h

  6. usr/src/uts/common/fs/nfs/nfs_server.c (Diff revision 1)
     
     

    size_t or integer type with specified size?

  7. usr/src/uts/common/nfs/nfs4_kprot.h (Diff revision 1)
     
     

    Again, why not size_t. I'm suspecting the more I run into these that you have a reason :)

    1. This is because xdr_getpos() returns uint_t. But you are right, I'll change it to size_t.

  8. usr/src/uts/common/rpc/svc_cots.c (Diff revision 1)
     
     

    Yow

  9. 
      
relling
  1. Thanks Marcel!
    General comments, not really suitable for inline review:

    I'd like to see a class other than "misc"
    My first reaction is to use "disk" which will get automatically picked up by iostat. However, this generates way too much noise for iostat's traditional use case.
    I propose a new class "nfsops"(?) to allow someone to build an extensible iostat-like tool.

    Overloading state into the name is painful. Rather than having:
    nfs:2:share:path / (pseudo)

    it is better to add another name indicating (pseudo) so that consumers don't have to parse the stat looking for the " (pseudo)" token. I'm inclined towards a the name "filesystem", rather than a boolean, to be more consistent with rfc 3530 section 7.3:
    nfs:2:share:filesystem pseudo|real

    1. Re the new "nfsops"(?) class:

      I can do this change, but I'm not sure the "nfsops" is the best fit. Since these kstats are basically for RPC procedures (both NFS and NFS_ACL protocols) and they are in a form of I/O kstats, I thought about "rpcproc", "procio", "rpcprocio", or "io-rpcproc"...

    2. Re the overloading state and pseudo|real:
      
      Very good idea.
      
      I implemented the "pseudo" marking this way because it is very easy to do it this way now.  If we would like to present the pseudo / as this (the first way):
      
      nfs:2:share:path    /
      nfs:2:share:filesystem    pseudo
      
      the change (to the "path" kstat implementation) might be non-trivial.  To make the change simpler, what about this? (the second way)
      
      nfs:2:share:path    / (pseudo)
      nfs:2:share:filesystem    pseudo
      
      This will both make the implementation simple, and allow the kstat consumer to know that "(pseudo)" is the pseudo file system marking and not a part of the shared path name.
      
      If I do not hear otherwise soon, I'll start to implement it the first way (the path without the "(pseudo)" add-on).
      
      Thanks.
    3. Allow me to clarify, I think we can use the class as a differentiator between the versions as well as differentiating from other "misc" kstats.
      For example, an analyst might request all of the queue depth stats for NFSv3 only. Since the instance is used per-share, it helps to have another ordinal that is not embedded with the name. For example,
      
      kstat -pc rpcprocio_v3
      
      is clearer than:
      
      kstat -p -m nfs -n rpcprocio* | grep v3
      
      Is this a feasible use of class?
    4. I'm not sure if that's the feasible use of class, but instead of grep you could use this:
      
      kstat -p -m nfs -n "rfsprocio_v3*"
      
      Wouldn't that work for you?
    5. for kstat(1), yes
      for kstat(3kstat), not so much... we'd have to do a regex match against the entries in the chain, vs a simple strcmp

      In short, I see absolutely no value to class=misc, but do see value in class=rfsprocio_v3

    6. That makes sense.  Thank you.
  2. 
      
marcel
marcel
marcel
relling
  1. Ship It!
  2. 
      
marcel
marcel
marcel
Review request changed

Status: Closed (submitted)

Change Summary:

commit 22146ea93e24c7deb02c49c33b2ab98605ce78b4
Author: Marcel Telka <marcel.telka@nexenta.com>
Date:   Wed Sep 2 08:18:08 2015 +0200

    6090 IOPS, bandwidth, and latency kstats for NFS server
    Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
    Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
    Approved by: Dan McDonald <danmcd@omniti.com>

commit cf98b944cdc2063fc14f3fd525e284de3ed29fd0
Author: Marcel Telka <marcel.telka@nexenta.com>
Date:   Thu Sep 3 19:47:20 2015 +0200

    5907 xdrmblk_getpos() is unreliable
    Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
    Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
    Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
    Approved by: Dan McDonald <danmcd@omniti.com>
Loading...