Bug #9447

NFS unmount is slow

Added by Marcel Telka over 1 year ago. Updated over 1 year ago.

Status: Closed
Priority: Normal
Assignee:
Category: nfs - NFS server and client
Start date: 2018-04-10
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage

Description

The NFS unmount operation is slow on the NFS client when there is a non-trivial number of files. With thousands of file systems mounted via NFSv4 (and mirror mounts), this can cause the NFS harvester to consume one full CPU to do its work. The problem is also visible on NFSv2 and NFSv3 clients.


Files

repro-slow-nfs-umount (2.23 KB) Marcel Telka, 2018-04-10 08:19 AM

History

#1

Updated by Marcel Telka over 1 year ago

Root cause

The problem is that check_rtable()/check_rtable4(), destroy_rtable()/destroy_rtable4(), and rflush()/r4flush() need to walk through the whole rtable/rtable4 to do their work, even though the majority of rnodes do not belong to the vfs they are working on.

The fix

The fix introduces a new linked list in mntinfo. The linked list contains all rnodes that belong to the particular mntinfo. All of the slow functions (listed above) are modified to work with the new linked list when possible, instead of traversing the whole rtable/rtable4.
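
The difference between the two approaches can be illustrated with a small user-space sketch. This is not the illumos code: the names toy_rnode, toy_mount, toy_rnode_link(), table_count_for_mount() and mount_count() are hypothetical, and locking, file handle hashing, and the real rnode/mntinfo layouts are left out. It only shows why a per-mount list makes the work proportional to the rnodes of one mount instead of to the size of the whole table.

```c
#include <assert.h>
#include <stddef.h>

#define	TOY_TABLE_SIZE	64	/* hypothetical number of hash buckets */

struct toy_mount;

/* Hypothetical stand-in for an rnode: on a hash chain and on its mount's list. */
struct toy_rnode {
	struct toy_rnode *hash_next;	/* next rnode in the same hash bucket */
	struct toy_rnode *mnt_next;	/* next rnode of the same mount */
	struct toy_rnode *mnt_prev;	/* previous rnode of the same mount */
	struct toy_mount *mount;	/* owning mount, NULL while unlinked */
};

/* Hypothetical stand-in for mntinfo, carrying the per-mount list head. */
struct toy_mount {
	struct toy_rnode *rnodes;
};

/* Hypothetical stand-in for the global rtable. */
static struct toy_rnode *toy_table[TOY_TABLE_SIZE];

/* Put an rnode into the global table and onto its mount's list. */
static void
toy_rnode_link(struct toy_rnode *rp, struct toy_mount *mp, unsigned bucket)
{
	assert(rp->mount == NULL);

	rp->hash_next = toy_table[bucket % TOY_TABLE_SIZE];
	toy_table[bucket % TOY_TABLE_SIZE] = rp;

	rp->mount = mp;
	rp->mnt_prev = NULL;
	rp->mnt_next = mp->rnodes;
	if (mp->rnodes != NULL)
		mp->rnodes->mnt_prev = rp;
	mp->rnodes = rp;
}

/* Whole-table walk: every rnode is visited, foreign ones are skipped. */
static size_t
table_count_for_mount(const struct toy_mount *mp, size_t *visited)
{
	size_t found = 0;

	for (size_t i = 0; i < TOY_TABLE_SIZE; i++) {
		for (struct toy_rnode *rp = toy_table[i]; rp != NULL;
		    rp = rp->hash_next) {
			(*visited)++;
			if (rp->mount == mp)
				found++;
		}
	}
	return (found);
}

/* Per-mount walk: only the rnodes linked to this mount are visited. */
static size_t
mount_count(const struct toy_mount *mp, size_t *visited)
{
	size_t found = 0;

	for (struct toy_rnode *rp = mp->rnodes; rp != NULL;
	    rp = rp->mnt_next) {
		(*visited)++;
		found++;
	}
	return (found);
}
```

With 100 rnodes in the table of which 10 belong to one mount, both functions report 10 rnodes for that mount, but the whole-table walk visits 100 entries while the per-mount walk visits 10.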

#2

Updated by Marcel Telka over 1 year ago

Testing

I tested (and reproduced) this with one NFS client machine and three NFS servers. On all three NFS servers, do this preparation:

# for i in `seq 0 31` ; do for j in `seq 0 31` ; do zfs create -p rpool/TEST/$i/$j ; done ; done
# zfs create -p rpool/TEST/files
# for i in `seq 0 255` ; do mkdir -p /rpool/TEST/files/$i ; for j in `seq 0 255` ; do touch /rpool/TEST/files/$i/$j ; done ; done
# zfs set sharenfs=on rpool/TEST

To actually test the NFS client, run the attached repro-slow-nfs-umount script.

Here are the test results without the fix:

root@client:~# ./repro-slow-nfs-umount
Creating mount points... done
NFSv2 mount/unmount cycle test
        Mounting the filesystems... done
        Fill rtable... done in 86 seconds
        Mounting/unmounting /mnt/1 filesystem 1000 times... done in 113 seconds
        Unmounting the /mnt/2 filesystem... done in 0 seconds
        Unmounting the /mnt/3 filesystem... done in 0 seconds
NFSv3 mount/unmount cycle test
        Mounting the filesystems... done
        Fill rtable... done in 28 seconds
        Mounting/unmounting /mnt/1 filesystem 1000 times... done in 121 seconds
        Unmounting the /mnt/2 filesystem... done in 0 seconds
        Unmounting the /mnt/3 filesystem... done in 0 seconds
NFSv4 mount/unmount cycle test
        Mounting the filesystems... done
        Fill rtable... done in 34 seconds
        Mounting/unmounting /mnt/1 filesystem 1000 times... done in 122 seconds
        Unmounting the /mnt/2 filesystem... done in 0 seconds
        Unmounting the /mnt/3 filesystem... done in 0 seconds
NFSv4 mirror mount test
        Mounting the filesystems... done
        Trigger mirror mounts and fill rtable4... done in 90 seconds
        Unmounting /mnt/1 filesystems... done in 111 seconds
        Unmounting /mnt/2 filesystems... done in 73 seconds
        Unmounting /mnt/3 filesystems... done in 25 seconds
Removing mount points... done

  nfs3_unmount                                      
           value  ------------- Distribution ------------- count    
           16384 |                                         0        
           32768 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1000     
           65536 |                                         2        
          131072 |                                         0        

  nfs_unmount                                       
           value  ------------- Distribution ------------- count    
           16384 |                                         0        
           32768 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 998      
           65536 |                                         4        
          131072 |                                         0        

  nfs4_unmount                                      
           value  ------------- Distribution ------------- count    
             256 |                                         0        
             512 |                                         2        
            1024 |                                         20       
            2048 |                                         2        
            4096 |                                         0        
            8192 |@@@@@@@@@                                971      
           16384 |@                                        80       
           32768 |@@@@@@@@@@@@@@@@@@@@@@@@@@               2772     
           65536 |@@@                                      335      
          131072 |                                         3        
          262144 |                                         0        

root@client:~#

Test results with the fix:

root@client:~# ./repro-slow-nfs-umount
Creating mount points... done
NFSv2 mount/unmount cycle test
        Mounting the filesystems... done
        Fill rtable... done in 101 seconds
        Mounting/unmounting /mnt/1 filesystem 1000 times... done in 53 seconds
        Unmounting the /mnt/2 filesystem... done in 0 seconds
        Unmounting the /mnt/3 filesystem... done in 0 seconds
NFSv3 mount/unmount cycle test
        Mounting the filesystems... done
        Fill rtable... done in 30 seconds
        Mounting/unmounting /mnt/1 filesystem 1000 times... done in 53 seconds
        Unmounting the /mnt/2 filesystem... done in 0 seconds
        Unmounting the /mnt/3 filesystem... done in 0 seconds
NFSv4 mount/unmount cycle test
        Mounting the filesystems... done
        Fill rtable... done in 35 seconds
        Mounting/unmounting /mnt/1 filesystem 1000 times... done in 58 seconds
        Unmounting the /mnt/2 filesystem... done in 1 seconds
        Unmounting the /mnt/3 filesystem... done in 0 seconds
NFSv4 mirror mount test
        Mounting the filesystems... done
        Trigger mirror mounts and fill rtable4... done in 80 seconds
        Unmounting /mnt/1 filesystems... done in 13 seconds
        Unmounting /mnt/2 filesystems... done in 9 seconds
        Unmounting /mnt/3 filesystems... done in 3 seconds
Removing mount points... done

  nfs_unmount
           value  ------------- Distribution ------------- count
             256 |                                         0
             512 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@     912
            1024 |@@@@                                     88
            2048 |                                         0
            4096 |                                         0
            8192 |                                         0
           16384 |                                         0
           32768 |                                         2
           65536 |                                         0

  nfs3_unmount
           value  ------------- Distribution ------------- count
             256 |                                         0
             512 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@     908
            1024 |@@@@                                     91
            2048 |                                         1
            4096 |                                         0
            8192 |                                         0
           16384 |                                         0
           32768 |                                         2
           65536 |                                         0

  nfs4_unmount
           value  ------------- Distribution ------------- count
               4 |                                         0
               8 |                                         1
              16 |                                         1
              32 |                                         0
              64 |                                         0
             128 |@@@@@@                                   597
             256 |@@@@                                     442
             512 |@@@@@@@@@@@@@@@@@@@@@@@@@@@              2812
            1024 |@@@                                      314
            2048 |                                         4
            4096 |                                         2
            8192 |                                         0
           16384 |                                         0
           32768 |                                         1
           65536 |                                         3
          131072 |                                         1
          262144 |                                         0

root@client:~#

As we can see, the new implementation is about 50 times faster in the actual NFS unmount functions in the kernel. The speedup is visible even with the simple mount(1m)/umount(1m) cycle, where we saved about half of the time.
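
The ratios behind these statements can be recomputed from the numbers printed above. The following is only a worked-arithmetic sketch: the helper names speedup(), report() and report_all() are hypothetical, and the histogram values are used as printed (the report does not state their unit). Because the histogram buckets are powers of two, a ratio of bucket labels is a coarse estimate.

```c
#include <stdio.h>

/* Ratio of a measurement before the fix to the same measurement after it. */
static double
speedup(double before, double after)
{
	return (before / after);
}

/* Print one labelled ratio. */
static void
report(const char *what, double before, double after)
{
	printf("%-46s %5.1fx\n", what, speedup(before, after));
}

/* All inputs below are copied from the test output in this comment. */
static void
report_all(void)
{
	/* Most populated histogram bucket of nfs_unmount and nfs3_unmount. */
	report("nfs_unmount/nfs3_unmount bucket 32768 -> 512", 32768, 512);

	/* 1000 mount/unmount cycles of /mnt/1, in seconds. */
	report("NFSv2 mount/unmount cycle 113 s -> 53 s", 113, 53);
	report("NFSv3 mount/unmount cycle 121 s -> 53 s", 121, 53);
	report("NFSv4 mount/unmount cycle 122 s -> 58 s", 122, 58);

	/* NFSv4 mirror mount test, unmount time in seconds. */
	report("mirror mounts /mnt/1 111 s -> 13 s", 111, 13);
	report("mirror mounts /mnt/2 73 s -> 9 s", 73, 9);
	report("mirror mounts /mnt/3 25 s -> 3 s", 25, 3);
}
```

Calling report_all() from a main() prints a bucket ratio of 64 for the NFSv2/NFSv3 unmount functions, which is the same order of magnitude as the "about 50 times" estimate, and a ratio slightly above 2 for the mount/unmount cycles.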

#4

Updated by Marcel Telka over 1 year ago

  • Status changed from In Progress to Pending RTI

#5

Updated by Electric Monk over 1 year ago

  • Status changed from Pending RTI to Closed
  • % Done changed from 0 to 100

git commit e010bda94b034e413b6fe35fd45bca0afaf1a0df

commit  e010bda94b034e413b6fe35fd45bca0afaf1a0df
Author: Marcel Telka <marcel@telka.sk>
Date:   2018-07-19T00:10:13.000Z

    9447 NFS unmount is slow
    Reviewed by: Toomas Soome <tsoome@me.com>
    Reviewed by: Arne Jansen <arne@die-jansens.de>
    Reviewed by: Ken Mays <kmays2000@gmail.com>
    Reviewed by: Evan Layton <evan.layton@nexenta.com>
    Approved by: Dan McDonald <danmcd@joyent.com>
