Bug #4191


share_nfs(1m) charset handling is unreliable

Added by Marcel Telka almost 9 years ago.

In Progress
nfs - NFS server and client


There are various bugs related to share_nfs(1m) charset handling and READDIR/READDIRPLUS implementation for NFSv2/3/4.

Issue 1: The READDIRPLUS3 reply might return every filename with a zero byte as its first character.

Run the following on the NFS server (the IP address in the share option is NOT the IP address of the NFS client):

# share -o "iso8859-15=@" /export/
# ls -la /export/
total 6
drwxr-xr-x  4 root sys   4 2013-08-19 10:22 .
drwxr-xr-x 24 root root 26 2013-08-16 20:09 ..
drwxrwxrwx  2 root root  2 2013-08-18 07:32 a
drwxr-xr-x  3 root root  3 2012-12-06 14:34 home

And do this on the NFS client:

# mount -o vers=3 /mnt
# ls -la /mnt
total 6
drwxr-xr-x 4 root sys 4 2013-08-19 10:22 
drwxr-xr-x 4 root sys 4 2013-08-19 10:22 
drwxr-xr-x 4 root sys 4 2013-08-19 10:22 
drwxr-xr-x 4 root sys 4 2013-08-19 10:22 
# cd /mnt
# ls -la
ls: cannot access : No such file or directory
ls: cannot access : No such file or directory
ls: cannot access : No such file or directory
ls: cannot access : No such file or directory
total 0
?????????? ? ? ? ?                ? 
?????????? ? ? ? ?                ? 
?????????? ? ? ? ?                ? 
?????????? ? ? ? ?                ? 

The issue can be reproduced in several ways. For example:

  1. When the IP address of the NFS client IS NOT specified in the share options (see above).
  2. When the IP address of the NFS client IS covered by the charset option, but the IP address is not resolvable (i.e. "getent hosts IP" returns nothing).

Both cases demonstrate that the charset handling (see share_nfs(1m) for details) in the NFS server is unreliable.

The issue affects filename character set conversion in both directions: inbound and outbound.

Issue 2: Page fault at nfscmd_dropped_entrysize+0x1e()

A customer faced this panic:

> ::status
debugging crash dump vmcore.0 (64-bit) from neptune1
operating system: 5.11 NexentaOS_134f (i86pc)
panic message: BAD TRAP: type=e (#pf Page fault) rp=ffffff00f5f23380 addr=ffffff2279a38ad6
dump content: kernel pages and pages from PID -1
> ::stack
nfscmd_dropped_entrysize+0x1e(ffffff2279a2a000, 3e, 1)
rfs3_readdirplus+0x6df(ffffff00f5f23730, ffffff00f5f23820, ffffff226ce93040, ffffff00f5f23a60, ffffff2259f55200)
common_dispatch+0x48b(ffffff00f5f23a60, ffffff2257ee7200, 2, 4, fffffffff83e6cd8, ffffffffc00ea480)
rfs_dispatch+0x2d(ffffff00f5f23a60, ffffff2257ee7200)
svc_getreq+0x19c(ffffff2257ee7200, ffffff21ff55fb20)
nfssys+0x765(e, febb0fbc)
dtrace_systrace_syscall32+0x11a(e, febb0fbc, 0, 1c3, 1, 0)

nfscmd_dropped_entrysize() is called with its 2nd and 3rd parameters (drop and nents) swapped. This is what causes the customer's panics and hangs.
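To illustrate why a swapped (drop, nents) pair is dangerous, here is a minimal userland sketch. It is not the kernel function: tail_size(), rec_len() and the record layout (a 16-bit length at the start of each packed record) are assumptions made for this example only. The sketch relies on the invariant drop <= nents, which a swapped pair violates whenever the two values differ, and it bounds-checks every step of the walk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read the 16-bit length stored at the start of the record at `off`. */
int
rec_len(const unsigned char *buf, size_t buflen, size_t off, uint16_t *len)
{
	if (off > buflen || buflen - off < sizeof (*len))
		return (-1);
	memcpy(len, buf + off, sizeof (*len));
	if (*len < sizeof (*len) || *len > buflen - off)
		return (-1);
	return (0);
}

/*
 * Hypothetical model: sum the sizes of the last `drop` records out of
 * `nents` packed records.  Returns -1 for inconsistent arguments or when
 * the walk would leave the buffer, instead of reading past its end.
 */
long
tail_size(const unsigned char *buf, size_t buflen, int drop, int nents)
{
	size_t off = 0;
	long size = 0;
	uint16_t len;
	int i;

	assert(buf != NULL);
	if (drop < 0 || nents < 0 || drop > nents)
		return (-1);	/* catches a swapped (drop, nents) pair */

	/* Skip the records that are kept. */
	for (i = nents - drop; i > 0; i--) {
		if (rec_len(buf, buflen, off, &len) != 0)
			return (-1);
		off += len;
	}
	/* Add up the records counted as dropped. */
	for (i = 0; i < drop; i++) {
		if (rec_len(buf, buflen, off, &len) != 0)
			return (-1);
		size += len;
		off += len;
	}
	return (size);
}
```

With three records of 8, 16 and 24 bytes, tail_size(buf, 48, 1, 3) yields 24, while the swapped call tail_size(buf, 48, 3, 1) is rejected rather than walked.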

The issue can be reproduced and tested using these steps:

  1. Create a directory entry with an invalid UTF-8 name, for example: mkdir /export/`echo -e '\0250'`
  2. Share the directory with character conversion requested for your NFS client, for example: share -o iso8859-15=@ /export
  3. Mount the directory from the NFS client using NFSv3: mount -o vers=3 SERVER:/export /mnt
  4. Issue the READDIRPLUS NFS operation from the client: ls /mnt

This should either panic the server or hang it (one CPU will spin in nfscmd_dropped_entrysize()).
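The name created in step 1 is the single byte with octal value 250 (0xA8). That bit pattern, 10101000, is a UTF-8 continuation byte, so it can never start a character on its own. The following userland sketch of a UTF-8 validity check shows this; utf8_valid() is a name chosen for this example and is not part of the NFS server code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if the `len` bytes at `s` form valid UTF-8.  Rejects stray
 * continuation bytes, truncated sequences, overlong encodings, surrogates
 * and code points above U+10FFFF.
 */
bool
utf8_valid(const unsigned char *s, size_t len)
{
	size_t i = 0;

	assert(s != NULL || len == 0);
	while (i < len) {
		unsigned char c = s[i];
		unsigned long cp;
		size_t need;

		if (c < 0x80) {			/* plain ASCII */
			i++;
			continue;
		} else if ((c & 0xE0) == 0xC0) {
			need = 1; cp = c & 0x1F;
		} else if ((c & 0xF0) == 0xE0) {
			need = 2; cp = c & 0x0F;
		} else if ((c & 0xF8) == 0xF0) {
			need = 3; cp = c & 0x07;
		} else {
			return (false);		/* 0x80..0xBF or 0xF8..0xFF */
		}

		if (len - i - 1 < need)
			return (false);		/* truncated sequence */
		for (size_t k = 1; k <= need; k++) {
			if ((s[i + k] & 0xC0) != 0x80)
				return (false);	/* not a continuation byte */
			cp = (cp << 6) | (s[i + k] & 0x3F);
		}
		if ((need == 1 && cp < 0x80) || (need == 2 && cp < 0x800) ||
		    (need == 3 && cp < 0x10000))
			return (false);		/* overlong encoding */
		if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
			return (false);
		i += need + 1;
	}
	return (true);
}
```

Calling utf8_valid() on the one-byte name "\250" returns false, while an ASCII name such as "home" passes.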

Issue 3: nfscmd_convdirplus() could return a negative value

nfscmd_convdirplus() could return a negative value as the number of dropped entries. The return value is calculated incorrectly on the last line of the function. The current expression (nents - (i + skipped)) should be replaced by the correct one, (nents - i + skipped).
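The difference between the two expressions is easy to see with concrete numbers. The sketch below only evaluates the two formulas quoted above; the function names are made up, and the reading of the variables (i as the number of entries the loop walked, skipped as the number of entries dropped along the way) is an assumption, since the report does not define them.

```c
#include <assert.h>

/* The expression the report quotes as the current one. */
int
dropped_current(int nents, int i, int skipped)
{
	assert(nents >= 0 && i >= 0 && skipped >= 0);
	return (nents - (i + skipped));
}

/* The expression the report proposes as the replacement. */
int
dropped_proposed(int nents, int i, int skipped)
{
	assert(nents >= 0 && i >= 0 && skipped >= 0);
	return (nents - i + skipped);
}
```

For nents = 4, i = 4 and skipped = 1, the current expression evaluates to -1 and the proposed one to 1.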

Issue 4: The calculation of the space saved by dropped entries in nfscmd_dropped_entrysize() is completely broken.

nfscmd_convdirplus() can drop any entry: from the beginning of the directory, from the middle, from the end, or any combination of these. nfscmd_dropped_entrysize() calculates the "dropped" size from the last entries in the directory only. In other words, it assumes that nfscmd_convdirplus() drops entries only from the end of the directory.
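A small model makes the discrepancy concrete. In this sketch the record lengths live in a plain array and a per-entry flag marks which entries were dropped; both functions and the flag array are hypothetical and only contrast a tail-only estimate with a position-aware sum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Tail-only estimate: assumes the last `drop` of `nents` entries went away. */
size_t
dropped_size_tail(const size_t *reclen, int nents, int drop)
{
	size_t size = 0;

	assert(drop >= 0 && drop <= nents);
	for (int i = nents - drop; i < nents; i++)
		size += reclen[i];
	return (size);
}

/* Position-aware sum: counts only the entries actually flagged as dropped. */
size_t
dropped_size_exact(const size_t *reclen, const bool *dropped, int nents)
{
	size_t size = 0;

	assert(nents >= 0);
	for (int i = 0; i < nents; i++) {
		if (dropped[i])
			size += reclen[i];
	}
	return (size);
}
```

With record lengths {24, 32, 40} and only the first entry dropped, the tail-only estimate reports 40 bytes while the exact sum is 24.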

Issue 5: The RPC response for READDIR is not encoded properly (NFSv4)

When the reproduction steps for Issue 2 are tried with NFSv4 (-o vers=4), the NFS server will improperly encode the response:

# ls -la
NFS compound failed for server t1: error 2 (RPC: Can't decode result)
ls: reading directory .: I/O error
total 0

Issue 6: NFSv2/3/4: READDIR responses are inconsistent when charset conversion fails

When the character conversion of a filename fails in the NFSv4 READDIR operation, the file is always skipped (so it is not visible to the NFS client).

On the other hand, in the implementation of the NFSv2 READDIR and NFSv3 READDIR/READDIRPLUS operations there are cases where a filename whose charset conversion failed is returned either unmodified or truncated.

As a result, an NFS client might see different directory contents depending on the NFS version used.

Issue 7: rfs3_readdir(): Issues related to nfscmd_convdirent()

There are two issues related to the nfscmd_convdirent() usage in rfs3_readdir():

  1. The data passed to nfscmd_convdirent() is leaked when the charset conversion is successful.
  2. Only one (the first) entry in the directory is converted, despite the fact that the READDIR operation might return more than one entry.

Fortunately, it seems the READDIR operation is not widely used (the READDIRPLUS is usually preferred).
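Both points come down to a loop-and-ownership pattern, sketched here in userland C. This is an analogy, not the contract of nfscmd_convdirent(): convert_name() is a hypothetical converter that upper-cases ASCII as a stand-in for a charset conversion and returns either a new buffer or the input pointer, and live_allocs is a counter added only so the example can check for leaks.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

int live_allocs;	/* outstanding allocations, for leak checking */

char *
name_alloc(const char *src)
{
	char *p = malloc(strlen(src) + 1);

	assert(p != NULL);
	strcpy(p, src);
	live_allocs++;
	return (p);
}

void
name_free(char *p)
{
	if (p != NULL) {
		free(p);
		live_allocs--;
	}
}

/* Hypothetical converter: new buffer if the name changes, else the input. */
char *
convert_name(char *name)
{
	bool changed = false;
	char *out, *q;

	for (q = name; *q != '\0'; q++) {
		if (islower((unsigned char)*q))
			changed = true;
	}
	if (!changed)
		return (name);
	out = name_alloc(name);
	for (q = out; *q != '\0'; q++)
		*q = (char)toupper((unsigned char)*q);
	return (out);
}

/* Convert every entry, and release each buffer that gets replaced. */
int
convert_all(char **names, int n)
{
	int converted = 0;

	for (int i = 0; i < n; i++) {
		char *out = convert_name(names[i]);

		if (out != names[i]) {
			name_free(names[i]);	/* old buffer would leak otherwise */
			names[i] = out;
			converted++;
		}
	}
	return (converted);
}
```

The loop visits all n entries rather than only the first, and the number of live allocations stays equal to n after conversion.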

Issue 8: Uninitialized data in dtrace probes

When rfs3_readdir() or rfs3_readdirplus() fails and jumps to the out1 label, resp->resfail.dir_attributes is set only after the dtrace probe has fired, so the probe consumer sees an uninitialized resp structure.
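The ordering that avoids this can be sketched with a toy model: fill in every field of the reply first, then fire the probe. Everything here is hypothetical (struct reply_model, fail_reply(), and a function pointer standing in for the dtrace probe); it only shows the order of operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reply structure; the real resp layout is not modeled. */
struct reply_model {
	int status;
	bool dir_attributes_set;
};

typedef void (*probe_fn)(const struct reply_model *);

/* What the most recent probe firing observed. */
struct reply_model probe_saw;

/* Stand-in for a probe consumer: records a snapshot of the reply. */
void
record_probe(const struct reply_model *resp)
{
	probe_saw = *resp;
}

/* Failure path: initialize the whole reply, and fire the probe last. */
void
fail_reply(struct reply_model *resp, int status, probe_fn probe)
{
	assert(resp != NULL && probe != NULL);
	resp->status = status;
	resp->dir_attributes_set = true;
	probe(resp);
}
```

After fail_reply(&r, 5, record_probe), the snapshot in probe_saw carries both the status and the initialized attributes flag.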

Issue 9: Possible null pointer dereference in rfs3_readdirplus()

nvap could be NULL at line 3611:

3607        nvap = rfs4_delegated_getattr(nvp, &nva, 0, cr) ? NULL : &nva;
3609        /* Lie about the object type for a referral */
3610        if (vn_is_nfs_reparse(nvp, cr))
3611            nvap->va_type = VLNK;
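One way to avoid the dereference is to test the pointer before writing through it. The sketch below is a userland model with made-up types and names (struct vattr_model, attrs_or_null(), mark_referral()); it mirrors the shape of the excerpt above, not the actual kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the vnode type and attribute structure. */
enum vtype_model { VREG_MODEL, VLNK_MODEL };

struct vattr_model {
	enum vtype_model va_type;
};

/* Mirrors the "error ? NULL : &nva" pattern from the excerpt. */
struct vattr_model *
attrs_or_null(int getattr_error, struct vattr_model *nva)
{
	assert(nva != NULL);
	return (getattr_error ? NULL : nva);
}

/*
 * Rewrite the type for a referral only when attributes are available.
 * Returns true if the type was changed.
 */
bool
mark_referral(struct vattr_model *nvap, bool is_reparse)
{
	if (nvap != NULL && is_reparse) {
		nvap->va_type = VLNK_MODEL;
		return (true);
	}
	return (false);
}
```

When the attribute lookup fails, mark_referral(NULL, true) simply returns false instead of faulting.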
