Bug #3485
statd is not handling incoming SM_NOTIFY properly when the hostname is not resolvable
% Done: 90%
Description
Steps to reproduce:
Prerequisites:
- an illumos-based NFS server named SERVER
- an NFS client with hostname CLIENT. The client's hostname CLIENT must not be resolvable on SERVER.
Steps:
- On the NFS server, share a directory and make it writable by everybody. Let's say the shared directory is /DIR.
- On the NFS client mount the shared directory from the SERVER:
mount -o vers=3 SERVER:/DIR /mnt
- On the client run an application that will lock a file. I use the locker testing application (see below for the source):
./locker /mnt/a
- At the server make sure the lock is held:
echo ::nlm_lockson | mdb -k
- Turn off the NFS client (a normal reboot is not enough!).
- Boot the NFS client back up.
- At the NFS server check whether there is still the lock held by the NFS client:
echo ::nlm_lockson | mdb -k
Issue: The lock is still held at the NFS server.
Expected results: The lock should no longer be there.
/*
 * locker.c
 */

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int
main(int argc, char *argv[])
{
	int f;

	if (argc != 2) {
		printf("missing filename\n");
		return 1;
	}

	f = open(argv[1], O_CREAT | O_RDWR, 0777);
	if (f < 0) {
		printf("open failed\n");
		return 1;
	}

	if (lockf(f, F_LOCK, 0) != 0) {
		printf("lockf failed\n");
		return 1;
	}

	for (;;)
		sleep(10);

	return 0;
}
Updated by Marcel Telka over 10 years ago
- Status changed from New to In Progress
Updated by Marcel Telka over 10 years ago
Root cause:
When the client's hostname is not resolvable at the NFS server and the client's statd sends SM_NOTIFY to the server, the server's statd compares the received hostname with the monitored hostnames in send_notice() using hostname_eq().
In my testing the client's hostname was 'centos-6' (not resolvable at the server). This is what happened (9004 is the PID of statd):
# dtrace -p 9004 -n '
    pid$target:a.out:hostname_eq:entry {trace(copyinstr(arg0)); trace(copyinstr(arg1))}
    pid$target:a.out:hostname_eq:return {trace(arg1)}
    pid$target:a.out:get_system_id:entry {trace(copyinstr(arg0))}
    pid$target:a.out:get_system_id:return {trace(arg1)}'
dtrace: description 'pid$target:a.out:hostname_eq:entry ' matched 4 probes
CPU     ID                    FUNCTION:NAME
  0  74713          hostname_eq:entry    centos-6    centos-6
  0  74715        get_system_id:entry    centos-6
  0  74716       get_system_id:return    0
  0  74715        get_system_id:entry    centos-6
  0  74716       get_system_id:return    0
  0  74714         hostname_eq:return    0
So hostname_eq() got two equal strings (centos-6), but the return value was zero. This is because hostname_eq() tries to translate each hostname to a universal identifier. It does so by translating the hostnames to network addresses in the function get_system_id(). Since the hostname is not resolvable, get_system_id() failed and returned NULL for both arguments, so hostname_eq() reported them as different.
hostname_eq() needs to be made more robust: it must be able to decide that two identical strings really are the same host, even when neither can be resolved.
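The root cause and the proposed direction can be illustrated with a small sketch. This is not the committed change and not statd's actual code: unresolvable_system_id() is a hypothetical stand-in for get_system_id() that always fails, the way the real lookup did for 'centos-6', and the two hostname_eq_* functions are illustrative names. The sketch assumes, as the trace suggests, that a return value of 0 means "not equal".

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for get_system_id(): always returns NULL,
 * simulating a hostname that cannot be resolved.
 */
static char *
unresolvable_system_id(const char *hostname)
{
	(void) hostname;
	return (NULL);
}

/*
 * Behaviour as described in the root cause: equality is decided
 * only by the resolved identifiers, so a failed lookup on either
 * side means "not equal", even for identical strings.
 */
int
hostname_eq_fragile(const char *host1, const char *host2)
{
	char *id1 = unresolvable_system_id(host1);
	char *id2 = unresolvable_system_id(host2);
	int rv;

	rv = (id1 != NULL && id2 != NULL && strcmp(id1, id2) == 0);
	free(id1);	/* free(NULL) is a no-op */
	free(id2);
	return (rv);
}

/*
 * More robust variant: compare the strings directly first and
 * fall back to the identifier lookup only when they differ.
 */
int
hostname_eq_robust(const char *host1, const char *host2)
{
	if (strcmp(host1, host2) == 0)
		return (1);
	return (hostname_eq_fragile(host1, host2));
}
```

Here hostname_eq_fragile("centos-6", "centos-6") returns 0, mirroring the traced behaviour, while hostname_eq_robust("centos-6", "centos-6") returns 1 without needing any name resolution.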
Updated by Marcel Telka over 10 years ago
- Status changed from In Progress to Pending RTI
- % Done changed from 0 to 90
Updated by Christopher Siden over 10 years ago
commit b17f03d7d89b75b69b9b7db22f2316b700e3a5a8
Author: Marcel Telka <marcel.telka@nexenta.com>
Date:   Fri Jan 25 10:08:50 2013

    3485 statd is not handling incoming SM_NOTIFY properly when the hostname is not resolvable
    Reviewed by: Gordon Ross <Gordon.Ross@nexenta.com>
    Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
    Reviewed by: Dan McDonald <danmcd@nexenta.com>
    Reviewed by: Albert Lee <trisk@nexenta.com>
    Reviewed by: Richard Lowe <richlowe@richlowe.net>
    Reviewed by: Jeremy Jones <jeremy@delphix.com>
    Approved by: Christopher Siden <csiden@delphix.com>
Updated by Christopher Siden over 10 years ago
- Status changed from Pending RTI to Resolved