4518 lockd: Cannot establish NLM service over <file desc. 9, protocol udp>
Review Request #13 — Created March 27, 2015 and submitted — Latest diff uploaded
Information | |
---|---|
marcel | |
illumos-gate | |
4518 | |
Reviewers | |
general | |
webrev: http://cr.illumos.org/~webrev/marcel/il-4518-lockd/
The fix:
- removes the cond_wait() call in the sm_simu_crash_svc() function (file sm_proc.c) to avoid the EIO error,
- implements the daemonize_init()/daemonize_fini() call sequence in the main() function (file sm_svc.c) to help with the ENOENT error,
- moves the merge_hosts() and merge_ips() calls from main() to a separate asynchronous thread (named thr_statd_merges()) and makes the statd_call_statd() function (in sm_statd.c) wait until thr_statd_merges() has completed. This fixes the ENOENT error.
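For readers who have not opened the webrev, the third item (run the slow merges in their own thread and make callers wait for completion) can be sketched roughly as follows. This is only an illustration of the general pattern, written with POSIX threads; it is not the code from the change. All names here (merges_thread, start_merges, wait_for_merges, merges_result, slow_merges) are hypothetical, and slow_merges() is a stand-in for the real merge_hosts()/merge_ips() work.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical names throughout; this is not the code from the webrev. */
static pthread_mutex_t merges_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t merges_cv = PTHREAD_COND_INITIALIZER;
static bool merges_done = false;
static int merged_entries = 0;

/* Stand-in for the potentially slow merge work (e.g. name lookups). */
static void
slow_merges(void)
{
    merged_entries = 2;
}

/* Worker: do the slow work, then publish completion under the lock. */
static void *
merges_thread(void *arg)
{
    (void) arg;
    slow_merges();

    pthread_mutex_lock(&merges_lock);
    merges_done = true;
    pthread_cond_broadcast(&merges_cv);
    pthread_mutex_unlock(&merges_lock);
    return (NULL);
}

/* Called from startup code: returns at once, the work runs in background. */
int
start_merges(void)
{
    pthread_t tid;
    int err = pthread_create(&tid, NULL, merges_thread, NULL);

    if (err == 0)
        (void) pthread_detach(tid);
    return (err);
}

/* Called by any code path that needs the merged state: blocks until done. */
void
wait_for_merges(void)
{
    pthread_mutex_lock(&merges_lock);
    while (!merges_done)
        pthread_cond_wait(&merges_cv, &merges_lock);
    pthread_mutex_unlock(&merges_lock);
}

/* Example consumer: waits first, then reads the result of the merges. */
int
merges_result(void)
{
    wait_for_merges();
    return (merged_entries);
}
```

With this shape, startup code calls start_merges() and carries on immediately, so the daemon can start serving requests while the merges run; only the code paths that depend on the merged state call wait_for_merges() and block until the worker has finished.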
More technical details (if anyone is interested) are in my March comments in #4518.
Note: 10.0.0.99 is the IP address of a nonexistent host.

    Without the fix:
    ----------------
    # svcadm disable nfs/nlockmgr
    # svcadm disable nfs/status
    # ln -s test /var/statmon/sm.bak/ipv4.10.0.0.99
    # svcadm enable nfs/status ; svcadm enable nfs/nlockmgr
    # sleep 70
    # svcs -xv
    svc:/network/nfs/nlockmgr:default (NFS lock manager)
     State: maintenance since Wed Mar 25 16:45:58 2015
    Reason: Start method failed repeatedly, last exited with status 1.
       See: http://illumos.org/msg/SMF-8000-KS
       See: man -M /usr/share/man -s 1M lockd
       See: /var/svc/log/network-nfs-nlockmgr:default.log
    Impact: This service is not running.
    # grep "NLM service" /var/adm/messages | tail -1
    Mar 25 16:45:58 openindiana /usr/lib/nfs/lockd[7174]: [ID 491006 daemon.error] Cannot establish NLM service over <file desc. 9, protocol udp> : I/O error. Exiting
    #
    # svcadm disable nfs/nlockmgr
    # svcadm disable nfs/status
    # gsed -i -e 's/^nameserver.*$/nameserver 10.0.0.99/' /etc/resolv.conf
    # svcadm enable nfs/status ; svcadm enable nfs/nlockmgr
    # sleep 70
    # svcs -xv
    svc:/network/nfs/nlockmgr:default (NFS lock manager)
     State: maintenance since Wed Mar 25 16:55:03 2015
    Reason: Start method failed repeatedly, last exited with status 1.
       See: http://illumos.org/msg/SMF-8000-KS
       See: man -M /usr/share/man -s 1M lockd
       See: /var/svc/log/network-nfs-nlockmgr:default.log
    Impact: This service is not running.
    # grep "NLM service" /var/adm/messages | tail -1
    Mar 25 16:55:03 openindiana /usr/lib/nfs/lockd[7233]: [ID 491006 daemon.error] Cannot establish NLM service over <file desc. 9, protocol udp> : No such file or directory. Exiting
    #

    With the fix:
    -------------
    # svcadm disable nfs/nlockmgr
    # svcadm disable nfs/status
    # ln -s test /var/statmon/sm.bak/ipv4.10.0.0.99
    # svcadm enable nfs/status ; svcadm enable nfs/nlockmgr
    # sleep 70
    # svcs -xv
    # grep "NLM service" /var/adm/messages | tail -1
    #
    # svcadm disable nfs/nlockmgr
    # svcadm disable nfs/status
    # gsed -i -e 's/^nameserver.*$/nameserver 10.0.0.99/' /etc/resolv.conf
    # svcadm enable nfs/status ; svcadm enable nfs/nlockmgr
    # sleep 70
    # svcs -xv
    # grep "NLM service" /var/adm/messages | tail -1
    #