4518 lockd: Cannot establish NLM service over <file desc. 9, protocol udp>

Review Request #13 - Created March 27, 2015 and submitted

Information
  Submitter: Marcel Telka
  Repository: illumos-gate
  Bugs: 4518
  Reviewers (groups): general

webrev: http://cr.illumos.org/~webrev/marcel/il-4518-lockd/

The fix:

  • removes the cond_wait() call from the sm_simu_crash_svc() function (file
    sm_proc.c) to avoid the EIO error,

  • implements the daemonize_init()/daemonize_fini() call sequence in the main()
    function (file sm_svc.c) to help with the ENOENT error,

  • moves the merge_hosts() and merge_ips() calls from main() to a separate
    asynchronous thread (thr_statd_merges()) and makes the statd_call_statd()
    function (in sm_statd.c) wait until thr_statd_merges() has completed.
    This fixes the ENOENT error.
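
The pattern in the last item (start slow work in a background thread, and make the one caller that depends on it wait for completion) can be sketched as follows. This is not the code from the webrev: every name here (merge_worker, start_merges, wait_for_merges, run_merge_demo, and the shared variables) is hypothetical, the merge work is replaced by a trivial counter, and POSIX threads are used for portability, which may differ from the primitives the statd sources use.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical state shared by the startup path and the worker thread. */
static pthread_mutex_t merges_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t merges_cv = PTHREAD_COND_INITIALIZER;
static bool merges_done = false;
static int merged_entries = 0;

/* Stand-in for the slow merge work; counts n placeholder entries. */
static void *
merge_worker(void *arg)
{
    int n = *(int *)arg;
    int count = 0;
    int i;

    for (i = 0; i < n; i++)
        count++;

    /* Publish the result and wake every waiter. */
    pthread_mutex_lock(&merges_lock);
    merged_entries = count;
    merges_done = true;
    pthread_cond_broadcast(&merges_cv);
    pthread_mutex_unlock(&merges_lock);
    return (NULL);
}

/* Start the merges without blocking the caller; returns 0 on success. */
int
start_merges(pthread_t *tid, int *n)
{
    pthread_mutex_lock(&merges_lock);
    merges_done = false;
    merged_entries = 0;
    pthread_mutex_unlock(&merges_lock);

    return (pthread_create(tid, NULL, merge_worker, n));
}

/* Block until the worker has finished, then return the merged count. */
int
wait_for_merges(void)
{
    int result;

    pthread_mutex_lock(&merges_lock);
    while (!merges_done)    /* loop guards against spurious wakeups */
        pthread_cond_wait(&merges_cv, &merges_lock);
    assert(merges_done);
    result = merged_entries;
    pthread_mutex_unlock(&merges_lock);
    return (result);
}

/* Startup continues right after start_merges(); only this path waits. */
int
run_merge_demo(int n)
{
    pthread_t tid;
    int merged;

    if (start_merges(&tid, &n) != 0)
        return (-1);
    merged = wait_for_merges();
    (void) pthread_join(tid, NULL);
    return (merged);
}
```

In this sketch the startup path returns from start_merges() immediately, so service registration is not held up by slow lookups; only the code that needs the merged data blocks in wait_for_merges().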

More technical details (if anyone is interested) are in my March comments in #4518.

Note: 10.0.0.99 is the IP address of a nonexistent host.

Without the fix:
----------------

# svcadm disable nfs/nlockmgr
# svcadm disable nfs/status
# ln -s test /var/statmon/sm.bak/ipv4.10.0.0.99
# svcadm enable nfs/status ; svcadm enable nfs/nlockmgr
# sleep 70
# svcs -xv
svc:/network/nfs/nlockmgr:default (NFS lock manager)
 State: maintenance since Wed Mar 25 16:45:58 2015
Reason: Start method failed repeatedly, last exited with status 1.
   See: http://illumos.org/msg/SMF-8000-KS
   See: man -M /usr/share/man -s 1M lockd
   See: /var/svc/log/network-nfs-nlockmgr:default.log
Impact: This service is not running.
# grep "NLM service" /var/adm/messages | tail -1
Mar 25 16:45:58 openindiana /usr/lib/nfs/lockd[7174]: [ID 491006 daemon.error] Cannot establish NLM service over <file desc. 9, protocol udp> : I/O error. Exiting
#

# svcadm disable nfs/nlockmgr
# svcadm disable nfs/status
# gsed -i -e 's/^nameserver.*$/nameserver 10.0.0.99/' /etc/resolv.conf
# svcadm enable nfs/status ; svcadm enable nfs/nlockmgr
# sleep 70
# svcs -xv
svc:/network/nfs/nlockmgr:default (NFS lock manager)
 State: maintenance since Wed Mar 25 16:55:03 2015
Reason: Start method failed repeatedly, last exited with status 1.
   See: http://illumos.org/msg/SMF-8000-KS
   See: man -M /usr/share/man -s 1M lockd
   See: /var/svc/log/network-nfs-nlockmgr:default.log
Impact: This service is not running.
# grep "NLM service" /var/adm/messages | tail -1
Mar 25 16:55:03 openindiana /usr/lib/nfs/lockd[7233]: [ID 491006 daemon.error] Cannot establish NLM service over <file desc. 9, protocol udp> : No such file or directory. Exiting
#


With the fix:
-------------

# svcadm disable nfs/nlockmgr
# svcadm disable nfs/status
# ln -s test /var/statmon/sm.bak/ipv4.10.0.0.99
# svcadm enable nfs/status ; svcadm enable nfs/nlockmgr
# sleep 70
# svcs -xv
# grep "NLM service" /var/adm/messages | tail -1
#

# svcadm disable nfs/nlockmgr
# svcadm disable nfs/status
# gsed -i -e 's/^nameserver.*$/nameserver 10.0.0.99/' /etc/resolv.conf
# svcadm enable nfs/status ; svcadm enable nfs/nlockmgr
# sleep 70
# svcs -xv
# grep "NLM service" /var/adm/messages | tail -1
#
Review request changed

Status: Closed (submitted)

Change Summary:

commit 98573c1925f3692d1e8ea9eb018cb915fc0becc5
Author:     Marcel Telka <marcel.telka@nexenta.com>
AuthorDate: Fri Mar 27 15:18:01 2015 +0100
Commit:     Dan McDonald <danmcd@omniti.com>
CommitDate: Sat Mar 28 16:12:41 2015 -0400

    4518 lockd: Cannot establish NLM service over <file desc. 9, protocol udp>
    Reviewed by: Gordon Ross <gordon.ross@nexenta.com>
    Reviewed by: Dan McDonald <danmcd@omniti.com>
    Approved by: Dan McDonald <danmcd@omniti.com>

:100644 100644 1e55770... 8a74b88... M  usr/src/cmd/fs.d/nfs/statd/Makefile
:100644 100644 22592eb... 00d8d59... M  usr/src/cmd/fs.d/nfs/statd/sm_proc.c
:100644 100644 420dd68... e7805e7... M  usr/src/cmd/fs.d/nfs/statd/sm_statd.c
:100644 100644 e1a5974... aa1368b... M  usr/src/cmd/fs.d/nfs/statd/sm_statd.h
:100644 100644 1f657a0... 37ef788... M  usr/src/cmd/fs.d/nfs/statd/sm_svc.c