Bug #4610

lockd hangs

Added by Lauradel Collins over 6 years ago.

Status: New
Priority: Urgent
Assignee: -
Category: nfs - NFS server and client
Start date: 2014-02-17
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:

Description

We are having significant problems with lockd. It hangs several times a day and does not recover unless we kill and restart it. We are running the OpenIndiana hipster release ("SunOS cranium 5.11 illumos-59d8f10 i86pc i386 i86pc"). When it hangs, I can confirm it with `rpcinfo`:

root@cranium: ~ 201# rpcinfo -t cranium nlockmgr
rpcinfo: RPC: Timed out

The problems started in early December, when we had a system crash (for unknown reasons) and decided to update the OS before putting the machine back into production. We have had this lockd problem ever since. I have a truss of lockd, collected yesterday, that runs from when we started lockd until it became unresponsive and was killed. We have a kill script that checks every 5 minutes and kills lockd if it is unresponsive. The truss file is quite long (15000 lines), so I put it on my web server: <http://ix.cs.uoregon.edu/~paul/lockd/truss_lockd-20140213-17-53-24.txt>. The hang could have started at any time in the last 5 minutes of the truss, but lockd was definitely hung when it was killed.
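The kill script itself is not attached to this report. For anyone trying to reproduce the setup, a minimal sketch of such a watchdog might look like the following. It reuses the `rpcinfo -t <host> nlockmgr` probe shown above; the function names, the `HOST`, `RPCINFO` and `KILL_CMD` variables, and the use of `pkill -x lockd` as the kill action are assumptions for illustration, not the reporter's actual script.

```shell
#!/bin/sh
# Hypothetical lockd watchdog sketch (NOT the reporter's actual script).
# All variable and function names below are illustrative assumptions.

HOST="${HOST:-localhost}"                 # host to probe
RPCINFO="${RPCINFO:-rpcinfo}"             # probe command, overridable for testing
KILL_CMD="${KILL_CMD:-pkill -x lockd}"    # assumed kill action

# Succeeds (exit 0) if nlockmgr answers an RPC null call over TCP.
# rpcinfo applies its own timeout and exits non-zero on "RPC: Timed out".
lockd_responsive() {
    "$RPCINFO" -t "$HOST" nlockmgr >/dev/null 2>&1
}

# Probe once and run the kill action if lockd does not answer.
check_lockd() {
    if lockd_responsive; then
        echo "lockd ok"
    else
        echo "lockd unresponsive"
        $KILL_CMD    # intentionally unquoted so the command and its arguments split
    fi
}

# Only act when explicitly asked, so the file can also be sourced.
if [ "${1:-}" = "--run" ]; then
    check_lockd
fi
```

Run from cron every 5 minutes with the `--run` argument, this would approximate the behavior described above. How lockd is brought back after being killed depends on the local service configuration and is left out of the sketch.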

I also have an 8 GB crash dump taken while lockd was hung:

<https://ix.cs.uoregon.edu/~paul/lockd/>
