Bug #6007

open

assertion failed: TAILQ_EMPTY(&hostp->nh_vholds_list), file: ../../common/klm/nlm_impl.c, line: 1161

Added by Marcel Telka almost 7 years ago. Updated almost 7 years ago.

Status: New
Priority: Normal
Assignee: -
Category: nfs - NFS server and client
Start date: 2015-06-15
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:

Description

The following panic happened on a recent debug build (commit 0afb687):

> ::status
debugging crash dump vmcore.1 (64-bit) from t1
operating system: 5.11 il-nightly (i86pc)
image uuid: b15f6085-6ff2-65f8-e117-a048227c057b
panic message: assertion failed: TAILQ_EMPTY(&hostp->nh_vholds_list), file: ../../common/klm/nlm_impl.c, line: 1161
dump content: kernel pages only
> ::stack
vpanic()
0xfffffffffbdf2748()
nlm_host_destroy+0x4b(ffffff00ed6d9e78)
nlm_svc_stopping+0x13b(ffffff00dd953780)
lm_shutdown+0x7b()
nfssys+0x5bd(8, 0)
_sys_sysenter_post_swapgs+0x237()
>
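For readers unfamiliar with the macro, the failed assertion checks that a per-host tail queue (nh_vholds_list) is empty when the host is destroyed. The following is a minimal user-space sketch of that invariant using the BSD `<sys/queue.h>` macros. The struct layouts and the `host_*` helper names are hypothetical simplifications for illustration, not the code in nlm_impl.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for one entry on the per-host list. */
struct vhold {
	int id;
	TAILQ_ENTRY(vhold) link;
};

/* Hypothetical stand-in for the host; only the list head is modeled. */
struct host {
	TAILQ_HEAD(vhold_list, vhold) nh_vholds_list;
};

void
host_init(struct host *hostp)
{
	TAILQ_INIT(&hostp->nh_vholds_list);
}

void
host_add_vhold(struct host *hostp, struct vhold *vh)
{
	TAILQ_INSERT_TAIL(&hostp->nh_vholds_list, vh, link);
}

void
host_remove_vhold(struct host *hostp, struct vhold *vh)
{
	TAILQ_REMOVE(&hostp->nh_vholds_list, vh, link);
}

/*
 * Returns true when the list is empty, i.e. when the condition
 * checked by the failed assertion would hold.
 */
bool
host_can_destroy(const struct host *hostp)
{
	return (TAILQ_EMPTY(&hostp->nh_vholds_list));
}
```

In this model, the panic corresponds to destroying a host while `host_can_destroy()` would still return false, meaning at least one entry was never removed from the list.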

The crash dump is available here: http://telka.sk/illumos/6007/vmdump.1

Steps to reproduce

I can reproduce the panic almost every time. Sufficiently long delays between the steps appear to be required (waiting more than 2 minutes has reproduced it reliably), which suggests the problem might be related to the NLM garbage collector.

For testing I use these three machines:
t1 - the NFS server with the recent illumos debug build (commit 0afb687) and shared /export
t2 - the first NFS client mounting using NFSv3 (mount -o vers=3 t1:/export /mnt)
t3 - the second NFS client mounting using NFSv4 (mount -o vers=4 t1:/export /mnt)

For the test I use the attached lock.c program (gcc -Wall -o lock lock.c).
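
The attached lock.c is not reproduced here. As a rough idea of what such a helper might do, the sketch below takes a whole-file POSIX advisory lock with `fcntl(F_SETLKW)`, which blocks until the lock is granted. This is an assumption about the program's behavior based on the `R`/`W` arguments in the log; the function names `parse_lock_type` and `lock_whole_file` are hypothetical.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Map the command-line mode to a lock type; -1 for anything else. */
int
parse_lock_type(const char *arg)
{
	if (strcmp(arg, "R") == 0)
		return (F_RDLCK);	/* shared (read) lock */
	if (strcmp(arg, "W") == 0)
		return (F_WRLCK);	/* exclusive (write) lock */
	return (-1);
}

/*
 * Block until a lock of the given type covering the whole file is
 * acquired on fd. Returns 0 on success, -1 on error (errno is set).
 */
int
lock_whole_file(int fd, int type)
{
	struct flock fl;

	memset(&fl, 0, sizeof (fl));
	fl.l_type = (short)type;
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;	/* 0 means "through end of file" */

	return (fcntl(fd, F_SETLKW, &fl));
}
```

A main() built on these would presumably open the file with O_RDWR (a write lock needs a descriptor open for writing), call `lock_whole_file()`, and then `pause()` until interrupted. Exiting the process releases the lock, which matches the Ctrl+C steps in the log.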

Here is a log of what I did (format: HH:MM:SS@hostname):

17:19:12@t2 # ./lock R /mnt/file (t2 locked the file)
17:19:15@t3 # ./lock W /mnt/file (t3 blocked)
(wait about 2 minutes)
17:21:20@t2 # (terminate the lock command using Ctrl+C, wait until t3 acquires the lock)
17:22:47@t3 # (t3 acquired the lock)
17:22:59@t2 # ./lock R /mnt/file (t2 blocked)
(wait about 2 minutes)
17:25:15@t3 # (terminate the lock command using Ctrl+C, t2 acquired the lock immediately)
17:25:22@t3 # ./lock W /mnt/file (t3 blocked)
(wait about 2 minutes)
17:27:46@t2 # (terminate the lock command using Ctrl+C, wait until t3 acquires the lock)
17:29:00@t3 # (t3 acquired the lock)
17:29:07@t2 # ./lock R /mnt/file (t2 blocked)
(wait about 2 minutes)
17:31:14@t1 # svcadm restart nlockmgr (now the server should panic)


Files

lock.c (1.41 KB) Marcel Telka, 2015-08-13 11:15 AM
Actions #1

Updated by Marcel Telka almost 7 years ago

  • Description updated (diff)
Actions #2

Updated by Marcel Telka almost 7 years ago
