Bug #1588

nfs4 mirror mount hang

Added by Arne Jansen about 8 years ago. Updated almost 7 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: nfs - NFS server and client
Start date: 2011-10-01
Due date:
% Done: 100%
Estimated time: 2.00 h
Difficulty: Medium
Tags:

Description

The nfs4 mirror mount facility hangs when the process receives a signal during on-demand mounting. The problem is that nfs4_trigger_domount_args_create() does not check the return value of nfs4_trigger_ping_server() for EINTR, so it loops endlessly while a signal is pending.
Alternatively it could flag the call with nointr.
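To make the failure mode concrete, here is a minimal userland sketch of the two loop shapes. It is not the kernel code: ping_stub(), retry_unchecked(), retry_checked() and the ping_result enum are hypothetical names, and the max_tries cap exists only so the sketch terminates.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical ping outcomes for this sketch; not the kernel's own types. */
enum ping_result { PING_OK, PING_TIMEDOUT, PING_INTR };

/*
 * Stand-in for nfs4_trigger_ping_server(). While a signal is pending,
 * every ping comes back interrupted, whether or not the server is up.
 */
static enum ping_result
ping_stub(bool signal_pending, bool server_up)
{
	if (signal_pending)
		return (PING_INTR);
	return (server_up ? PING_OK : PING_TIMEDOUT);
}

/*
 * Retry loop that treats an interrupted ping like any other failure.
 * The max_tries cap exists only so that this sketch terminates; without
 * it the loop would spin for as long as the signal stays pending.
 * Returns the number of pings issued.
 */
static int
retry_unchecked(bool signal_pending, bool server_up, int max_tries)
{
	int tries = 0;

	while (tries < max_tries) {
		tries++;
		if (ping_stub(signal_pending, server_up) == PING_OK)
			break;
	}
	return (tries);
}

/*
 * Same loop, but an interrupted ping ends it with EINTR.
 * Returns 0 on success, EINTR on interrupt, ETIMEDOUT when the cap is hit.
 * *tries receives the number of pings issued.
 */
static int
retry_checked(bool signal_pending, bool server_up, int max_tries, int *tries)
{
	*tries = 0;
	while (*tries < max_tries) {
		(*tries)++;
		switch (ping_stub(signal_pending, server_up)) {
		case PING_OK:
			return (0);
		case PING_INTR:
			return (EINTR);
		default:
			break;	/* timed out: try again */
		}
	}
	return (ETIMEDOUT);
}
```

With a signal pending, retry_unchecked() uses up its whole retry budget, while retry_checked() gives up after the first ping and hands EINTR back to the caller.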


Files

History

#1

Updated by Arne Jansen about 8 years ago

A hotfix for oi_148:

echo 'nfs4_trigger_domount_args_create+0x34/v 0' | mdb -kw

This flags all calls to nfs4_trigger_ping_server as nointr.

#2

Updated by Albert Lee about 8 years ago

  • Category set to nfs - NFS server and client
  • Difficulty changed from Expert to Medium
#3

Updated by Simon Klinkert almost 8 years ago

I implemented better error handling in nfs4_stub_vnops.c (see attachment). The function nfs4_trigger_domount_args_create() now returns EINTR if nfs4_trigger_ping_server() reports RPC_INTR, EINVAL for any other error, and zero if everything is fine. I tested this patch on OpenIndiana 151a and it works for me.

It's a git patch file, but I think it's still useful for you.
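The error mapping described above could look roughly like the following userland sketch. This is not the attached patch: ping_status_to_errno() and the sketch_ping_status enum are hypothetical names, and reading the values 5 and 18 (taken from the dtrace output in note #4) as "timed out" and "interrupted" is an assumption.

```c
#include <errno.h>

/*
 * Illustrative status values. 5 and 18 match the return values visible
 * in the dtrace output in note #4; their meaning here is an assumption.
 */
enum sketch_ping_status {
	SKETCH_PING_SUCCESS = 0,
	SKETCH_PING_TIMEDOUT = 5,
	SKETCH_PING_INTR = 18
};

/*
 * Map a ping status to the errno value handed back to the caller:
 * zero on success, EINTR for an interrupted ping, EINVAL otherwise.
 */
static int
ping_status_to_errno(enum sketch_ping_status st)
{
	switch (st) {
	case SKETCH_PING_SUCCESS:
		return (0);
	case SKETCH_PING_INTR:
		return (EINTR);
	default:
		return (EINVAL);
	}
}
```

Keeping EINTR separate from the generic EINVAL case is what lets the caller abort the mount attempt when a signal arrives, rather than treating it as one more failed ping.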

#4

Updated by Simon Klinkert almost 8 years ago

This bug is a little difficult to provoke, but here is what I did to reproduce it:

I used two OpenIndiana 151a machines.

host1: I ran dtrace to trace the relevant functions, using one or both of these commands:

dtrace -n 'nfs4_trigger_domount_args_create:* { trace(arg1); }' -n 'nfs4_trigger_ping_server:* { trace(arg1); }'
dtrace -n 'nfs4*:return {trace(arg1);}'

host1: I mounted a filesystem from host2 via NFS.
host2: I typed "uadmin 1 1" to reboot.
host1: I used cd to move as fast as possible(!) into a deeper filesystem on host2.
host1: After some time I saw a few calls to nfs4_trigger_ping_server() in my dtrace output, and I sent my cd process a SIGINT with Ctrl+C. (If you don't see the calls, it's time to start over.) cd hung, and dtrace showed the pending interrupt signal coming back from nfs4_trigger_ping_server().

With my latest patch (http://cr.illumos.org/view/e4qg2vxm/nfs-webrev/) the dtrace output looks like this:

  3  65386  nfs4_ephemeral_tree_hold:return                 0
  2  65416   nfs4_ping_server_common:return                 5
  2  65402  nfs4_trigger_ping_server:return                 5
  2  65416   nfs4_ping_server_common:return                 5
  2  65402  nfs4_trigger_ping_server:return                 5
  2  65416   nfs4_ping_server_common:return                18
  2  65402  nfs4_trigger_ping_server:return                18
  2  65408  nfs4_trigger_esi_destroy:return                 2
  2  65396 nfs4_trigger_domount_args_create:return                 4
  2  65388  nfs4_ephemeral_tree_decr:return                 8
  2  65392        nfs4_trigger_mount:return                 4
  2  65364       nfs4_trigger_access:return                 4
  2  65921 nfs4_waitfor_purge_complete:return                 0
  2  65921 nfs4_waitfor_purge_complete:return                 0
  2  66129      nfs4_validate_caches:return                 0
  2  65921 nfs4_waitfor_purge_complete:return                 0

...and cd aborts.

#5

Updated by Rich Lowe almost 8 years ago

  • Subject changed from nfs4 Mirror Mount hang to nfs4 mirror mount hang
#6

Updated by Rich Lowe almost 8 years ago

  • % Done changed from 0 to 100
  • Tags deleted (needs-triage)

Resolved in r13609 commit:7442c4b86390

#7

Updated by Marcel Telka almost 7 years ago

  • Status changed from New to Resolved
