Bug #3976

closed

sdev_readdir() recursively acquires sdev_contents as reader

Added by Robert Mustacchi almost 9 years ago. Updated almost 9 years ago.

Status: Resolved
Priority: Normal
Category: kernel
Start date: 2013-08-04
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

Looking at the dump, we have a bunch of threads corked on an
sdev_contents rwlock:

> ::stacks -m dev
THREAD           STATE    SOBJ                COUNT
ffffffc09d4034a0 SLEEP    RWLOCK                100
                 swtch+0x145
                 turnstile_block+0x760
                 rw_enter_sleep+0x205
                 sdev_access+0x55
                 fop_access+0x8c
                 vn_openat+0x417
                 copen+0x49e
                 openat+0x2d

ffffff7e076f83e0 SLEEP    RWLOCK                  1
                 swtch+0x145
                 turnstile_block+0x760
                 rw_enter_sleep+0x1a3
                 devname_inactive_func+0x3e
                 sdev_inactive+0x20
                 fop_inactive+0xaf
                 vn_rele+0x5f
                 lookuppnvp+0x8f6
                 lookuppnatcred+0x11b
                 lookupnameatcred+0x97
                 lookupnameat+0x69
                 cstatat_getvp+0x12b
                 cstatat+0x5c
                 fstatat+0x4c
                 stat+0x25

ffffff0d422a2840 SLEEP    RWLOCK                  1
                 swtch+0x145
                 turnstile_block+0x760
                 rw_enter_sleep+0x205
                 sdev_access+0x55
                 fop_access+0x8c
                 sdev_lookup+0x4c
                 fop_lookup+0xed
                 lookuppnvp+0x28f
                 lookuppnatcred+0x11b
                 lookupnameatcred+0x97
                 lookupnameat+0x69
                 readlinkat+0x98
                 readlink32+0x31
                 sys_syscall32+0xff

ffffffbb11aeb0c0 SLEEP    RWLOCK                  1
                 swtch+0x145
                 turnstile_block+0x760
                 rw_enter_sleep+0x205
                 sdev_access+0x55
                 fop_access+0x8c      
                 sdev_readdir+0x4f
                 fop_readdir+0xab
                 getdents64+0xd1

Most of these are "ps" invocations – but devfsadm has also gotten into
the knot:

> ::stacks -m dev | ::print kthread_t t_procp->p_user.u_comm ! sort | uniq -c
   1 t_procp->p_user.u_comm = [ "devfsadm" ]
 102 t_procp->p_user.u_comm = [ "ps" ]

This devfsadm is in turn corking zoneadmd, which is blocking provisioning.
So why are we blocked on the sdev_contents lock? Suspiciously,
WRITE_WANTED is set on the rwlock, but it's held as reader:

> 0xffffffa604a54c28::rwlock
            ADDR      OWNER/COUNT FLAGS          WAITERS
ffffffa604a54c28        READERS=1  B011 ffffff406a03fb00 (R)
                                     || ffffff806d141bc0 (R)
                 WRITE_WANTED -------+| ffffffd01d7f24e0 (R)
                  HAS_WAITERS --------+ ffffff10fccc8bc0 (R)
                                        ffffff645aa10060 (R)
                                        ffffff5e204a38c0 (R)
                                        ffffffdf697754c0 (R)
                                        ffffff7fa4d10080 (R)
                                        ffffff37c8a18b20 (R)
                                        ffffffda319068a0 (R)
                                        ffffffa53dd310a0 (R)
                                        ffffff75f50d6800 (R)
                                        ffffff239b5f1160 (R)
                                        ffffffbf68eb4b40 (R)
                                        ffffff259c2534a0 (R)
                                        ffffff1e8df55760 (R)
                                        ffffff1e940b2b00 (R)
                                        ffffffdecb384520 (R)
                                        ffffff5c329c83e0 (R)
                                        ffffffbe7020c400 (R)
                                        ffffffda31908140 (R)
                                        ffffffbb11a928c0 (R)
                                        ffffff7fa4d0bc00 (R)
                                        fffffffb55a96c20 (R)
                                        ffffff406a030740 (R)
                                        ffffff239b5e8460 (R)
                                        ffffffd050c62060 (R)
                                        ffffff3a9833d3a0 (R)
                                        ffffff7a64c20000 (R)
                                        ffffffdecb37dba0 (R)
                                        ffffff0fe0ec3c40 (R)
                                        fffffffb56419ae0 (R)
                                        ffffff3ebabc0ae0 (R)
                                        ffffff90641f1440 (R)
                                        fffffffb5640e020 (R)
                                        ffffff3ebac56140 (R)
                                        ffffffff4ff11c00 (R)
                                        ffffff7fa4d088c0 (R)
                                        fffffffb5638a000 (R)
                                        ffffffdc79409140 (R)
                                        ffffff7a64c2fb60 (R)
                                        fffffffb5638e120 (R)
                                        ffffffbbf200a3c0 (R)
                                        ffffff8fc5c56780 (R)
                                        ffffffdecb9620c0 (R)
                                        ffffff0fe0e69120 (R)
                                        ffffffa53dd59180 (R)
                                        fffffff43b350120 (R)
                                        ffffffdc7935cb00 (R)
                                        ffffffc09d3f28c0 (R)
                                        ffffff595a9debc0 (R)
                                        ffffff7a644c4520 (R)
                                        fffffffb5640e760 (R)
                                        ffffff37c8a324a0 (R)
                                        ffffff7e06219060 (R)
                                        ffffff7e062150e0 (R)
                                        ffffffe15ddfeb40 (R)
                                        ffffffbbf2008780 (R)
                                        ffffff7e06217440 (R)
                                        ffffffeefea5f440 (R)
                                        ffffff7fa381c860 (R)
                                        ffffff7fa4d073a0 (R)
                                        fffffffb5640eb00 (R)
                                        ffffffbd92cebc00 (R)
                                        ffffffdecb974400 (R)
                                        ffffffa53dd27000 (R)
                                        fffffff374d3e120 (R)
                                        ffffff37c8a2cc60 (R)
                                        ffffffda318fe3c0 (R)
                                        ffffff0fe0ebd400 (R)
                                        ffffff9c1bbe47c0 (R)
                                        ffffff7a644c0b00 (R)
                                        ffffff139a0b5420 (R)
                                        ffffff37c91a6c00 (R)
                                        ffffffb213bb34c0 (R)
                                        ffffff7e06188520 (R)
                                        ffffff37c8a45b80 (R)
                                        ffffff7fa34837a0 (R)
                                        ffffff9c1bbef860 (R)
                                        ffffff806d77cba0 (R)
                                        ffffffc09d00b0e0 (R)
                                        ffffff7e07708800 (R)
                                        fffffff43b336480 (R)
                                        ffffff46f7565ae0 (R)
                                        ffffff806d133120 (R)
                                        ffffffdecc6ac160 (R)
                                        fffffffb5637c000 (R)
                                        ffffff406a03c420 (R)
                                        ffffffda318fa400 (R)
                                        ffffff8fc56308c0 (R)
                                        ffffff7a633a5c60 (R)
                                        ffffffbeaeeb1420 (R)
                                        ffffffc9885624e0 (R)
                                        ffffff7e077054a0 (R)
                                        fffffffb56385440 (R)
                                        ffffffeefea617a0 (R)
                                        ffffff37c8a350c0 (R)
                                        fffffff43b35d8a0 (R)
                                        fffffff374d44100 (R)
                                        ffffffc09d4034a0 (R)
                                        ffffff7e076f83e0 (W)
                                        ffffffbb11aeb0c0 (R)
                                        ffffff0d422a2840 (R)
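To make the "B011" flags column concrete, here is a minimal C sketch that decodes a lock word of that shape. The bit positions are an assumption modeled on my recollection of illumos's rwlock_impl.h (low three bits are flags, reader count above them); the rww_ names are hypothetical and are not the kernel's identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: three low flag bits, hold count in the bits above them. */
#define RWW_HAS_WAITERS   0x1u
#define RWW_WRITE_WANTED  0x2u
#define RWW_WRITE_LOCKED  0x4u
#define RWW_COUNT_SHIFT   3

struct rww_state {
	bool has_waiters;
	bool write_wanted;
	bool write_locked;
	uint64_t readers;	/* meaningful only when write_locked is false */
};

/* Build the word for a reader-held lock with the given flag bits. */
uint64_t
rww_reader_word(uint64_t readers, unsigned flags)
{
	/* A reader-held word may carry only the two waiter-related flags. */
	assert((flags & ~(RWW_HAS_WAITERS | RWW_WRITE_WANTED)) == 0);
	return ((readers << RWW_COUNT_SHIFT) | flags);
}

/* Split a lock word into its flags and reader count. */
struct rww_state
rww_decode(uint64_t word)
{
	struct rww_state s;

	s.has_waiters = (word & RWW_HAS_WAITERS) != 0;
	s.write_wanted = (word & RWW_WRITE_WANTED) != 0;
	s.write_locked = (word & RWW_WRITE_LOCKED) != 0;
	s.readers = s.write_locked ? 0 : (word >> RWW_COUNT_SHIFT);
	return (s);
}
```

Under that assumed layout, READERS=1 with flags 011 corresponds to the word 0xb: one reader hold, a writer queued, and sleepers present.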

A reader-held lock with WRITE_WANTED set is suspicious because acquiring
an rwlock recursively as a reader will deadlock if WRITE_WANTED becomes
set after the initial acquisition but before the recursive one: the
second read attempt queues behind the waiting writer, which in turn
waits for the first read hold to be dropped. Of the stack traces above,
the getdents64 one stands out; does getdents64, as called from ps,
acquire the rwlock recursively? This can be explored with this D script:

#pragma D option quiet

/* Start tracing when ps enters getdents64. */
getdents64:entry
/execname == "ps"/
{
        self->follow = 1;
        printf("%d: entry\n", timestamp);
}

/* Suspend tracing while inside devzvol_readdir to cut noise. */
devzvol_readdir:entry
/self->follow/
{
        self->restore = 1;
        self->follow = 0;
}

devzvol_readdir:return
/self->restore/
{
        self->follow = 1;
}

/* Log every rwlock event (rw-acquire, rw-release, ...) with its stack. */
rw-*
/self->follow/
{
        printf("%d: %s %p %d", timestamp, probename, arg0, arg1);
        stack();
        printf("\n");
}

getdents64:return
/self->follow/
{
        printf("%d: return\n", timestamp);
        self->follow = 0;
}

Here is a snippet of output from running the above D script and executing
a "ps" command:

346546016259112: entry
346546016259781: rw-acquire ffffff0e45220b58 1
              dev`sdev_rwlock+0x2c
              genunix`fop_rwlock+0x32
              genunix`getdents64+0xb4
              unix`sys_syscall+0x17a

346546016261451: rw-acquire ffffff0e45220b58 1
              dev`sdev_access+0x55
              genunix`fop_access+0x8c
              dev`sdev_readdir+0x4f
              genunix`fop_readdir+0xab
              genunix`getdents64+0xd1
              unix`sys_syscall+0x17a

346546016262967: rw-release ffffff0e45220b58 1
              dev`sdev_access+0x77
              genunix`fop_access+0x8c
              dev`sdev_readdir+0x4f
              genunix`fop_readdir+0xab
              genunix`getdents64+0xd1
              unix`sys_syscall+0x17a

346546016274039: rw-release ffffff0e45220b58 1
              dev`devname_readdir_func+0x1d3
              dev`sdev_readdir+0x7f
              genunix`fop_readdir+0xab
              genunix`getdents64+0xd1
              unix`sys_syscall+0x17a

346546016275246: rw-acquire ffffff0e45220b58 1
              dev`devname_readdir_func+0x20a
              dev`sdev_readdir+0x7f
              genunix`fop_readdir+0xab
              genunix`getdents64+0xd1
              unix`sys_syscall+0x17a

346546016288366: rw-release ffffff0e45220b58 1
              dev`sdev_rwunlock+0x25
              genunix`fop_rwunlock+0x2d
              genunix`getdents64+0xe1
              unix`sys_syscall+0x17a

346546016290422: return

This is the smoking gun: it shows the lock being acquired first as reader
to implement VOP_RWLOCK() (sdev_rwlock() returns with the sdev_contents
lock held as reader), and then again, while that first hold is still
outstanding, as part of the subsequent VOP_ACCESS() in sdev_readdir().
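The interleaving that turns this recursive hold into a deadlock can be replayed with a toy, single-threaded model in C. This is only a sketch under stated assumptions: the toy_ names are hypothetical, the lock is reduced to a reader count plus two flags, and the one behavior carried over from the report is that a reader must wait once a writer is queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy writer-preferring rwlock; not the kernel implementation. */
struct toy_rwlock {
	int readers;		/* outstanding reader holds */
	bool write_locked;	/* a writer owns the lock */
	bool write_wanted;	/* a writer is queued behind the readers */
};

/* Reader attempt: fails (the caller would sleep) if a writer holds or waits. */
bool
toy_try_read(struct toy_rwlock *l)
{
	if (l->write_locked || l->write_wanted)
		return (false);
	l->readers++;
	return (true);
}

/* Writer attempt: on failure, marks write_wanted so new readers queue. */
bool
toy_try_write(struct toy_rwlock *l)
{
	if (l->readers > 0 || l->write_locked) {
		l->write_wanted = true;
		return (false);
	}
	l->write_locked = true;
	l->write_wanted = false;
	return (true);
}

void
toy_read_exit(struct toy_rwlock *l)
{
	assert(l->readers > 0);
	l->readers--;
}

/*
 * Replay the getdents64 path: outer hold (as in sdev_rwlock()), an optional
 * writer arrival (as in devname_inactive_func()), then the inner hold (as in
 * sdev_access()). Returns true if the inner hold would block while the outer
 * hold is still outstanding, which is the deadlock.
 */
bool
toy_recursive_read_deadlocks(bool writer_arrives_between)
{
	struct toy_rwlock l = { 0, false, false };
	bool outer = toy_try_read(&l);

	assert(outer);
	if (writer_arrives_between)
		(void) toy_try_write(&l);

	if (!toy_try_read(&l))
		return (true);	/* reader waits on writer, writer on reader */

	toy_read_exit(&l);	/* inner release */
	toy_read_exit(&l);	/* outer release */
	return (false);
}
```

In this model, toy_recursive_read_deadlocks(false) completes both holds, while toy_recursive_read_deadlocks(true) reports the inner acquisition as blocked. That matches the window described in the report: the hang needs a writer to arrive between the two reader acquisitions.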

Given that the WRITE_WANTED on the lock was induced by a call to
devname_inactive_func(), and given that this will be called many times
for a single ps command, it should be remarkably easy to reproduce this –
and indeed, a simple test that launched 500 ps commands in parallel reproduced
it instantly.
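The report does not include the test itself. A minimal sketch of such a parallel launcher in C might look like the following; run_parallel() is a hypothetical helper, and the ps arguments used in the original test are not stated, so any command line passed to it is an assumption.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Start n copies of the command in argv at once, then wait for all of them.
 * Returns the number of children that exited with status 0, or -1 if n is
 * not positive. Child stdout is discarded.
 */
int
run_parallel(int n, char *const argv[])
{
	int started = 0, ok = 0;

	assert(argv != NULL && argv[0] != NULL);
	if (n <= 0)
		return (-1);

	for (int i = 0; i < n; i++) {
		pid_t pid = fork();

		if (pid == 0) {
			int fd = open("/dev/null", O_WRONLY);

			if (fd >= 0 && fd != STDOUT_FILENO) {
				(void) dup2(fd, STDOUT_FILENO);
				(void) close(fd);
			}
			(void) execvp(argv[0], argv);
			_exit(127);	/* exec failed */
		}
		if (pid > 0)
			started++;
	}

	/* On an affected system this loop may never finish: that is the bug. */
	for (int i = 0; i < started; i++) {
		int status;

		if (wait(&status) > 0 && WIFEXITED(status) &&
		    WEXITSTATUS(status) == 0)
			ok++;
	}
	return (ok);
}
```

Calling run_parallel(500, argv) with an argv of { "ps", NULL } would approximate the test described above; on a kernel with this bug, the expected symptom is that the call never returns.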

Actions #1

Updated by Robert Mustacchi almost 9 years ago

  • Status changed from New to Resolved

Resolved in 8d3cb697ab50bcbe6fdf24afdb4243a7367d164e.
