Bug #3939

closed

parent rn_lock being held across calls to rc_hold

Added by Robert Mustacchi almost 8 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: Normal
Category: cmd - userland programs
Start date: 2013-07-28
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

So we're looking at some kind of crazy backup in SMF land:

> ::stacks
THREAD   STATE    SOBJ        COUNT
e        PARKED   MX            648
         libc.so.1`mutex_lock_impl+0x291
         libc.so.1`mutex_lock+0x19
         rc_node_hold+0x14
         rc_node_assign+0x1e
         rc_local_scope+0x33
         entity_get+0x61
         simple_handler+0x55
         client_switcher+0x1df
         libc.so.1`__door_return+0x4b

3        PARKED   MX              6
         libc.so.1`mutex_lock_impl+0x291
         libc.so.1`mutex_lock+0x19
         rc_node_assign+0x33
         rc_node_clear+0x12
         entity_destroy+0x23
         entity_cleanup+0x35
         client_destroy+0x141
         client_switcher+0xbc
         libc.so.1`__door_return+0x4b

8        PARKED   MX              2
         libc.so.1`mutex_lock_impl+0x291
         libc.so.1`mutex_lock+0x19
         rc_node_hold+0x14
         rc_node_find_named_child+0xc7
         rc_node_get_child+0x5d
         entity_get_child+0x4b
         simple_handler+0x55
         client_switcher+0x1df
         libc.so.1`__door_return+0x4b

297      UNPARKED <NONE>          1

2        UNPARKED <NONE>          1
         libc.so.1`door_unref_func+0x4f
         libc.so.1`_thrp_setup+0x88
         libc.so.1`_lwp_start

1        UNPARKED <NONE>          1
         libc.so.1`sigwait+0x2d
         main+0x685
         _start+0x83

7        PARKED   CV              1
         libc.so.1`cond_wait_queue+0x6a
         libc.so.1`__cond_wait+0x8f
         libc.so.1`cond_wait+0x2e
         libc.so.1`pthread_cond_wait+0x24
         rc_node_hold_flag+0x79
         rc_node_no_client_refs+0x1ee
         rc_node_rele_locked+0xee
         rc_node_rele+0x1b
         rc_iter_next+0x450
         iter_read+0xb6
         simple_handler+0x55
         client_switcher+0x1df
         libc.so.1`__door_return+0x4b

5        PARKED   CV              1
         libc.so.1`cond_wait_queue+0x6a
         libc.so.1`__cond_wait+0x8f   
         libc.so.1`cond_wait+0x2e
         libc.so.1`pthread_cond_wait+0x24
         rc_node_wait_flag+0x5b
         rc_node_update+0xe9
         entity_update+0x52
         simple_handler+0x55
         client_switcher+0x1df
         libc.so.1`__door_return+0x4b

b0       PARKED   CV              1
         libc.so.1`cond_wait_queue+0x6a
         libc.so.1`cond_wait_common+0x27b
         libc.so.1`__cond_timedwait+0x111
         libc.so.1`cond_timedwait+0x35
         libumem.so.1`umem_update_thread+0x1f9
         libc.so.1`_thrp_setup+0x88
         libc.so.1`_lwp_start

9        PARKED   MX              1
         libc.so.1`mutex_lock_impl+0x291
         libc.so.1`mutex_lock+0x19
         rc_node_relink_child+0x43
         rc_tx_commit+0x732
         tx_commit+0xc0
         client_switcher+0x1df
         libc.so.1`__door_return+0x4b

Well, that's a lot of folks who are parked and blocked on a mutex.

If we take a random blocked thread from the first batch, say 0xe (our representative thread for ::findstack), let's see who owns the lock it wants.

> 0xe::findstack -v
stack pointer for thread e: fdf42c2c
[ fdf42c2c libc.so.1`__lwp_park+0x19() ]
  fdf42c6c libc.so.1`mutex_lock_impl+0x291(82bcf74, 0, 0, fef54000, fedd6a40, 81f5714)
  fdf42c8c libc.so.1`mutex_lock+0x19(82bcf74, 0, fdf42d08, 827b5c0, 827b5f4, 0)
  fdf42cac rc_node_hold+0x14(82bcf00, 0, fdf42cfc, 827b5c0)
  fdf42cdc rc_node_assign+0x1e(827b5f4, 82bcf00, 82bbee4, 8063fce, fdf42df4, 81968f0)
  fdf42cfc rc_local_scope+0x33(1, 827b5f4, fdf42d4c, 81f5714, 8063fce, fdf42db0)
  fdf42d1c entity_get+0x61(81f5648, fdf42df4, 81968f0, 8064a83, 8196918, 8196918)
  fdf42d3c simple_handler+0x55(81f5648, fdf42df4, c, fdf42db0, fdf42dac, 8063fce)
  fdf42dcc client_switcher+0x1df(81, fdf42df4, c, 0, 0, 8064a96)
  00000000 libc.so.1`__door_return+0x4b()
> 82bcf74::print lwp_mutex_t data | ::print ulwp_t ul_lwpid
ul_lwpid = 0x8
> 0x8::findstack -v
stack pointer for thread 8: fe560b64
[ fe560b64 libc.so.1`__lwp_park+0x19() ]
  fe560ba4 libc.so.1`mutex_lock_impl+0x291(8185e9c, 0, fe560be4, 7306aa9a, 82bcf00, 1)
  fe560bc4 libc.so.1`mutex_lock+0x19(8185e9c, fe560d9c, 3, 8185e28, 8185e28, 82bcf00)
  fe560be4 rc_node_hold+0x14(8185e28, fe560d88, fe560c24, 82bcf00)
  fe560c24 rc_node_find_named_child+0xc7(82bcf00, fe560d88, 2, fe560c58, 80c75cc, 0)
  fe560c74 rc_node_get_child+0x5d(82606ec, fe560d88, 2, 8260e1c, 82163d4, 0)
  fe560ca4 entity_get_child+0x4b(8216308, fe560d7c, 82b9028, 8064a83, 82b9050, 82b9050)
  fe560cc4 simple_handler+0x55(8216308, fe560d7c, 84, fe560d38, fe560d34, 80640ad)
  fe560d54 client_switcher+0x1df(80, fe560d7c, 84, 0, 0, 8064a96)
  00000000 libc.so.1`__door_return+0x4b()
> 8185e9c::print lwp_mutex_t data | ::print ulwp_t ul_lwpid
ul_lwpid = 0xa
> 0xa::findstack -v
stack pointer for thread a: fe33eb64
[ fe33eb64 libc.so.1`__lwp_park+0x19() ]
  fe33eba4 libc.so.1`mutex_lock_impl+0x291(820de7c, 0, fe33ebe4, 806aa9a, 8185e28, 1)
  fe33ebc4 libc.so.1`mutex_lock+0x19(820de7c, fe33ed8f, 1, 820de08, 820de08, 8185e28)
  fe33ebe4 rc_node_hold+0x14(820de08, fe33ed88, fe33ec24, 8185e28)
  fe33ec24 rc_node_find_named_child+0xc7(8185e28, fe33ed88, 3, fe33ec58, 80c858c, 0)
  fe33ec74 rc_node_get_child+0x5d(827bc84, fe33ed88, 3, 827f95c, 8216654, 0)
  fe33eca4 entity_get_child+0x4b(8216588, fe33ed7c, 8196ad0, 8064a83, 8196af8, 8196af8)
  fe33ecc4 simple_handler+0x55(8216588, fe33ed7c, 84, fe33ed38, fe33ed34, 80640ad)
  fe33ed54 client_switcher+0x1df(7f, fe33ed7c, 84, 0, 0, 8064a96)
  00000000 libc.so.1`__door_return+0x4b()
> 820de7c::print lwp_mutex_t data | ::print ulwp_t ul_lwpid
ul_lwpid = 0x7
> 0x7::findstack -v
stack pointer for thread 7: fe65fb28
[ fe65fb28 libc.so.1`__lwp_park+0x19() ]
  fe65fb58 libc.so.1`cond_wait_queue+0x6a(8210e94, 8210e7c, 0, 8210e7c, 8210e08, 828fe18)
  fe65fb98 libc.so.1`__cond_wait+0x8f(8210e94, 8210e7c, fe65fbc8, feedad48, fef54000, 8210e7c)
  fe65fbb8 libc.so.1`cond_wait+0x2e(8210e94, 8210e7c, 8210e7c, 0, 80ca080, 8210e08)
  fe65fbd8 libc.so.1`pthread_cond_wait+0x24(8210e94, 8210e7c, fe65fc08, feedb8df, 8210e7c, 0)
  fe65fc08 rc_node_hold_flag+0x79(8210e08, 20, fe65fc48, feedad48)
  fe65fc48 rc_node_no_client_refs+0x1ee(828fe18, 820de08, fe65fc78, feedb8df)
  fe65fc78 rc_node_rele_locked+0xee(828fe18, 0, fe65fce8, 828fe18)
  fe65fc98 rc_node_rele+0x1b(828fe18, 4000, 8, 8267d78)
  fe65fce8 rc_iter_next+0x450(81687c8, 8267dac, 6, 0, 816a114, 0)
  fe65fd18 iter_read+0xb6(816a048, fe65fdf0, 8196df0, 8064a83, 8196e18, 8196e18)
  fe65fd38 simple_handler+0x55(816a048, fe65fdf0, 10, fe65fdac, fe65fda8, 8063766)
  fe65fdc8 client_switcher+0x1df(7d, fe65fdf0, 10, 0, 0, 8064a96)
  00000000 libc.so.1`__door_return+0x4b()

Well, the blocking chain runs e -> 8 -> a -> 7, and thread 0x7 is definitely a bit suspicious here: it is holding some lock (one that corresponds to some configd node) and is waiting for someone else to show up and clear the flag it wants to take. The second argument to rc_node_hold_flag tells us which flag we are waiting on to go away. Let's see where that is. If we look back at the ::stacks output above, we find thread 0x9, which is suspiciously in the middle of a relink_child. If we look at the arguments in question, the node we are in rc_node_hold_flag on is the same one that rc_node_relink_child has as the node that's dying. So who holds the lock that's blocking that transition:

> 0x08 + 0x74 = X
                7c              
> 820de7c::print lwp_mutex_t data | ::print ulwp_t ul_lwpid
ul_lwpid = 0x7

Uh oh. That lock (820de7c, the rn_lock at offset 0x74 inside node 820de08) is owned by thread 0x7, the very thread sitting in rc_node_hold_flag waiting for the flag to clear. This node is nominally the parent node that contains the current one. The parent node has reasonable flags set, at least the ones that we assert should be there. So the real question is: should this be happening, and how did we get here?
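Before digging in, it is worth spelling out what rc_node_hold_flag is doing while we sit in it. Here is a minimal sketch of that wait pattern (hypothetical type, function, and flag names; not the actual rc_node.c code):

#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant pieces of an rc_node_t. */
typedef struct sketch_node {
        pthread_mutex_t rn_lock;
        pthread_cond_t  rn_cv;
        uint32_t        rn_flags;
} sketch_node_t;

#define SKETCH_NODE_DEAD        0x01

/*
 * With np->rn_lock held on entry and exit, wait until 'flag' is clear (or
 * the node dies), then take the flag for ourselves.  Returns 0 if the node
 * died out from under us.
 */
static int
sketch_hold_flag(sketch_node_t *np, uint32_t flag)
{
        while ((np->rn_flags & flag) != 0 &&
            (np->rn_flags & SKETCH_NODE_DEAD) == 0)
                (void) pthread_cond_wait(&np->rn_cv, &np->rn_lock);

        if (np->rn_flags & SKETCH_NODE_DEAD)
                return (0);

        np->rn_flags |= flag;
        return (1);
}

The important property is that pthread_cond_wait() only drops the rn_lock of the node being waited on (8210e7c above); any other rn_lock the caller already holds, in this case 820de7c, stays held for the entire wait.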
Let's start back at rc_iter_next(). First, we want to see the state of the current rc_node_iter_t.

> 0x7::findstack -v
stack pointer for thread 7: fe65fb28
[ fe65fb28 libc.so.1`__lwp_park+0x19() ]
  fe65fb58 libc.so.1`cond_wait_queue+0x6a(8210e94, 8210e7c, 0, 8210e7c, 8210e08, 828fe18)
  fe65fb98 libc.so.1`__cond_wait+0x8f(8210e94, 8210e7c, fe65fbc8, feedad48, fef54000, 8210e7c)
  fe65fbb8 libc.so.1`cond_wait+0x2e(8210e94, 8210e7c, 8210e7c, 0, 80ca080, 8210e08)
  fe65fbd8 libc.so.1`pthread_cond_wait+0x24(8210e94, 8210e7c, fe65fc08, feedb8df, 8210e7c, 0)
  fe65fc08 rc_node_hold_flag+0x79(8210e08, 20, fe65fc48, feedad48)
  fe65fc48 rc_node_no_client_refs+0x1ee(828fe18, 820de08, fe65fc78, feedb8df)
  fe65fc78 rc_node_rele_locked+0xee(828fe18, 0, fe65fce8, 828fe18)
  fe65fc98 rc_node_rele+0x1b(828fe18, 4000, 8, 8267d78)
  fe65fce8 rc_iter_next+0x450(81687c8, 8267dac, 6, 0, 816a114, 0)
  fe65fd18 iter_read+0xb6(816a048, fe65fdf0, 8196df0, 8064a83, 8196e18, 8196e18)
  fe65fd38 simple_handler+0x55(816a048, fe65fdf0, 10, fe65fdac, fe65fda8, 8063766)
  fe65fdc8 client_switcher+0x1df(7d, fe65fdf0, 10, 0, 0, 8064a96)
  00000000 libc.so.1`__door_return+0x4b()
> 81687c8::print rc_node_iter_t
{
    rni_parent = 0x820de08
    rni_clevel = 0
    rni_iter_node = 0x820de08
    rni_iter = 0x818c268
    rni_type = 0x6
    rni_filter = rc_iter_filter_name
    rni_filter_arg = 0x826f028
    rni_offset = 0
    rni_last_offset = 0
}

If we look at the source itself, we always make sure that we hold a lock on the current level in the chain. Of interest is this little bit of code:

6379         if (iter->rni_clevel >= 0) {
6380                 /* Composed iterator.  Iterate over appropriate level. */
6381                 (void) pthread_mutex_unlock(&np->rn_lock);
6382                 np = np->rn_cchain[iter->rni_clevel];
6383                 /*
6384                  * If iter->rni_parent is an instance or a snapshot, np must
6385                  * be valid since iter holds iter->rni_parent & possible
6386                  * levels (service, instance, snaplevel) cannot be destroyed
6387                  * while rni_parent is held.  If iter->rni_parent is
6388                  * a composed property group then rc_node_setup_cpg() put
6389                  * a hold on np.
6390                  */
6391 
6392                 (void) pthread_mutex_lock(&np->rn_lock);
6393 
6394                 if (!rc_node_wait_flag(np, RC_NODE_CHILDREN_CHANGING)) {
6395                         (void) pthread_mutex_unlock(&np->rn_lock);
6396                         rc_node_clear(out, 1);
6397                         return (REP_PROTOCOL_FAIL_DELETED);
6398                 }
6399         }

Here we end up releasing the lock on the original node that we came in on and switching to another one off of the composed chain. If we look at what this node specifically is, we find something interesting. The starting rc_node_t is the rni_parent field from the iter, as per the first few lines of the function.

> 81687c8::print rc_node_iter_t rni_parent->rn_cchain[0]
rni_parent->rn_cchain[0] = 0x820de08

This is where the mutex_lock on the node that we end up deadlocked on comes from. This leaves some questions for us:

o Why do we need to hold this lock while we do the iteration?
o Should we be trying to grab the lock on the parent during tx_commit?
o When we're doing the rele in rc_iter_next, should we be releasing the lock on the parent when we release our hold on the interim node?

The answer to this is pretty straightforward. In the majority of the other cases here we release the parent's rn_lock before we start doing reles. Unfortunately there are several cases where we don't, particularly the one that we came in on. We know that dropping the lock is a safe thing to do: we do it in various other places, and we already have a hold on the rc_node in question, as per the block comment above where we first grab the lock. Per the locking rules, this means that each time we regain the lock afterwards we have to go through the rc_node_wait_flag() check again and error out appropriately.
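As a rough illustration of that pattern (hypothetical names and a stub rele, with sketch_node_t repeated from the earlier snippet so this stands alone; the real fix lives in configd's rc_node.c): drop the lock before doing the rele, reacquire it, and re-validate the node before carrying on.

#include <pthread.h>
#include <stdint.h>

typedef struct sketch_node {
        pthread_mutex_t rn_lock;
        pthread_cond_t  rn_cv;
        uint32_t        rn_flags;
} sketch_node_t;

#define SKETCH_NODE_DEAD        0x01

/* Stand-in for rc_node_rele(): may sleep waiting on flags of other nodes. */
extern void sketch_node_rele(sketch_node_t *);

static int
sketch_iter_release(sketch_node_t *np, sketch_node_t *prev)
{
        /* np->rn_lock is held on entry. */
        (void) pthread_mutex_unlock(&np->rn_lock);

        sketch_node_rele(prev);         /* safe: we hold no rn_lock here */

        (void) pthread_mutex_lock(&np->rn_lock);
        if (np->rn_flags & SKETCH_NODE_DEAD) {
                /* np went away while the lock was dropped; bail out. */
                (void) pthread_mutex_unlock(&np->rn_lock);
                return (-1);
        }
        return (0);
}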

#1

Updated by Robert Mustacchi almost 8 years ago

  • Status changed from New to Resolved
  • % Done changed from 90 to 100

Resolved in a29160b0f5f650ae34e2273cacdd3eff15c62fba.
