Bug #3041

zfs: deadlock between dp->dp_config_rwlock and ds->ds_opening_lock

Added by Vitaliy Gusev about 8 years ago. Updated almost 7 years ago.

Status: Closed
Priority: High
Category: zfs - Zettabyte File System
Start date: 2012-07-30
Due date:
% Done: 0%
Estimated time:
Difficulty: Hard
Tags: needs-triage
Gerrit CR:

Description

panic[cpu0]/thread=ffffff00d2bc17e0:
Deadlock: cycle in blocking chain

ffffff00024e5a00 genunix:turnstile_block+9f0 ()
ffffff00024e5a70 unix:rw_enter_sleep+224 ()
ffffff00024e5af0 zfs:dsl_prop_register+5b ()
ffffff00024e5b80 zfs:dmu_objset_open_impl+2af ()
ffffff00024e5bd0 zfs:dmu_objset_from_ds+68 ()
ffffff00024e5c30 zfs:zfs_ioc_snapshot_list_next+151 ()
ffffff00024e5cb0 zfs:zfsdev_ioctl+183 ()
ffffff00024e5cf0 genunix:cdev_ioctl+45 ()
ffffff00024e5d30 specfs:spec_ioctl+5a ()
ffffff00024e5db0 genunix:fop_ioctl+7b ()
ffffff00024e5eb0 genunix:ioctl+18e ()
ffffff00024e5f00 unix:brand_sys_sysenter+2b7 ()

"::stack" shows the address of the rwlock: rw_enter_sleep+0x224(ffffff00c0cbba08, 1)

ffffff00c0cbba08::rwlock
            ADDR      OWNER/COUNT FLAGS          WAITERS
ffffff00c0cbba08 ffffff0003222c40  B101 ffffff00d2bc17e0 (R)

ffffff00d2bc17e0 is the panicking thread. The rwlock owner, ffffff0003222c40, is the txg sync thread:

ffffff0003222c40::findstack

stack pointer for thread ffffff0003222c40: ffffff00032227c0
[ ffffff00032227c0 _resume_from_idle+0xf1() ]
ffffff00032227f0 swtch+0x1e6()
ffffff0003222890 turnstile_block+0x854()
ffffff0003222900 mutex_vector_enter+0x2a4()
ffffff0003222950 dmu_objset_from_ds+0x34()
ffffff0003222990 dsl_dataset_modified_since_lastsnap+0x88()
ffffff00032229f0 recv_existing_check+0x36()
ffffff0003222a40 dsl_sync_task_group_sync+0xb9()
ffffff0003222ac0 dsl_pool_sync+0x221()
ffffff0003222b80 spa_sync+0x3a2()
ffffff0003222c20 txg_sync_thread+0x2c4()
ffffff0003222c30 thread_start+8()


The following code takes these locks in the reverse order:

dsl_prop_register(dsl_dataset_t *ds, const char *propname,
    dsl_prop_changed_cb_t *callback, void *cbarg) {
	....
	need_rwlock = !RW_WRITE_HELD(&dp->dp_config_rwlock);
	if (need_rwlock)
		rw_enter(&dp->dp_config_rwlock, RW_READER);
		^^^^^

   commit: 2199:712a788c2dfd
                PSARC 2006/388 snapshot -r
                6373978 want to take lots of snapshots quickly ('zfs snapshot -r')

----

So one thread takes ds_opening_lock and then dp_config_rwlock (as READER),
while another thread takes dp_config_rwlock (as WRITER) and then ds_opening_lock.
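
The cycle can be illustrated with a small user-space model. This is only a sketch: the struct and function names below are hypothetical and are not the illumos turnstile code. It assumes each thread waits on at most one lock and each lock has a single owner, then walks the chain "thread -> lock it waits for -> owner of that lock" and reports whether it leads back to the starting thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; not the illumos turnstile code. */
struct lock;

struct thread {
	const char *name;
	struct lock *waiting_for;	/* lock this thread blocks on, or NULL */
};

struct lock {
	const char *name;
	struct thread *owner;		/* current holder, or NULL */
};

/*
 * Follow thread -> awaited lock -> owner -> ... and return true if the
 * chain comes back to the starting thread.
 */
bool
blocking_chain_has_cycle(const struct thread *start)
{
	const struct thread *t = start;
	int hops = 0;

	while (t->waiting_for != NULL && t->waiting_for->owner != NULL) {
		t = t->waiting_for->owner;
		if (t == start)
			return (true);
		if (++hops > 64)	/* stop on a cycle that skips start */
			return (false);
	}
	return (false);
}

/* The state from this report: each thread holds what the other wants. */
bool
bug_3041_state_deadlocks(void)
{
	struct thread ioctl_thread = { "zfs list ioctl", NULL };
	struct thread sync_thread = { "txg_sync_thread", NULL };
	struct lock ds_opening_lock = { "ds_opening_lock", &ioctl_thread };
	struct lock dp_config_rwlock = { "dp_config_rwlock", &sync_thread };

	ioctl_thread.waiting_for = &dp_config_rwlock;	/* wants READER */
	sync_thread.waiting_for = &ds_opening_lock;	/* wants the mutex */

	return (blocking_chain_has_cycle(&ioctl_thread));
}

/* Same threads, but both take dp_config_rwlock first. */
bool
consistent_order_state_deadlocks(void)
{
	struct thread ioctl_thread = { "zfs list ioctl", NULL };
	struct thread sync_thread = { "txg_sync_thread", NULL };
	struct lock dp_config_rwlock = { "dp_config_rwlock", &sync_thread };

	/* The ioctl thread holds nothing yet and simply waits its turn. */
	ioctl_thread.waiting_for = &dp_config_rwlock;

	return (blocking_chain_has_cycle(&ioctl_thread));
}
```

In this model the reported state contains a cycle, while the state in which both threads acquire dp_config_rwlock before ds_opening_lock does not: the waiter is blocked behind a thread that can still make progress.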

History

#1

Updated by Vitaliy Gusev about 8 years ago

Reproducer: "zfs recv & zfs list"

#2

Updated by Matthew Ahrens about 8 years ago

  • Category set to zfs - Zettabyte File System
  • Status changed from New to In Progress
  • Assignee changed from Vitaliy Gusev to Matthew Ahrens

Thanks for reporting this deadlock. I am working on restructuring the way the dp_config_rwlock is used; I'll make sure this is fixed.

#3

Updated by Richard Laager almost 7 years ago

From IRC today:
(12:33:40) mahrens: oh, that one was fixed a while back, by the synctask restructuring work I did
(12:34:07) mahrens: I will have to close that (illumos bug 3041).

#4

Updated by Matthew Ahrens almost 7 years ago

  • Status changed from In Progress to Closed

Fixed by this commit:

commit 3b2aab18808792cbd248a12f1edf139b89833c13
Author: Matthew Ahrens
Date: Thu Feb 28 12:44:05 2013 -0800

3464 zfs synctask code needs restructuring
Reviewed by: Dan Kimmel
Reviewed by: Adam Leventhal
Reviewed by: George Wilson
Reviewed by: Christopher Siden
Approved by: Garrett D'Amore
