Bug #4322

ZFS deadlock on dp_config_rwlock

Added by Arne Jansen about 4 years ago. Updated over 3 years ago.

Status: New
Start date: 2013-11-13
Priority: Urgent
Due date: -
Assignee: -
% Done: 0%
Category: zfs - Zettabyte File System
Target version: -
Difficulty: Medium
Tags: needs-triage

Description

We encountered a deadlocked ZFS and were able to take a crash dump. Of all the threads, the three below look the most interesting:

stack pointer for thread ffffff001f470c40: ffffff001f470ad0
[ ffffff001f470ad0 _resume_from_idle+0xf1() ]
ffffff001f470b00 swtch+0x141()
ffffff001f470b40 cv_wait+0x70(ffffff04eb2f68d4, ffffff04eb2f6898)
ffffff001f470b90 txg_thread_wait+0xaf(ffffff04eb2f6890, ffffff001f470bc0, ffffff04eb2f68d4, 0)
ffffff001f470c20 txg_quiesce_thread+0x106(ffffff04eb2f6700)
ffffff001f470c30 thread_start+8()

stack pointer for thread ffffff0020d7ec40: ffffff0020d7e990
[ ffffff0020d7e990 _resume_from_idle+0xf1() ]
ffffff0020d7e9c0 swtch+0x141()
ffffff0020d7ea00 cv_wait+0x70(ffffff04eb2f69c0, ffffff04eb2f69b8)
ffffff0020d7ea30 rrw_enter_write+0x3a(ffffff04eb2f69b8)
ffffff0020d7ea60 rrw_enter+0x1d(ffffff04eb2f69b8, 0, fffffffff7a52500)
ffffff0020d7eaa0 spa_sync_upgrades+0x45(ffffff04f4a79b00, ffffff088f6e0510)
ffffff0020d7eb70 spa_sync+0x6a6(ffffff04f4a79b00, 16f5a7)
ffffff0020d7ec20 txg_sync_thread+0x207(ffffff04eb2f6700)
ffffff0020d7ec30 thread_start+8()

stack pointer for thread ffffff04eb7ba3a0: ffffff002565e540
[ ffffff002565e540 _resume_from_idle+0xf1() ]
ffffff002565e570 swtch+0x141()
ffffff002565e5b0 cv_wait+0x70(ffffff04eb2f68d6, ffffff04eb2f6898)
ffffff002565e600 txg_wait_open+0x83(ffffff04eb2f6700, 16f5aa)
ffffff002565e640 dmu_tx_wait+0xdd(ffffff086b210288)
ffffff002565e680 dmu_tx_assign+0x45(ffffff086b210288, 1)
ffffff002565e6d0 zfs_rmnode+0x119(ffffff184fd4c600)
ffffff002565e710 zfs_zinactive+0xe8(ffffff184fd4c600)
ffffff002565e770 zfs_inactive+0x75(ffffff0877c55d00, ffffffc312030228, 0)
ffffff002565e7d0 fop_inactive+0x76(ffffff0877c55d00, ffffffc312030228, 0)
ffffff002565e800 vn_rele_dnlc+0xa2(ffffff0877c55d00)
ffffff002565e8c0 dnlc_purge_vfsp+0x19a(ffffff1785e74d38, 0)
ffffff002565e920 zfs_umount+0x77(ffffff0c3cda7038, 400, ffffff04e61dcdb0)
ffffff002565e950 fsop_unmount+0x1b(ffffff0c3cda7038, 400, ffffff04e61dcdb0)
ffffff002565e9a0 dounmount+0x57(ffffff0c3cda7038, 400, ffffff04e61dcdb0)
ffffff002565e9d0 zfs_unmount_snap+0x63(ffffff002565e9f0)
ffffff002565eb70 dsl_dataset_user_release_impl+0xf8(ffffff08484cf318, 0, ffffff04eb2f6700)
ffffff002565eb90 dsl_dataset_user_release_tmp+0x1d(ffffff04eb2f6700, ffffff08484cf318)
ffffff002565ebd0 dsl_dataset_user_release_onexit+0x8b(ffffff0889f0d680)
ffffff002565ec10 zfs_onexit_destroy+0x43(ffffff34a83b0838)
ffffff002565ec40 zfs_ctldev_destroy+0x18(ffffff34a83b0838, 8aca)
ffffff002565eca0 zfsdev_close+0x89(6000008aca, 2403, 2, ffffffc312030228)
ffffff002565ecd0 dev_close+0x31(6000008aca, 2403, 2, ffffffc312030228)
ffffff002565ed20 device_close+0xd8(ffffff04f1b11200, 2403, ffffffc312030228)
ffffff002565edb0 spec_close+0x17b(ffffff04f1b11200, 2403, 1, 0, ffffffc312030228, 0)
ffffff002565ee30 fop_close+0x61(ffffff04f1b11200, 2403, 1, 0, ffffffc312030228, 0)
ffffff002565ee70 closef+0x5e(ffffff04f7349f68)
ffffff002565eee0 closeandsetf+0x398(d, 0)
ffffff002565ef00 close+0x13(d)
ffffff002565ef10 sys_syscall+0x17a()

dsl_dataset_user_release_impl holds a read lock on dp_config_rwlock. Further down the call chain it blocks in txg_wait_open, waiting for the txg_sync_thread to finish its work. The sync thread in turn blocks in spa_sync_upgrades, waiting to take a write lock on dp_config_rwlock: a deadlock.
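To make the cycle concrete, here is a minimal user-land analogue using POSIX threads. It is purely illustrative, not the kernel code: release_thread() stands in for the zfsdev_close path that holds dp_config_rwlock as a reader and then waits in txg_wait_open, and sync_thread() stands in for txg_sync_thread/spa_sync_upgrades trying to take the same lock as a writer (a plain pthread_rwlock_t stands in for the re-entrant rrwlock). Run it and both threads hang, mirroring the stacks above.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_rwlock_t config_lock = PTHREAD_RWLOCK_INITIALIZER;  /* "dp_config_rwlock" */
static pthread_mutex_t  txg_mtx     = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t   txg_cv      = PTHREAD_COND_INITIALIZER;
static int              txg_synced  = 0;

/* Plays dsl_dataset_user_release_impl -> zfs_unmount_snap -> txg_wait_open. */
static void *
release_thread(void *arg)
{
	pthread_rwlock_rdlock(&config_lock);        /* read lock, held across the wait */
	printf("A: holding config lock as reader, waiting for sync\n");

	pthread_mutex_lock(&txg_mtx);
	while (!txg_synced)                         /* "txg_wait_open": never signalled */
		pthread_cond_wait(&txg_cv, &txg_mtx);
	pthread_mutex_unlock(&txg_mtx);

	pthread_rwlock_unlock(&config_lock);
	return (NULL);
}

/* Plays txg_sync_thread -> spa_sync -> spa_sync_upgrades. */
static void *
sync_thread(void *arg)
{
	sleep(1);                                   /* let A take the read lock first */
	printf("B: trying to take config lock as writer\n");
	pthread_rwlock_wrlock(&config_lock);        /* blocks: A still holds it as reader */

	pthread_mutex_lock(&txg_mtx);               /* never reached */
	txg_synced = 1;
	pthread_cond_broadcast(&txg_cv);
	pthread_mutex_unlock(&txg_mtx);

	pthread_rwlock_unlock(&config_lock);
	return (NULL);
}

int
main(void)
{
	pthread_t a, b;
	pthread_create(&a, NULL, release_thread, NULL);
	pthread_create(&b, NULL, sync_thread, NULL);
	pthread_join(a, NULL);                      /* never returns */
	pthread_join(b, NULL);
	return (0);
}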

We are keeping the crash dump around in case there are further questions.

I set the priority to Urgent, as this is an issue we see regularly in production.

History

#1 Updated by Matthew Ahrens about 4 years ago

  • Difficulty changed from Hard to Medium
  • Assignee deleted (Matthew Ahrens)
  • Category set to zfs - Zettabyte File System

The problem is that dsl_dataset_user_release_impl() calls zfs_unmount_snap() with the dsl_pool_config lock held. As documented in the comment above zfs_unmount_snap(), this is not allowed (and it has an assertion to this effect, so you must have hit this on nondebug bits).

The fix should be to grab and release the dsl_pool_config lock only around the call to dsl_dataset_hold_obj_string().
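A rough sketch of the shape that fix would take, assuming the hold/name/unmount sequence in dsl_dataset_user_release_impl(); this is hypothetical code, not the actual patch, the variable dsobj_str is made up, and error handling is trimmed:

	/* Hypothetical sketch of the suggested fix, not the actual patch. */
	dsl_pool_config_enter(dp, FTAG);
	error = dsl_dataset_hold_obj_string(dp, dsobj_str, FTAG, &ds);
	dsl_pool_config_exit(dp, FTAG);         /* drop the lock before unmounting */

	if (error == 0) {
		dsl_dataset_name(ds, name);
		/*
		 * zfs_unmount_snap() now runs without the config lock, so its
		 * ASSERT(!dsl_pool_config_held(...)) is satisfied and the
		 * txg_wait_open it can trigger no longer deadlocks against
		 * the sync thread.
		 */
		(void) zfs_unmount_snap(name);
		dsl_dataset_rele(ds, FTAG);
	}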

The bug was introduced by the following commit:

commit a7a845e4bf22fd1b2a284729ccd95c7370a0438c
Author: Steven Hartland <>
Date:   Tue Jun 11 22:01:53 2013 -0800

    3740 Poor ZFS send / receive performance due to snapshot hold / release processing
    Reviewed by: Matthew Ahrens <>
    Approved by: Christopher Siden <>

#2 Updated by Steven Hartland about 4 years ago

I have confirmed that if a snapshot is mounted, the assert is indeed hit on a zfs send:
panic: solaris assert: !dsl_pool_config_held(dmu_objset_pool(zfsvfs->z_os)), file: cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c, line: 3430

I've also confirmed that a zfs send, both with and without a mounted snapshot, works with the dsl_pool_config lock held only around dsl_dataset_hold_obj_string().

I'm still manually reviewing the code paths of dsl_dataset_name and dsl_dataset_rele to confirm they don't need the lock, since in a few places the lock is held around this whole group of statements rather than just around dsl_dataset_hold_obj, as one might expect from initial inspection; see the contrast sketched below.
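For reference, the two patterns being compared look roughly like this (illustrative only; not copied from any particular caller):

	/* Pattern seen in some callers: the whole group under the lock. */
	dsl_pool_config_enter(dp, FTAG);
	VERIFY0(dsl_dataset_hold_obj(dp, obj, FTAG, &ds));
	dsl_dataset_name(ds, name);
	dsl_dataset_rele(ds, FTAG);
	dsl_pool_config_exit(dp, FTAG);

	/* Pattern the proposed fix relies on: only the hold under the lock. */
	dsl_pool_config_enter(dp, FTAG);
	VERIFY0(dsl_dataset_hold_obj(dp, obj, FTAG, &ds));
	dsl_pool_config_exit(dp, FTAG);
	dsl_dataset_name(ds, name);
	dsl_dataset_rele(ds, FTAG);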
