Bug #5677
zfs_setattr() can hang txg train
Description
I was recently looking at a system where zfs would hang whenever one particular filesystem was mounted. Inspecting the zfs thread showed a stack like this:
swtch+0x141()
cv_wait+0x70()
txg_wait_synced+0x83()
zil_replay_log_record+0xf3()
zil_parse+0x21e()
zil_replay+0xa7()
zfsvfs_setup+0x10d()
zfs_domount+0x1d7()
zfs_mount+0x24f()
fsop_mount+0x1e()
domount+0x820()
mount+0x167()
syscall_ap+0x94()
dtrace_systrace_syscall32+0xe4()
_sys_sysenter_post_swapgs+0x149()
Using DTrace, I was able to see that zfs_setattr() was called as part of the ZIL replay logic, created a transaction, and assigned it to a txg. It then hit an error and executed the following:
static int
zfs_setattr(vnode_t *vp, vattr_t *vap, int flags, cred_t *cr, caller_context_t *ct)
<snip>
        if (err) {
                dmu_tx_abort(tx);
                if (err == ERESTART)
                        goto top;
Later zfs_setattr() would return an error to zil_replay_log_record() resulting in the call to txg_wait_synced() and the stack that we see above.
The problem is that zfs_setattr() should call dmu_tx_abort() only if the call to dmu_tx_assign() failed. In this case the call succeeded, so the transaction had already been assigned to a txg. Once that happens, the transaction must be committed by calling dmu_tx_commit(). Otherwise txg_quiesce_thread() waits forever for all transactions in the txg to commit, and ZFS hangs.
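To make the abort-versus-commit rule concrete, here is a minimal, self-contained sketch in C. It is a toy model, not ZFS code: every mock_* name and both setattr_* functions are hypothetical stand-ins, and the hold counter only approximates what the quiesce thread waits on, as described in this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a txg and a dmu_tx_t; not the real ZFS types. */
typedef struct mock_txg {
	int holds;		/* transactions assigned but not yet committed */
} mock_txg_t;

typedef struct mock_tx {
	mock_txg_t *txg;	/* NULL until the assign succeeds */
} mock_tx_t;

/* Models dmu_tx_assign(): on success the tx takes a hold on the txg. */
static int
mock_tx_assign(mock_tx_t *tx, mock_txg_t *txg, bool fail)
{
	if (fail)
		return (-1);
	tx->txg = txg;
	txg->holds++;
	return (0);
}

/* Models dmu_tx_commit(): releases the hold taken by the assign. */
static void
mock_tx_commit(mock_tx_t *tx)
{
	assert(tx->txg != NULL);
	tx->txg->holds--;
	tx->txg = NULL;
}

/*
 * Models dmu_tx_abort() as this report describes it (an assumption of the
 * sketch): it does not release the hold of an already assigned tx.
 */
static void
mock_tx_abort(mock_tx_t *tx)
{
	(void) tx;
}

/* Models the condition the quiesce thread waits for. */
static bool
mock_txg_can_quiesce(const mock_txg_t *txg)
{
	return (txg->holds == 0);
}

/* Buggy pattern: aborts on any error, even after a successful assign. */
static int
setattr_buggy(mock_txg_t *txg, bool assign_fails, int later_err)
{
	mock_tx_t tx = { NULL };

	if (mock_tx_assign(&tx, txg, assign_fails) != 0) {
		mock_tx_abort(&tx);	/* correct: the assign failed */
		return (-1);
	}
	if (later_err != 0) {
		mock_tx_abort(&tx);	/* wrong: the hold is never released */
		return (later_err);
	}
	mock_tx_commit(&tx);
	return (0);
}

/* Fixed pattern: abort only when the assign fails, otherwise commit. */
static int
setattr_fixed(mock_txg_t *txg, bool assign_fails, int later_err)
{
	mock_tx_t tx = { NULL };

	if (mock_tx_assign(&tx, txg, assign_fails) != 0) {
		mock_tx_abort(&tx);
		return (-1);
	}
	/* Any work that fails here is still followed by a commit. */
	mock_tx_commit(&tx);
	return (later_err);
}
```

In this model, calling setattr_buggy() with a successful assign and a later error leaves holds at 1, so mock_txg_can_quiesce() stays false, which mirrors the hang seen in the stack above. setattr_fixed() returns the same error but leaves the txg free to quiesce.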