Bug #9081

panic due to dirtying pool after the spa_final_txg during pool export/destroy

Added by Serapheim Dimitropoulos 11 days ago. Updated 11 days ago.

Status:NewStart date:2018-02-08
Priority:NormalDue date:
Assignee:Serapheim Dimitropoulos% Done:


Category:zfs - Zettabyte File System
Target version:-
Difficulty:Medium Tags:needs-triage


Igor Kozhukhov () hit a panic recently while trying to destroy a pool.
The relevant stacks can be found below:

assertion failed: tx->tx_txg <= spa_final_dirty_txg(os->os_spa) (0x13f <= 0x13e), file: ../../common/fs/zfs/dbuf.c, line: 1567

ffffd001ec60d560 vpanic()
ffffd001ec60d5f0 0xfffffffffbaae3f1()
ffffd001ec60d670 dbuf_dirty+0x3eb(ffffd0631e70a4b0, ffffd0631ba9c0c0)
ffffd001ec60d6b0 dmu_buf_will_dirty+0xcc(ffffd0631e70a4b0, ffffd0631ba9c0c0)
ffffd001ec60d740 zap_get_leaf_byblk+0xd5(ffffd065c650a100, 1, ffffd0631ba9c0c0, 0, ffffd001ec60d800)
ffffd001ec60d7c0 zap_deref_leaf+0xc3(ffffd065c650a100, 4359acc000000000, ffffd0631ba9c0c0, 0, ffffd001ec60d800)
ffffd001ec60d890 fzap_update+0x70(ffffd0631d2f3b40, 1, 1, fffffffff7ad6bf8, fffffffff7aed808, ffffd0631ba9c0c0)
ffffd001ec60d930 zap_update+0x158(ffffd065cc8c0b00, 67, ffffd001ec60d968, 1, 1, fffffffff7ad6bf8, ffffd0631ba9c0c0)
ffffd001ec60da00 sync_error_list+0xbc(ffffd065ccec7000, ffffd001ec60da48, ffffd065ccec78f8, ffffd0631ba9c0c0)
ffffd001ec60dab0 spa_errlog_sync+0xcf(ffffd065ccec7000, 13f)
ffffd001ec60db80 spa_sync+0x3cf(ffffd065ccec7000, 13f)
ffffd001ec60dc20 txg_sync_thread+0x1f7(ffffd065cc9c4740)
ffffd001ec60dc30 thread_start+8()

ffffd001ea0f8a50 swtch+0x141()
ffffd001ea0f8a80 cv_wait+0x66(ffffd065cc9c495a, ffffd065cc9c4918)
ffffd001ea0f8ad0 txg_wait_synced+0x73(ffffd065cc9c4740, 140)
ffffd001ea0f8b10 txg_sync_stop+0x33(ffffd065cc9c4740)
ffffd001ea0f8b50 spa_unload+0x2ec(ffffd065ccec7000)
ffffd001ea0f8bd0 spa_export_common+0x11e(ffffd066d71ed000, 2, 0, 0, 0)
ffffd001ea0f8bf0 spa_destroy+0x1a(ffffd066d71ed000)
ffffd001ea0f8c30 zfs_ioc_pool_destroy+0x1e(ffffd066d71ed000)
ffffd001ea0f8cd0 zfsdev_ioctl+0x524(8800000000, 5a01, ffffbf7fffdfa470, 202003, ffffd0631e54c7e8, ffffd001ea0f8e20)
ffffd001ea0f8d10 cdev_ioctl+0x25(8800000000, 5a01, ffffbf7fffdfa470, 202003, ffffd0631e54c7e8, ffffd001ea0f8e20)
ffffd001ea0f8d60 spec_ioctl+0x4d(ffffd063184de340, 5a01, ffffbf7fffdfa470, 202003, ffffd0631e54c7e8, ffffd001ea0f8e20, 0)
ffffd001ea0f8df0 fop_ioctl+0x55(ffffd063184de340, 5a01, ffffbf7fffdfa470, 202003, ffffd0631e54c7e8, ffffd001ea0f8e20, 0)
ffffd001ea0f8f00 ioctl+0xa4(3, 5a01, ffffbf7fffdfa470)
ffffd001ea0f8f10 sys_syscall+0x19f()

spa_final_txg (accessed by its wrapper spa_final_dirty_txg() above) is set during spa_export_common() to be the current
TXG plus 3 (2 for making sure that we have time to empty the defer frees plus 1 for the TXG that is potentially syncing now).
spa_export_comon() is used for both exporting and destroying the pools. After setting spa_final_txg the function calls
spa_unload() which among other things calls txg_sync_stop() (which is also one of the relevant stacks above). In the
stack above txg_sync_stop() waits for two TXGs to be synced (so the metaslab defer trees are empty) and then sets a
flag to stop that the sync thread.

In the above case we set spa_final_txg to be 318 and then later we were waiting in txg_sync_stop for txg 320 to finish.
Besides the fact that the above mechanism is a bit fragile as we can control the amount of TXGs that get pushed
between setting spa_final_txg and and calling txg_sync_stop(), there is also the interesting question of what got dirty
at that point which forced the panic.

From the stack above we see that ZFS was trying to log some errors on disk. Some preliminary look shows that a
lot of vdevs had an ERR_EXCEEDED status which might be the possible source of these errors.
ffffd065ccec7000 DESTROYED test2-1tb-old-target-con4

ffffd065c145e000 DEGRADED  -            root
ffffd065c6ad3800 DEGRADED - raidz
ffffd065cc50e800 DEGRADED ERR_EXCEEDED /dev/dsk/c4t33d1s0
ffffd065c3eb9800 HEALTHY - /dev/dsk/c4t33d2s0
ffffd065c6ab3000 HEALTHY - /dev/dsk/c4t33d3s0
ffffd065c6ab0800 HEALTHY - /dev/dsk/c4t33d4s0
ffffd065c6ab0000 HEALTHY - /dev/dsk/c4t34d1s0
ffffd065c6aad800 HEALTHY - /dev/dsk/c4t34d2s0
ffffd065c6aad000 DEGRADED ERR_EXCEEDED /dev/dsk/c4t34d3s0
ffffd065c6aeb800 DEGRADED ERR_EXCEEDED /dev/dsk/c4t34d4s0
ffffd065c6abb800 DEGRADED - raidz
ffffd065c6ab8800 DEGRADED ERR_EXCEEDED /dev/dsk/c4t33d5s0
ffffd065c6ab8000 HEALTHY - /dev/dsk/c4t33d6s0
ffffd065c6add800 HEALTHY - /dev/dsk/c4t33d7s0
ffffd065c6add000 HEALTHY - /dev/dsk/c4t33d8s0
ffffd065c6adb800 HEALTHY - /dev/dsk/c4t34d5s0
ffffd065c6adb000 HEALTHY - /dev/dsk/c4t34d6s0
ffffd065c6ad8800 DEGRADED ERR_EXCEEDED /dev/dsk/c4t34d7s0
ffffd065c6ad8000 DEGRADED ERR_EXCEEDED /dev/dsk/c4t34d8s0
ffffd0638aeb4000 DEGRADED - raidz
ffffd065c2ab5800 HEALTHY - /dev/dsk/c4t33d9s0
ffffd065c2ab6000 HEALTHY - /dev/dsk/c4t33d10s0
ffffd06791be9800 DEGRADED ERR_EXCEEDED /dev/dsk/c4t33d11s0
ffffd06791be9000 DEGRADED ERR_EXCEEDED /dev/dsk/c4t33d12s0
ffffd06791be7800 DEGRADED ERR_EXCEEDED /dev/dsk/c4t34d9s0
ffffd06791be7000 HEALTHY - /dev/dsk/c4t34d10s0
ffffd08912f8d800 HEALTHY - /dev/dsk/c4t34d11s0
ffffd08912f8d000 HEALTHY - /dev/dsk/c4t34d12s0
ffffd065c219b800 DEGRADED - raidz
ffffd065cc891800 HEALTHY - /dev/dsk/c4t33d13s0
ffffd089131f3800 HEALTHY - /dev/dsk/c4t33d14s0
ffffd089131f3000 HEALTHY - /dev/dsk/c4t33d15s0
ffffd089131f0800 DEGRADED ERR_EXCEEDED /dev/dsk/c4t33d16s0
ffffd089131f0000 DEGRADED ERR_EXCEEDED /dev/dsk/c4t34d13s0
ffffd089131ed800 HEALTHY - /dev/dsk/c4t34d14s0
ffffd089131ed000 HEALTHY - /dev/dsk/c4t34d15s0
ffffd089131eb800 HEALTHY - /dev/dsk/c4t34d16s0@


#1 Updated by Igor Kozhukhov 11 days ago

  • Assignee set to Serapheim Dimitropoulos
  • Category set to zfs - Zettabyte File System

Also available in: Atom