Bug #4890

ztest assertion: scn->scn_phys.scn_min_txg <= vdev_dtl_min(vd) in vdev_dtl_should_excise()

Added by Christopher Siden almost 6 years ago.

Status:
New
Priority:
Normal
Category:
zfs - Zettabyte File System
Start date:
2014-05-25
Due date:
% Done:

0%

Estimated time:
Difficulty:
Medium
Tags:
needs-triage
Gerrit CR:

Description

George Wilson:

Hit this while running ztest:

assertion failed for thread 0xfffffd7fe2bf6240, thread-id 537:
scn->scn_phys.scn_min_txg <= vdev_dtl_min(vd) (0x70f <= 0x3), file
../../../uts/common/fs/zfs/vdev.c, line 1731

libc.so.1`_lwp_kill+0xa()
libc.so.1`_assfail+0x179(fffffd7fd93d1660, fffffd7ffdca08a8, 6c3)
libc.so.1`assfail3+0xe6(fffffd7ffdca0e98, 70f, fffffd7ffdca769c, 3, 
fffffd7ffdca08a8, 6c3)
libzpool.so.1`vdev_dtl_should_excise+0x1a6(8af87c0)
libzpool.so.1`vdev_dtl_reassess+0x15a()
libzpool.so.1`vdev_dtl_reassess+0x7f()
libzpool.so.1`vdev_dtl_reassess+0x7f()
libzpool.so.1`dsl_scan_done+0x1c7(11a9ec0, 1, e144c0)
libzpool.so.1`dsl_scan_sync+0x31a(10a36c0, e144c0)
libzpool.so.1`spa_sync+0x337(d91000, 717)
libzpool.so.1`txg_sync_thread+0x273(10a36c0)
libc.so.1`_thrp_setup+0x8a(fffffd7fe2bf6240)
libc.so.1`_lwp_start()

The current scan is resilvering only a few txgs, [70f, 711], yet this vdev
has a min txg of 3. The problem is that this vdev is currently not readable,
so when the previous scan doing the resilver finished, it had not copied any
data to this device.

Now a second scan comes through and the device is still offline (i.e. not
readable), so once again no data was copied over to it. This time, when we
check whether we should excise the DTLs from this device, we decide that we
should, since the scan is for txgs much higher than the max value in this
device's DTL range, but we end up tripping over this assertion:

        /*
         * When a resilver is initiated the scan will assign the scn_max_txg
         * value to the highest txg value that exists in all DTLs. If this
         * device's max DTL is not part of this scan (i.e. it is not in
         * the range (scn_min_txg, scn_max_txg] then it is not eligible
         * for excision.
         */
        if (vdev_dtl_max(vd) <= scn->scn_phys.scn_max_txg) {
                ASSERT3U(scn->scn_phys.scn_min_txg, <=, vdev_dtl_min(vd));

If the device is not readable, then we never want to excise any of its DTLs,
so we should return B_FALSE without checking anything further.
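
The early return suggested above could look roughly like the sketch below. It is a standalone model, not the real vdev.c code: mock_vdev_t, mock_scan_t and mock_dtl_should_excise() are hypothetical stand-ins, and the readable flag is assumed to carry whatever the real readability check would report. Only the range check and the assertion are taken from the excerpt quoted in this report.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real ZFS types; only the fields used here. */
typedef enum { B_FALSE = 0, B_TRUE = 1 } boolean_t;

typedef struct mock_vdev {
	boolean_t readable;	/* assumed result of a readability check */
	uint64_t dtl_min;	/* stand-in for vdev_dtl_min(vd) */
	uint64_t dtl_max;	/* stand-in for vdev_dtl_max(vd) */
} mock_vdev_t;

typedef struct mock_scan {
	uint64_t scn_min_txg;
	uint64_t scn_max_txg;
} mock_scan_t;

static boolean_t
mock_dtl_should_excise(const mock_vdev_t *vd, const mock_scan_t *scn)
{
	/*
	 * Proposed change: an unreadable device received no resilver
	 * data, so its DTLs must be kept. Bail out before any of the
	 * txg range checks (and the assertion) are reached.
	 */
	if (!vd->readable)
		return (B_FALSE);

	/* Same range check and invariant as in the quoted excerpt. */
	if (vd->dtl_max <= scn->scn_max_txg) {
		assert(scn->scn_min_txg <= vd->dtl_min);
		return (B_TRUE);
	}
	return (B_FALSE);
}
```

With the values from the failure (scan range [0x70f, 0x711], vdev min txg 0x3), an unreadable vdev returns B_FALSE here before the assertion can fire. The max DTL value used in the test below is made up, since the report does not state it.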
