Bug #3960
closed
dsl_scan can skip over dedup-ed blocks if physical birth != logical birth
Start date: 2013-08-02
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:
External Bug:
Description
Analysis by George Wilson:
The dsl_scan code uses blk_birth directly, which is fine for non-dedup-ed blocks, but for dedup-ed blocks it should be using BP_PHYSICAL_BIRTH() instead. Here's a typical error:

    Executing zdb -bccsv -U /rpool/tmp/zpool.cache ztest

    Traversing all blocks to verify checksums and verify nothing leaked ...
    zdb_blkptr_cb: Got error 50 reading <48, 55, 0, 4000000000b00> DVA[0]=<0:569b800:200> [L0 other uint64[]] sha256 uncompressed LE contiguous dedup single size=200L/200P birth=598L/590P fill=1 cksum=471be6558b665e4f:6dd49f1184814d14:91b0315d466beea7:68c153cc5500c836 -- skipping

    Error counts:

        errno  count
           50  1

Notice that the block has a logical birth of 598 but a physical birth of 590. The pool configuration looks like this:

                              capacity   operations   bandwidth  ---- errors ----
    description             used avail  read write  read write  read write cksum
    ztest                  21.8M  215M 27.9K     0 40.6M     0     0     0     2
      /rpool/tmp/ztest.0b  21.8M  215M 27.9K     0 40.6M     0     0     0     2

Looking back through the logs, we find that a scan was invoked at txg 596:

    txg 596 scan setup func=2 mintxg=3 maxtxg=596
    doing scan sync txg 596; ddt bm=0/0/0/0
    dsl_scan_ddt ddb_class 0

Because the scan compares against blk_birth, it would end up skipping blocks whose logical birth is newer than txg 596, and thus skip the block referenced in the checksum error, even though that block was physically born at txg 590.
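The difference between the two birth values can be illustrated with a small, self-contained sketch. This is not the dsl_scan code and not the actual fix: sketch_bp_t, sketch_physical_birth() and sketch_in_scan_range() are hypothetical names, the struct keeps only the two birth txgs, and the range check (inclusive of maxtxg) is an assumption made for illustration. The fallback to the logical birth when no separate physical birth is recorded is meant to mirror the intent of BP_PHYSICAL_BIRTH().

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a block pointer: only the two birth txgs. */
typedef struct sketch_bp {
	uint64_t blk_birth;      /* logical birth txg */
	uint64_t blk_phys_birth; /* physical birth txg; 0 means "same as logical" */
} sketch_bp_t;

/* Hypothetical helper: use the physical birth if set, else the logical one. */
static uint64_t
sketch_physical_birth(const sketch_bp_t *bp)
{
	return (bp->blk_phys_birth != 0 ? bp->blk_phys_birth : bp->blk_birth);
}

/* Hypothetical range check for a scan bounded by mintxg and maxtxg. */
static bool
sketch_in_scan_range(uint64_t birth, uint64_t mintxg, uint64_t maxtxg)
{
	return (birth > mintxg && birth <= maxtxg);
}
```

With the values from the zdb output above (birth=598L/590P) and the scan bounds mintxg=3, maxtxg=596, checking blk_birth puts the block outside the scan range, while checking the physical birth puts it inside.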