scrub/resilver unnecessarily traverses snapshots created after the scrub started
If snapshots are created periodically, a scrub or resilver may take a very long time to complete, or may never complete. The symptom is that the scan appears "stuck" at 99+% done (according to "zpool status").
::zfs_dbgmsg reveals that we are traversing a snapshot that was created long after the scrub started. In particular, scn_cur_min_txg is >= scn_cur_max_txg, so the range of txgs left to examine in this snapshot is empty. This snapshot can't reference any blocks that need to be scrubbed, so we are just wasting time reading its metadata looking for blocks to scrub.
note: current date is Nov ~6 2015
scan: resilver in progress since Mon Sep 21 13:24:03 2015
99.8T scanned out of 100T at 10.8M/s, 15h3m to go
149G resilvered, 99.44% done
scanned dataset 751903 (pool/fs@2015-10-29) with min=12340101 max=11964992; pausing=1
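The condition visible in the debug message above (min >= max) can be illustrated with a small sketch. This is not the code from the fix; the struct and function names here are hypothetical, and only the two field names scn_cur_min_txg and scn_cur_max_txg are taken from the report. It assumes a dataset only needs traversal when its txg range is non-empty.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the per-dataset scan state. Only the two
 * fields named in the report are modeled; the real structure has more.
 */
typedef struct scan_range {
	uint64_t scn_cur_min_txg;	/* lowest birth txg still of interest */
	uint64_t scn_cur_max_txg;	/* highest birth txg still of interest */
} scan_range_t;

/*
 * Returns true when the txg range is empty (min >= max), meaning the
 * dataset cannot reference any block the scan still needs to visit,
 * so reading its metadata would be wasted work.
 */
static bool
scan_range_is_empty(const scan_range_t *sr)
{
	return (sr->scn_cur_min_txg >= sr->scn_cur_max_txg);
}
```

With the values from the log line above (min=12340101, max=11964992), scan_range_is_empty() returns true, which is the case where the traversal could be skipped.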
Updated by Electric Monk over 3 years ago
- % Done changed from 0 to 100
- Status changed from New to Closed
commit 38d61036746e2273cc18f6698392e1e29f87d1bf
Author: Matthew Ahrens <email@example.com>
Date: 2016-01-24T03:13:55.000Z

    6450 scrub/resilver unnecessarily traverses snapshots created after the scrub started
    Reviewed by: George Wilson <firstname.lastname@example.org>
    Reviewed by: Prakash Surya <email@example.com>
    Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
    Approved by: Richard Lowe <firstname.lastname@example.org>