10405 Implement ZFS sorted scans

Review Request #1226 - Created Oct. 8, 2018 and updated

Toomas Soome
This originated from ZFS On Linux, as https://github.com/zfsonlinux/zfs/commit/d4a72f23863382bdf6d0ae33196f5b5decbc48fd


During scans (scrubs or resilvers), it sorts the blocks in each transaction group by block offset; the result can be a significant improvement. (On my test system just now, where I have put some effort into fragmenting the pool since setting it up yesterday, a scrub went from 1h2m to 33.5m with the changes.) I've seen similar ratios on production systems.
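To illustrate why issuing I/O in offset order helps, here is a minimal sketch in C. It is not the actual scan code from this change: `scan_block_t`, `scan_sort_blocks` and `scan_seek_distance` are hypothetical names invented for the example, and the "seek distance" is only a crude stand-in for head travel on a rotating disk.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record for one block queued during a scan. */
typedef struct scan_block {
	uint64_t sb_offset;	/* on-disk offset of the block */
	uint64_t sb_size;	/* size of the block in bytes */
} scan_block_t;

/* qsort comparator: ascending by offset, without subtracting unsigned values. */
int
scan_block_compare(const void *a, const void *b)
{
	const scan_block_t *l = a;
	const scan_block_t *r = b;

	if (l->sb_offset < r->sb_offset)
		return (-1);
	if (l->sb_offset > r->sb_offset)
		return (1);
	return (0);
}

/* Sort the queued blocks so that reads are issued in ascending offset order. */
void
scan_sort_blocks(scan_block_t *blocks, size_t count)
{
	qsort(blocks, count, sizeof (scan_block_t), scan_block_compare);
}

/* Total distance travelled if blocks are read in array order from offset 0. */
uint64_t
scan_seek_distance(const scan_block_t *blocks, size_t count)
{
	uint64_t pos = 0, total = 0;

	assert(blocks != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		uint64_t off = blocks[i].sb_offset;

		total += (off > pos) ? off - pos : pos - off;
		pos = off + blocks[i].sb_size;	/* position after the read */
	}
	return (total);
}
```

Calling `scan_seek_distance()` on a queue before and after `scan_sort_blocks()` shows the reduction in travel for a fragmented layout; the real gain on a pool depends on the devices and on how fragmented the data is.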

FreeNAS has had these changes since Oct 2017.

Scrub & rebuild pools. Note times for performance analysis.

The pools are compatible with systems without the changes, so bouncing back and forth between two versions is possible, and I've used that for correctness-checking.

zpool scrub
zpool offline rpool disk and then later zpool online rpool disk
zpool replace

On my test systems the speedup on physical disks is very noticeable: on a 4x 4TB raidz1 (7200RPM WD Black SATA) with 3.5TB of allocated data, the scrub/resilver is down from 4 hours to 2 hours. The gain on virtual disks under VMware Fusion (on top of SSD) is less drastic but still significant, and the overall impression is consistent with the notes from ZoL and FreeBSD.

Matthew Ahrens
Toomas Soome
Jerry Jelinek
Toomas Soome
Toomas Soome
Review request changed

Change Summary:

Rebase on large dnode update, to make sure we are on the same page.




Revision 3 (+3215 -860)


Jerry Jelinek

I went back through my old review comments and cross-checked them against the latest ZoL code. These are the comments that I still think we need to address:

5069 assign return value to 'rc'

2445 Why was this block added? It looks redundant with the existing block at 2406 and doesn't exist in ZoL.

99 s/limitted/limited/ this is spelled correctly in ZoL.
384 I'm not sure why the fill_weight code is duplicated here from scan_init, which matches ZoL. This function doesn't exist in ZoL. We would also need to remove the call from spa_init.
1757 We'll leak bp_toread if dsl_scan_recurse returns an error. In ZoL, the code does goto out.

136 Trailing whitespace, doesn't exist in ZoL.
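
For reference, the leak described in the 1757 comment follows a common pattern: a buffer is allocated, a callee fails, and an early return skips the free. The sketch below shows the single-exit `goto out` shape in isolation. It is not the code under review: `demo_bp_t`, `demo_recurse` and `demo_visit` are hypothetical stand-ins, and the allocation counter exists only to make a leak observable.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a block pointer; not the real blkptr_t. */
typedef struct demo_bp {
	int db_level;
} demo_bp_t;

/* Counts live allocations so that a leak would be visible to a caller. */
int demo_live_allocs = 0;

/* Hypothetical stand-in for the recursive step: fails for negative levels. */
int
demo_recurse(const demo_bp_t *bp)
{
	return (bp->db_level < 0 ? EIO : 0);
}

/* Copies the block pointer, recurses, and frees the copy on every path. */
int
demo_visit(const demo_bp_t *bp)
{
	int err = 0;
	demo_bp_t *bp_toread;

	assert(bp != NULL);
	bp_toread = malloc(sizeof (*bp_toread));
	if (bp_toread == NULL)
		return (ENOMEM);
	demo_live_allocs++;
	*bp_toread = *bp;

	err = demo_recurse(bp_toread);
	if (err != 0)
		goto out;	/* a plain "return (err)" here would leak */

	/* ... further processing of the block would go here ... */

out:
	free(bp_toread);
	demo_live_allocs--;
	return (err);
}
```

With this shape, both the success path and the error path pass through `out:`, so `demo_live_allocs` returns to zero after every call regardless of what `demo_recurse()` returns.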