Allow tuning resilver priority
I recently had an issue with this system. It has 11 VDEVs of 7 drives each (Hitachi HDS723020BLA642) in RAIDz2, with no L2ARC or SLOG. The system lost a drive, and just before it was replaced, a rather large rsync job was started (moving some 27TiB). The drive was replaced Friday afternoon, and on Sunday it was still resilvering. The pool has a total of ~100TiB, of which 40TiB is in use. After stopping the rsync, the resilver finished in a very short time.
So, with the current code, a resilver may take days even on a rather fast system if the system is somewhat loaded. It would be a very nice feature to allow the sysadmin to prioritise resilver over other data traffic, since on some systems a short resilver is worth more than a more responsive system.
Updated by Bryan Leaman over 9 years ago
You can try adjusting zfs_resilver_delay. The default is 2. Decreasing it to 1 or 0 should speed up the resilver at the expense of other I/O. YMMV, but a value of 1 worked well when I was resilvering a disk in a test system recently.
From a discussion thread earlier this year:
6. If the resilver is really slow, it may be due to a reduction in prioritization of the resilver due to other I/O going on: If there is other work going on, then you might be hitting the resilver throttle. By default, it will delay 2 clock ticks, if needed. It can be turned off temporarily using:
    echo zfs_resilver_delay/W0t0 | mdb -kw
To return to normal:
    echo zfs_resilver_delay/W0t2 | mdb -kw
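For anyone who wants to script the workaround above, here is a rough sketch of a wrapper around the same mdb one-liner. The function names (resilver_delay_cmd, set_resilver_delay) and the APPLY environment variable are made up for illustration and are not part of any ZFS or mdb tooling. It only prints the command unless APPLY=1 is set, so it can be checked on a machine without mdb.

```shell
#!/bin/sh
# Hypothetical helpers, not part of illumos: build (and optionally run)
# the mdb command quoted in the thread for a given resilver delay.

# Print the mdb write command for a delay given in clock ticks.
# Rejects anything that is not a non-negative decimal integer.
resilver_delay_cmd() {
    delay=$1
    case $delay in
        ''|*[!0-9]*)
            echo "delay must be a non-negative integer" >&2
            return 1
            ;;
    esac
    # 0t marks the value as decimal for mdb, as in the quoted commands.
    printf 'echo zfs_resilver_delay/W0t%s | mdb -kw\n' "$delay"
}

# Dry run by default: only executes the command when APPLY=1 (assumed
# convention for this sketch). Running it for real needs root and
# changes the live kernel, so review the printed command first.
set_resilver_delay() {
    cmd=$(resilver_delay_cmd "$1") || return 1
    if [ "${APPLY:-0}" = "1" ]; then
        sh -c "$cmd"
    else
        echo "$cmd"
    fi
}
```

Calling set_resilver_delay 0 prints the "turn off" command from the thread, and set_resilver_delay 2 prints the one that restores the default. A change made this way is to the running kernel only and should not survive a reboot. The current value can usually be read with echo zfs_resilver_delay/D | mdb -k, though that is worth confirming on your own build.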