Support #10566
Multiple DVA Scrubbing Fix (closed)
Start date: 2019-03-19
% Done: 100%
Description
ZoL PR 8453
Author: Tom Caputi <tcaputi@datto.com>
Date: Fri Mar 15 17:14:31 2019 -0400

Multiple DVA Scrubbing Fix

Currently, there is an issue in the sequential scrub code which prevents self healing from working in some cases. The scrub code will split up all DVA copies of a bp and issue each of them separately. The problem is that, since each of the DVAs is no longer associated with the others, the self healing code doesn't have the opportunity to repair problems that show up in one of the DVAs with the data from the others.

This patch fixes this issue by ensuring that all IOs issued by the sequential scrub code include all DVAs. Initially, only the first DVA of each is attempted. If an issue arises, the IO is retried with all available copies, giving the self healing code a chance to correct the issue.

To test this change, this patch also adds the ability for zinject to specify individual DVAs to inject read errors into. We then add a new test case that utilizes this functionality to ensure scrubs and self-healing reads can handle and transparently fix issues with individual copies of blocks.
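The retry behaviour described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the ZFS implementation: `Block`, `make_block`, `inject_read_error`, and `scrub_io` are hypothetical names, CRC32 stands in for the block checksum, and the `first` parameter (which copy an IO tries first) is an assumption made for illustration. `inject_read_error` merely imitates the idea of per-DVA error injection and does not reflect zinject's actual interface.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical stand-in for a bp: one checksum, one payload per DVA copy."""
    checksum: int
    copies: list  # list of bytes objects, one per DVA


def make_block(data: bytes, ncopies: int) -> Block:
    """Create a block whose copies all hold the same data."""
    return Block(zlib.crc32(data), [data] * ncopies)


def inject_read_error(block: Block, dva: int) -> None:
    """Corrupt a single copy (imitates per-DVA error injection; not zinject)."""
    block.copies[dva] = b"\xff" + block.copies[dva][1:]


def scrub_io(block: Block, first: int = 0):
    """Model one scrub IO that carries every copy of the block.

    Only the copy at index `first` is tried initially. If it fails
    verification, the IO is retried against all copies and any damaged
    copy is rewritten from a good one. Returns (data, repaired_indices).
    """
    def is_good(buf: bytes) -> bool:
        return zlib.crc32(buf) == block.checksum

    # First attempt: a single copy only.
    if is_good(block.copies[first]):
        return block.copies[first], []

    # Retry with all available copies and look for a healthy one.
    healthy = next((buf for buf in block.copies if is_good(buf)), None)
    if healthy is None:
        raise IOError("no valid copy of the block is available")

    # Self-heal: overwrite every damaged copy with the healthy data.
    repaired = []
    for i, buf in enumerate(block.copies):
        if not is_good(buf):
            block.copies[i] = healthy
            repaired.append(i)
    return healthy, repaired
```

In this model, corrupting copy 1 of a three-copy block and then calling `scrub_io(blk, first=1)` returns the original data and reports `[1]` as repaired. If each copy were checked in isolation, with no access to the others, the repair step would have nothing to copy from, which is the failure mode the commit message describes.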
This update is a follow-up to #10405.
While porting this update, the following ZoL updates were also included:
551905dd4 vdev_mirror: kstat observables for preferred vdev
d6c6590c5 vdev_mirror: load balancing fixes
9f500936c FreeBSD r256956: Improve ZFS N-way mirror read performance by using load and locality information.
fb40095f5 Disable LBA weighting on files and SSDs
4770aa064 Fix vdev_open_child() race on updating vdev_parent->vdev_nonrot
13d9a004f Fix taskq creation failure in vdev_open_children()
7dcd31883 Cleanup nits from ab7615d92
wait_scrubbed function from bf95a000c Add scrub after resilver zed script
Fixes from Jerry Jelinek:
vdev_mirror_map_init: spa_dsl_pool can be NULL. It seems this issue should also be present in ZoL, but it is hidden by the compiler. The problem was found when illumos was built with gcc 4, while the gcc 7 build was fine. Apparently gcc 7 generated code in which spa_dsl_pool was referenced late, and since other condition(s) failed, we never hit this issue.
vdev_queue_change_io_priority() needs to update spa_queued. The problem was revealed with a DEBUG build.
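The second fix can be pictured with a toy model of queue accounting. This is a sketch under a loud assumption: it treats spa_queued as a per-priority count of queued IOs that is incremented on add and checked and decremented on remove, which the ticket itself does not spell out. `QueueStats`, `IO`, `io_add`, `io_remove`, and `change_priority` are hypothetical names, not the illumos functions.

```python
from collections import Counter


class QueueStats:
    """Hypothetical model: one 'queued' counter per IO priority."""
    def __init__(self):
        self.queued = Counter()


class IO:
    """Hypothetical IO carrying a priority and a queued flag."""
    def __init__(self, priority):
        self.priority = priority
        self.in_queue = False


def io_add(stats, io):
    """Account for an IO entering the queue under its current priority."""
    stats.queued[io.priority] += 1
    io.in_queue = True


def io_remove(stats, io):
    """Account for an IO leaving the queue; underflow means broken accounting."""
    assert stats.queued[io.priority] > 0, "queued counter underflow"
    stats.queued[io.priority] -= 1
    io.in_queue = False


def change_priority(stats, io, new_priority):
    """Change priority while keeping the per-priority counters in step."""
    if io.in_queue:
        # Move the accounting along with the IO: remove under the old
        # priority, then add back under the new one.
        io_remove(stats, io)
        io.priority = new_priority
        io_add(stats, io)
    else:
        io.priority = new_priority
```

If `change_priority` only assigned `io.priority` for a queued IO, the later `io_remove` would decrement a counter that was never incremented and trip the assertion. In this model that is the kind of inconsistency only a build with assertions enabled would reveal.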
Related issues
Updated by Joshua M. Clulow about 4 years ago
- Blocked by Feature #10405: Implement ZFS sorted scans added
Updated by Toomas Soome about 4 years ago
- Description updated (diff)
- Parent task set to #10809
Updated by Joshua M. Clulow about 4 years ago
- Blocked by Bug #10809: Performance optimization of AVL tree comparator functions added
Updated by Toomas Soome about 4 years ago
- Related to Bug #10900: Fix estimated scrub completion time added
Updated by Electric Monk about 4 years ago
- % Done changed from 90 to 100
git commit 12a8814c13fbb1d6d58616cf090ea5815dc107f9
commit 12a8814c13fbb1d6d58616cf090ea5815dc107f9
Author: Tom Caputi <tcaputi@datto.com>
Date: 2019-05-13T20:49:15.000Z

10566 Multiple DVA Scrubbing Fix
Portions contributed by: Toomas Soome <tsoome@me.com>
Portions contributed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Updated by Toomas Soome about 3 years ago
- Tracker changed from Bug to Support
- Status changed from In Progress to Closed
- Difficulty deleted (Medium)