Bug #9700

ZFS resilvered mirror does not balance reads

Added by Jerry Jelinek about 1 year ago. Updated about 1 year ago.

Status: Closed
Priority: Normal
Assignee:
Category: zfs - Zettabyte File System
Start date: 2018-08-03
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage

Description

Using DTrace, we can see that there is no read activity on the resilvered side of a mirror. Here is the state of the "spare" vdev on a production system after the resilver has completed:

> ::spa -v
ADDR STATE NAME
ffffd0631e423000 ACTIVE zones
ADDR STATE AUX DESCRIPTION
ffffd0631b888000 DEGRADED - root
ffffd0631adac000 HEALTHY - mirror
ffffd0631b254000 HEALTHY - /dev/dsk/c1t5000CCA02D81375Dd0s0
ffffd0631a6c6000 HEALTHY - /dev/dsk/c1t5000CCA02D818EE5d0s0
ffffd0631b68e800 DEGRADED - mirror
ffffd0631b66e000 HEALTHY - /dev/dsk/c1t5000CCA02D81CCA5d0s0
ffffd08332c6a000 FAULTED - spare
ffffd0631c001000 FAULTED EXTERNAL /dev/dsk/c1t5000CCA02D81CD25d0s0
ffffd08882cbe000 HEALTHY - /dev/dsk/c1t5000CCA0807BE511d0s0
ffffd062a3d7d800 HEALTHY - mirror
ffffd0631c002000 HEALTHY - /dev/dsk/c1t5000CCA02D81D559d0s0
ffffd0631a66d000 HEALTHY - /dev/dsk/c1t5000CCA02D820875d0s0
ffffd0631a77b800 HEALTHY - mirror
ffffd0631a784000 HEALTHY - /dev/dsk/c1t5000CCA02D821B71d0s0
ffffd0631a83f000 HEALTHY - /dev/dsk/c1t5000CCA02D823861d0s0
ffffd0631b669800 HEALTHY - mirror
ffffd062e96ba000 HEALTHY - /dev/dsk/c1t5000CCA02D826469d0s0
ffffd0631bffb800 HEALTHY - /dev/dsk/c1t5000CCA02D8275B1d0s0
ffffd0631b368000 HEALTHY - mirror
ffffd062a3d83800 HEALTHY - /dev/dsk/c1t5000CCA02D827FF5d0s0
ffffd0631b369800 HEALTHY - /dev/dsk/c1t5000CCA02D830F1Dd0s0
ffffd0631a6d1000 HEALTHY - mirror
ffffd0631b368800 HEALTHY - /dev/dsk/c1t5000CCA02D8410C9d0s0
ffffd0631c44a000 HEALTHY - /dev/dsk/c1t5000CCA08077F1A9d0s0
ffffd0631a6d6000 HEALTHY - /dev/dsk/c1t5000CCA0496F7535d0s0
- - - spares
ffffd0631a7e2800 HEALTHY - /dev/dsk/c1t5000CCA0807BE511d0s0

Note that the spare vdev (ffffd08332c6a000) is still marked FAULTED even though one of its children is a healthy spare disk and the resilver has completed. As a result, vdev_readable() always returns false for the spare vdev, so no reads are sent to that side of the mirror.
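
To illustrate why a FAULTED spare vdev never receives reads, here is a minimal, self-contained model of the readability check. This is a sketch, not the kernel source: the `model_*` names are hypothetical, the enum ordering is assumed to mirror the kernel's vdev state ordering, and the check is assumed to treat any state below DEGRADED as dead. Other conditions the real code may consider (hole or missing vdevs, for example) are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model; ordering is assumed to match the kernel's vdev states. */
typedef enum {
	MODEL_STATE_UNKNOWN = 0,
	MODEL_STATE_CLOSED,
	MODEL_STATE_OFFLINE,
	MODEL_STATE_REMOVED,
	MODEL_STATE_CANT_OPEN,
	MODEL_STATE_FAULTED,
	MODEL_STATE_DEGRADED,
	MODEL_STATE_HEALTHY
} model_state_t;

typedef struct model_vdev {
	model_state_t state;	/* current state of this vdev */
	bool cant_read;		/* set when reads are known to fail */
} model_vdev_t;

/* Assumption: anything below DEGRADED counts as dead. */
static bool
model_vdev_is_dead(const model_vdev_t *vd)
{
	assert(vd != NULL);
	return (vd->state < MODEL_STATE_DEGRADED);
}

/* A vdev is a read candidate only if it is not dead and can be read. */
static bool
model_vdev_readable(const model_vdev_t *vd)
{
	return (!model_vdev_is_dead(vd) && !vd->cant_read);
}
```

In this model a spare vdev left in the FAULTED state is never readable, regardless of the state of its children, while the same vdev in the DEGRADED state is.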

History

#1

Updated by Jerry Jelinek about 1 year ago

Here is my proposed fix:

--- a/usr/src/uts/common/fs/zfs/spa.c
+++ b/usr/src/uts/common/fs/zfs/spa.c
@@ -27,7 +27,7 @@
  * Copyright 2013 Saso Kiselkov. All rights reserved.
  * Copyright (c) 2014 Integros [integros.com]
  * Copyright 2016 Toomas Soome <tsoome@me.com>
- * Copyright 2017 Joyent, Inc.
+ * Copyright 2018 Joyent, Inc.
  * Copyright (c) 2017 Datto Inc.
  * Copyright 2018 OmniOS Community Edition (OmniOSce) Association.
  */
@@ -6535,6 +6535,7 @@ spa_vdev_resilver_done_hunt(vdev_t *vd)

        /*
         * Check for a completed resilver with the 'unspare' flag set.
+        * Also potentially update faulted state.
         */
        if (vd->vdev_ops == &vdev_spare_ops) {
                vdev_t *first = vd->vdev_child[0];
@@ -6557,6 +6558,26 @@ spa_vdev_resilver_done_hunt(vdev_t *vd)
                        return (oldvd);

                /*
+                * We know the vdev is a spare (e.g. "spare-1") which just
+                * finished resilvering. If it's faulted, and one of the
+                * children is healthy, then set the spare's state to degraded
+                * so that it will handle read operations.
+                */
+               if (vd->vdev_state == VDEV_STATE_FAULTED &&
+                   vd->vdev_children >= 2) {
+                       int i;
+
+                       for (i = 0; i < vd->vdev_children; i++) {
+                               if (vd->vdev_child[i]->vdev_state ==
+                                   VDEV_STATE_HEALTHY) {
+                                       vdev_set_state(vd, B_FALSE,
+                                           VDEV_STATE_DEGRADED, VDEV_AUX_NONE);
+                                       break;
+                               }
+                       }
+               }
+
+               /*
                 * If there are more than two spares attached to a disk,
                 * and those spares are not required, then we want to
                 * attempt to free them up now so that they can be used

For testing, I created a small zpool in a VM with 3 virtual disks (2 in the mirror, 1 spare). I used fminject to inject 11 IO errors onto one side of the mirror (zinject would also work). After the spare was brought in and resilvered, this was the resulting state:

> ::spa -v
ADDR                 STATE NAME
fffffe090b6a4000    ACTIVE jjpool

    ADDR             STATE     AUX          DESCRIPTION
    fffffe090c7b4800 DEGRADED  -              mirror
    fffffe090c7b6800 HEALTHY   -                /dev/dsk/c1t2d0s0
    fffffe0d267ec800 DEGRADED  -                spare
    fffffe090c7b8800 FAULTED   ERR_EXCEEDED       /dev/dsk/c1t3d0s0
    fffffe0d26665800 HEALTHY   -                  /dev/dsk/c1t4d0s0

Using DTrace, I confirmed that read IO is now occurring on both c1t2d0 and c1t4d0.

#2

Updated by Electric Monk about 1 year ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit 82f63c3c2bf5e4378706e8dcfccf717d67371be9

commit  82f63c3c2bf5e4378706e8dcfccf717d67371be9
Author: Jerry Jelinek <jerry.jelinek@joyent.com>
Date:   2018-08-30T17:10:13.000Z

    9700 ZFS resilvered mirror does not balance reads
    Reviewed by: Toomas Soome <tsoome@me.com>
    Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
    Reviewed by: George Wilson <george.wilson@delphix.com>
    Approved by: Matthew Ahrens <mahrens@delphix.com>
