zpool detach on spare fails
I have a ~100TB pool used for Bacula backup storage. A few bad drives plus a bad SAS expander led to some data corruption. While this was going on, a spare took over for a failed drive in one VDEV, and the whole VDEV was then marked as failed because of the SAS expander. A cold reboot reset the expander (it will be replaced later), and apart from two files with corruption, things work well.
zpool status now shows all VDEVs running OK, but one of them, raidz2-5, still lists all seven drives plus the spare (http://paste.ubuntu.com/625800/). Trying to detach the spare gives me:
cannot detach c4t44d0: no valid replicas
According to Mark Musante, this is a known issue:
Yeah, this is a known problem. The DTL on the toplevel shows an outage, and is preventing the removal of the spare even though removing the spare won't make the outage worse.
Unfortunately, for opensolaris anyway, there is no workaround.
You could try doing a full scrub, replacing any disks that show errors, and waiting for the resilver to complete. That may clean up the DTL enough to detach the spare.
I haven't tried all of this because of availability problems (the server is located off-site), but there is clearly a bug here.
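For reference, the sequence Mark suggests could be scripted roughly as follows. This is an untested sketch, not a confirmed workaround: the pool name `tank` is a placeholder, the helper functions `run`, `wait_for_scan` and `cleanup_spare` are made up for illustration, and detecting a running scrub or resilver by grepping `zpool status` for "in progress" is an assumption about the output format. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the scrub / replace / detach sequence suggested above.
# POOL is a placeholder; SPARE is the spare named in the error message.
POOL="${POOL:-tank}"
SPARE="${SPARE:-c4t44d0}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print commands, 0 = actually run them

run() {
    # Print the command; execute it only when DRY_RUN is 0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

wait_for_scan() {
    # Nothing to wait for in dry-run mode.
    [ "$DRY_RUN" = "0" ] || return 0
    # Assumption: a running scrub or resilver shows up as "in progress".
    while zpool status "$POOL" | grep -q "in progress"; do
        sleep 60
    done
}

cleanup_spare() {
    run zpool scrub "$POOL"        # 1. start a full scrub
    wait_for_scan                  # 2. wait until it has finished
    run zpool status -v "$POOL"    # 3. look for disks that show errors
    # 4. For each disk with errors (names depend on the pool):
    #      zpool replace "$POOL" <bad-disk> <new-disk>
    #    then wait for the resilver to complete before continuing.
    run zpool detach "$POOL" "$SPARE"   # 5. retry detaching the spare
}
```

Calling `cleanup_spare` with the defaults prints the three zpool commands without touching the pool. Setting `DRY_RUN=0` runs them for real, but since step 4 is manual, the detach at the end only makes sense once every replacement has finished resilvering.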
Updated by Roy Sigurd Karlsbakk over 8 years ago
Roy Sigurd Karlsbakk wrote:
The resilver to the drive started and finished, so yes, I think it's supposed to be in use. Even so, I can't detach either of the two (c4t5d0 or c4t44d0); both give me the error message "no valid replicas".
The problem persists. Has anything been changed in regard to this in 151?