Bug #7330
open
reconsider retiring disks from live pools by fmd
0%
Description
In some cases (cf. #7327, but there are certainly more) fmd retires a disk from the system. If the ZFS vdev does not have enough redundancy, the whole zpool becomes inaccessible (and worse, see #7328 and #7329). Even if those bugs were not present, it is questionable whether retiring the last disk of a vdev and making the whole pool inaccessible is the right way to handle disk problems.
In fact, even when there is enough redundancy, this may lead to data loss. Imagine a RAID1 vdev with two disks, A and B. B has too many errors (say sectors X1, X2, ..., Xn are unreadable) and gets retired from the system. Now suppose that A has a sector Y (different from X1, X2, ..., Xn) that is unreadable or fails its checksum. If B were still present, Y on A could be repaired using the data from B. But since B has been retired, how can one avoid losing the data?
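The two-disk scenario can be checked with a toy model. The sketch below is only an illustration of the argument, not of how ZFS or fmd work internally: each disk is reduced to the set of sector numbers it cannot read correctly, and a sector counts as lost only if it is bad on every disk still attached to the mirror. The sector numbers, `TOTAL_SECTORS`, and the `lost_sectors` helper are made up for the example.

```python
# Toy model of a two-way mirror: each disk is described by the set of
# sector numbers it can no longer read correctly (hypothetical values).
TOTAL_SECTORS = 1000


def lost_sectors(bad_sets):
    """Return the sectors that are unreadable on every attached disk."""
    if not bad_sets:
        # No disks left in the vdev: nothing can be read at all.
        return set(range(TOTAL_SECTORS))
    lost = set(bad_sets[0])
    for bad in bad_sets[1:]:
        lost &= bad  # a sector survives if any one disk can still read it
    return lost


bad_on_a = {500}             # sector Y on disk A
bad_on_b = {10, 11, 12, 13}  # sectors X1..Xn on disk B

lost_with_b_present = lost_sectors([bad_on_a, bad_on_b])
lost_with_b_retired = lost_sectors([bad_on_a])

print(sorted(lost_with_b_present))  # []
print(sorted(lost_with_b_retired))  # [500]
```

With B still attached, the bad sectors of the two disks do not overlap and nothing is lost; once B is retired, sector Y has no other copy.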
In my opinion, a disk fault should therefore first initiate reconstruction onto a hot spare, and the disk should be retired only after that has completed. AFAICT, zpool replace does this properly: it disconnects the disk being replaced from the pool only after the data has been copied to the new one. fmd could therefore initiate "zpool replace" instead of immediately disabling all access to the disk.
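The proposed ordering could look roughly like the following sketch. It is not fmd code and does not reflect any existing fmd interface; `plan_disk_fault_response`, the step names, and the pool and device names are hypothetical. The only real command referenced is `zpool replace pool device new_device`, and the function merely builds that command line rather than running it.

```python
def plan_disk_fault_response(pool, faulty_disk, hot_spare=None):
    """Return the ordered steps for handling a faulty disk (illustrative only).

    The disk is retired last, and only when a hot spare is available to
    take over its data; otherwise it stays online and an admin is notified.
    """
    if hot_spare is None:
        return [
            ("keep-online", faulty_disk),
            ("notify-admin", pool),
        ]
    return [
        # Start copying data to the spare while the faulty disk is still readable.
        ("run", ["zpool", "replace", pool, faulty_disk, hot_spare]),
        ("wait-for-resilver", pool),
        # Only now is it safe to cut off access to the faulty disk.
        ("retire", faulty_disk),
    ]


# Hypothetical pool and device names.
plan = plan_disk_fault_response("tank", "c1t2d0", "c1t3d0")
for step in plan:
    print(step)
```

The point of the ordering is that the faulty disk remains a readable source of data for the whole duration of the resilver.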
Related issues
Updated by Alek Pinchuk almost 7 years ago
- Related to Bug #7327: fmd should not retire a disk when its reference temperature is exceeded added
Updated by Andrew Stormont about 6 years ago
I think the right answer is to have better failure algorithms. If the disk looks like it's on the way out, you want to start resilvering the spare right then and there. Waiting until FMA retires the device is too late.
Updated by Chip Schweiss about 5 years ago
I have to agree with Pavel's analysis. FMD is too aggressive in retiring disks.
Another example of this is when a disk throws a "predictive failure". FMD retires the disk immediately. Such a disk should be replaced gracefully, not offlined immediately, since it still provides redundancy for the vdev.
A simple fix would be to make this configurable, so that disk retiring is an optional action.