A faulted pool with only unavailable vdevs triggers assertion failure in libzfs
How to reproduce:
1. Create a pool on a single file vdev, truncate the backing file, and reboot:
root@danbuild:/root# mkfile 64m test
root@danbuild:/root# zpool create test /root/test
root@danbuild:/root# echo -n "" > test
root@danbuild:/root# reboot -p
2. After reboot:
root@danbuild:/root# zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH   ALTROOT
rpool  15.9G  14.1G  1.79G         -    59%    88%  1.00x  ONLINE   -
test       -      -      -         -      -      -      -  FAULTED  -
root@danbuild:/root# zpool status test
  pool: test
 state: UNAVAIL
status: One or more devices could not be used because the label is missing
        or invalid. There are insufficient replicas for the pool to continue
        functioning.
action: Destroy and re-create the pool from a backup source.
   see: http://illumos.org/msg/ZFS-8000-5E
  scan: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        test          UNAVAIL      0     0     0  insufficient replicas
          /root/test  UNAVAIL      0     0     0  corrupted data
# zpool get all test
NAME  PROPERTY       VALUE                SOURCE
test  size           -                    -
test  capacity       -                    -
test  altroot        -                    default
test  health         FAULTED              -
test  guid           3563445091455472388  -
test  version        -                    -
test  bootfs         -                    -
test  delegation     -                    -
test  autoreplace    -                    -
test  cachefile      -                    default
test  failmode       -                    -
test  listsnapshots  -                    -
test  autoexpand     -                    -
test  dedupditto     -                    -
test  dedupratio     -                    -
test  free           -                    -
test  allocated      -                    -
test  readonly       -                    -
test  comment        -                    default
test  expandsize     -                    -
test  freeing        -                    -
test  fragmentation  -                    -
test  leaked         -                    -
Assertion failed: nvlist_lookup_nvlist(config, "feature_stats", &features) == 0, file ../common/libzfs_config.c, line 250, function zpool_get_features
Abort (core dumped)
The problem is that zpool_get_features asserts that the pool config always contains a "feature_stats" nvlist, i.e. that we have at least some feature information. That is not the case when the pool is faulted and is still listed in /etc/zfs/zpool.cache.
When the SPA loads the pool from the cache, it initialises the features structure to NULL but never gets a chance to actually write anything to it (all vdevs are unavailable), which leads to this situation.
The same issue appears when we run zpool upgrade related commands. I believe this is also the root cause for issue #5484.
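The failure mode described above can be illustrated with a small, self-contained sketch. Everything in it is hypothetical: config_t, entry_t, config_lookup and config_get_features are stand-ins invented for illustration and are not the libnvpair or libzfs API, and this is not necessarily how the committed fix works. It only shows the general idea of treating a missing "feature_stats" entry as a normal runtime condition (return NULL and let the caller decide) rather than asserting that the lookup succeeds.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a pool config; NOT the libnvpair nvlist_t. */
typedef struct entry {
	const char *name;
	const void *value;
} entry_t;

typedef struct config {
	const entry_t *entries;
	size_t count;
} config_t;

/*
 * Mimics the nvlist_lookup_*() return convention: 0 on success,
 * ENOENT when no entry with the given name exists.
 */
int
config_lookup(const config_t *cfg, const char *name, const void **out)
{
	/* Asserting on programmer errors (bad arguments) is still fine. */
	assert(cfg != NULL && name != NULL && out != NULL);

	for (size_t i = 0; i < cfg->count; i++) {
		if (strcmp(cfg->entries[i].name, name) == 0) {
			*out = cfg->entries[i].value;
			return (0);
		}
	}
	return (ENOENT);
}

/*
 * Tolerant accessor: a config without "feature_stats" (as with a
 * faulted pool loaded from the cache) yields NULL instead of an abort.
 */
const void *
config_get_features(const config_t *cfg)
{
	const void *features = NULL;

	if (config_lookup(cfg, "feature_stats", &features) != 0)
		return (NULL);
	return (features);
}
```

A caller in this sketch would check the result for NULL and skip feature reporting for that pool, so that listing properties or running upgrade-style commands can proceed for a pool whose vdevs are all unavailable.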
Updated by Electric Monk over 5 years ago
- Status changed from New to Closed
- % Done changed from 50 to 100
commit b289d045e084af53efcc025255af8242e41f28fa
Author: Dan Vatca <email@example.com>
Date: 2015-12-02T19:56:44.000Z

    6358 A faulted pool with only unavailable vdevs triggers assertion failure in libzfs
    Reviewed by: Matthew Ahrens <firstname.lastname@example.org>
    Reviewed by: Andrew Stormont <email@example.com>
    Reviewed by: Serban Maduta <firstname.lastname@example.org>
    Approved by: Dan McDonald <email@example.com>