Bug #5347
closed
idle pool may run itself out of space
100%
Description
After receiving an incremental send stream, an idle pool will slowly fill up.
If allowed to become completely full, the pool is unusable and can only be
imported readonly. Any write activity on the pool will clean up the extra
space.
Steps to reproduce:
zpool create test c2t1d0
zfs create test/fs
zfs snapshot test/fs@a
zfs snapshot test/fs@b
zfs send test/fs@a | zfs recv test/recvd
zfs send -i @a test/fs@b | zfs recv test/recvd
Observing the system after this:
zpool list test 5
NAME  SIZE   ALLOC  FREE   EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
test  7.94G  2.32M  7.94G  -         0%    0%   1.00x  ONLINE  -
test  7.94G  3.09M  7.93G  -         0%    0%   1.00x  ONLINE  -
test  7.94G  3.45M  7.93G  -         0%    0%   1.00x  ONLINE  -
test  7.94G  4.20M  7.93G  -         0%    0%   1.00x  ONLINE  -
test  7.94G  4.56M  7.93G  -         0%    0%   1.00x  ONLINE  -
test  7.94G  5.20M  7.93G  -         0%    0%   1.00x  ONLINE  -
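To get a feel for how slow "slowly" is, the ALLOC samples above can be turned into a rough growth rate. This is only a sketch: it assumes the growth stays linear (the report does not say so), and that the M and G suffixes are binary units.

```python
# ALLOC samples (MiB) copied from the `zpool list test 5` output above,
# taken 5 seconds apart.
samples_mib = [2.32, 3.09, 3.45, 4.20, 4.56, 5.20]
interval_s = 5


def growth_rate(samples, interval):
    """Average growth in MiB/s between the first and the last sample."""
    return (samples[-1] - samples[0]) / (interval * (len(samples) - 1))


rate = growth_rate(samples_mib, interval_s)
pool_mib = 7.94 * 1024  # SIZE column (7.94G), assumed to be GiB

# ASSUMPTION: growth continues at the same average rate until the pool is full.
hours_to_fill = (pool_mib - samples_mib[-1]) / rate / 3600

print(f"{rate:.3f} MiB/s, roughly {hours_to_fill:.0f} hours to fill")
```

On these six samples the average works out to a little over 0.1 MiB/s, so under the linear-growth assumption an untouched pool of this size would fill in well under a day.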
Analysis:
The extra space is consumed by deferred frees (zdb -bb will tell you this).
If there is an entry in the dp_bptree_obj (and therefore
SPA_FEATURE_ASYNC_DESTROY is active), but the tree is effectively empty
(there's nothing to traverse in it), then we could incorrectly set
scn_async_stalled, leading to this behavior.
spa_sync() will see that dsl_scan_active() is FALSE, and thus conclude that no
changes will be synced this txg, so we don't call spa_sync_deferred_frees().
However, dsl_scan_sync() will see that scn_async_stalled is set, and therefore
try again. Though we don't actually try to process the bptree again, because
SPA_FEATURE_ASYNC_DESTROY is no longer active, we do bpobj_iterate() to process
potential background snapshot destroys. This always dirties the bpobj's bonus
buffer, even if nothing is actually changed. The dirty buffer then needs to be
written out.
Touching a file in the pool caused spa_sync_deferred_frees() to be called, thus
resetting the free space back to where it should be (but the free space
started shrinking again after that).
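The interaction described in the analysis can be illustrated with a toy model. This is not the ZFS code: ToyPool, sync_txg, run_idle and the block counts are hypothetical names and numbers made up for illustration. The step where rewriting the bonus buffer leaves one freed block on the deferred list is an assumption based on copy-on-write behaviour, not something the report spells out.

```python
from dataclasses import dataclass


@dataclass
class ToyPool:
    """Toy stand-in for a pool; counts blocks, not bytes."""
    scn_async_stalled: bool  # flag named in the analysis
    alloc: int = 100         # allocated blocks (arbitrary starting value)
    deferred: int = 0        # frees waiting for spa_sync_deferred_frees()

    def sync_txg(self, user_write: bool = False) -> None:
        # spa_sync() no-op check as described in the analysis: scan activity
        # is reported as FALSE, so only other dirty state (modelled here as
        # user_write) makes the txg look non-empty.
        if user_write:
            # Stands in for spa_sync_deferred_frees(); the space taken by
            # the user write itself is ignored in this model.
            self.alloc -= self.deferred
            self.deferred = 0
        # dsl_scan_sync() retries for as long as the stalled flag is set.
        if self.scn_async_stalled:
            # ASSUMPTION: rewriting the dirtied bonus buffer allocates one
            # new block and puts the free of the old copy on the deferred list.
            self.alloc += 1
            self.deferred += 1


def run_idle(pool: ToyPool, txgs: int) -> int:
    """Sync `txgs` idle transaction groups and return the allocation count."""
    for _ in range(txgs):
        pool.sync_txg()
    return pool.alloc


buggy = ToyPool(scn_async_stalled=True)
print(run_idle(buggy, 5))        # allocation creeps up while the pool is idle
buggy.sync_txg(user_write=True)  # e.g. touching a file
print(buggy.alloc)               # deferred frees reclaimed, then growth resumes

fixed = ToyPool(scn_async_stalled=False)
print(run_idle(fixed, 5))        # stays at the starting value
```

In the model, the stalled flag alone is enough to make every idle txg leak one block, and a user write only resets the counter without clearing the flag, which matches the observation that free space started shrinking again after the touch.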
In addition to fixing the scn_async_stalled issue, we should make the code in
spa_sync() that checks if this is a no-op TXG less fragile.
Updated by Electric Monk over 8 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit 231aab857f14a3e7a0ed5f2879399c3cd6ae92ea
commit 231aab857f14a3e7a0ed5f2879399c3cd6ae92ea
Author: Matthew Ahrens <mahrens@delphix.com>
Date: 2014-12-02T08:37:10.000Z
5347 idle pool may run itself out of space
Reviewed by: Alex Reece <alex.reece@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Steven Hartland <killing@multiplay.co.uk>
Reviewed by: Richard Elling <richard.elling@richardelling.com>
Approved by: Dan McDonald <danmcd@omniti.com>