misc. metaslab and vdev related ZoL bug fixes
This is a collection of recent fixes from ZoL:
8eef997679b Error path in metaslab_load_impl() forgets to drop ms_sync_lock
928e8ad47d3 Introduce auxiliary metaslab histograms
425d3237ee8 Get rid of space_map_update() for ms_synced_length
6c926f426a2 Simplify log vdev removal code
21e7cf5da89 zdb -L should skip leak detection altogether
df72b8bebe0 Rename range_tree_verify to range_tree_verify_not_present
75058f33034 Remove unused vdev_t fields
Updated by Joshua M. Clulow 6 months ago
Notes from original commits¶
8eef99767 Error path in metaslab_load_impl() forgets to drop ms_sync_lock¶
Error path in metaslab_load_impl() forgets to drop ms_sync_lock
928e8ad47 Introduce auxiliary metaslab histograms¶
Introduce auxiliary metaslab histograms This patch introduces 3 new histograms per metaslab. These histograms track segments that have made it to the metaslab's space map histogram (and are part of the spacemap) but have not yet reached the ms_allocatable tree on loaded metaslab's because these metaslab's are currently syncing and haven't gone through metaslab_sync_done() yet. The histograms help when we decide whether to load an unloaded metaslab in-order to allocate from it. When calculating the weight of an unloaded metaslab traditionally, we look at the highest bucket of its spacemap's histogram. The problem is that we are not guaranteed to be able to allocated that segment when we load the metaslab because it may still be at the freeing, freed, or defer trees. The new histograms are used when we try to calculate an unloaded metaslab's weight to deal with this issue by removing segments that have would not be in the allocatable tree at runtime. Note, that this method of dealing with this is not completely accurate as adjacent segments are not always consolidated in the space map histogram of a metaslab. In addition and to make things deterministic, we always reset the weight of unloaded metaslabs based on their space map weight (instead of doing that on a need basis). Thus, every time a metaslab is loaded and its weight is reset again (from the weight based on its space map to the one based on its allocatable range tree) we expect (and assert) that this change in weight can only get better if it doesn't stay the same.
425d3237e Get rid of space_map_update() for ms_synced_length¶
Get rid of space_map_update() for ms_synced_length Initially, metaslabs and space maps used to be the same thing in ZFS. Later, we started differentiating them by referring to the space map as the on-disk state of the metaslab, making the metaslab a higher-level concept that is metadata that deals with space accounting. Today we've managed to split that code furthermore, with the space map being its own on-disk data structure used in areas of ZFS besides metaslabs (e.g. the vdev-wide space maps used for zpool checkpoint or vdev removal features). This patch refactors the space map code to further split the space map code from the metaslab code. It does so by getting rid of the idea that the space map can have a different in-core and on-disk length (sm_length vs smp_length) which is something that is only used for the metaslab code, and other consumers of space maps just have to deal with. Instead, this patch introduces changes that move the old in-core length of the metaslab's space map to the metaslab structure itself (see ms_synced_length field) while making the space map code only care about the actual space map's length on-disk. The result of this is that space map consumers no longer have to deal with syncing two different lengths for the same structure (e.g. space_map_update() goes away) while metaslab specific behavior stays within the metaslab code. Specifically, the ms_synced_length field keeps track of the amount of data metaslab_load() can read from the metaslab's space map while working concurrently with metaslab_sync() that may be appending to that same space map. As a side note, the patch also adds a few comments around the metaslab code documenting some assumptions and expected behavior.
6c926f426 Simplify log vdev removal code¶
Simplify log vdev removal code Get rid of the majority metaslab metadata when removing log vdevs in spa_vdev_remove_log() with a call to metaslab_fini() instead of duplicating a lot of that in vdev_remove_empty_log().
21e7cf5da zdb -L should skip leak detection altogether¶
zdb -L should skip leak detection altogether Currently the point of -L option in zdb is to disable leak tracing and the loading of space maps because they are expensive, yet still do leak detection in terms of space. Unfortunately, there is a scenario where this is a lie. If we are using zdb -L on a pool where a vdev is being removed, zdb_claim_removing() will open the metaslab space maps of that device. This patch makes it so zdb -L skips leak detection altogether and ensures that no space maps are loaded.
df72b8beb Rename range_tree_verify to range_tree_verify_not_present¶
Rename range_tree_verify to range_tree_verify_not_present The range_tree_verify function looks for a segment in a range tree and panics if the segment is present on the tree. This patch gives the function a more descriptive name.
75058f330 Remove unused vdev_t fields¶
Remove unused vdev_t fields The following fields from the vdev_t struct are not used anywhere.
Updated by Electric Monk 5 months ago
- Status changed from New to Closed
- % Done changed from 0 to 100
commit 555d674d5d4b8191dc83723188349d28278b2431 Author: Serapheim Dimitropoulos <email@example.com> Date: 2019-04-20T17:44:53.000Z 10592 misc. metaslab and vdev related ZoL bug fixes Portions contributed by: Jerry Jelinek <firstname.lastname@example.org> Reviewed by: Brian Behlendorf <email@example.com> Reviewed by: Giuseppe Di Natale <firstname.lastname@example.org> Reviewed by: George Melikov <email@example.com> Reviewed by: Paul Dagnelie <firstname.lastname@example.org> Reviewed by: Matt Ahrens <email@example.com> Reviewed by: Pavel Zakharov <firstname.lastname@example.org> Reviewed by: Tony Hutter <email@example.com> Reviewed by: Kody Kantor <firstname.lastname@example.org> Approved by: Dan McDonald <email@example.com>