Bug #10592


misc. metaslab and vdev related ZoL bug fixes

Added by Jerry Jelinek about 5 years ago. Updated about 5 years ago.

zfs - Zettabyte File System

This is a collection of recent fixes from ZoL:
8eef997679b Error path in metaslab_load_impl() forgets to drop ms_sync_lock
928e8ad47d3 Introduce auxiliary metaslab histograms
425d3237ee8 Get rid of space_map_update() for ms_synced_length
6c926f426a2 Simplify log vdev removal code
21e7cf5da89 zdb -L should skip leak detection altogether
df72b8bebe0 Rename range_tree_verify to range_tree_verify_not_present
75058f33034 Remove unused vdev_t fields


Updated by Joshua M. Clulow about 5 years ago

Notes from original commits

8eef99767 Error path in metaslab_load_impl() forgets to drop ms_sync_lock


Error path in metaslab_load_impl() forgets to drop ms_sync_lock

928e8ad47 Introduce auxiliary metaslab histograms


Introduce auxiliary metaslab histograms

This patch introduces 3 new histograms per metaslab. These
histograms track segments that have made it to the metaslab's
space map histogram (and are part of the space map) but have
not yet reached the ms_allocatable tree on loaded metaslabs
because these metaslabs are currently syncing and haven't
gone through metaslab_sync_done() yet.

The histograms help when we decide whether to load an unloaded
metaslab in order to allocate from it. Traditionally, when
calculating the weight of an unloaded metaslab, we look at the
highest bucket of its space map's histogram. The problem is
that we are not guaranteed to be able to allocate that
segment when we load the metaslab, because it may still be in
the freeing, freed, or defer trees. The new histograms are
used when we calculate an unloaded metaslab's weight to deal
with this issue by removing segments that would not be in the
allocatable tree at runtime. Note that this method is not
completely accurate, as adjacent segments are not always
consolidated in the space map histogram of a metaslab.
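
The adjustment described above can be pictured with a small sketch. It is not the ZFS implementation: the `toy_` names, the bucket count, and the clamping at zero are all assumptions made for illustration. It copies a space map histogram, subtracts each auxiliary histogram from it, and reports the highest bucket that is still non-empty.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_HIST_BUCKETS 32	/* hypothetical bucket count */

/*
 * Remove the segments counted in an auxiliary histogram from hist.
 * Clamping at zero is an assumption of this sketch.
 */
static void
toy_hist_remove(uint64_t *hist, const uint64_t *aux)
{
	for (int i = 0; i < TOY_HIST_BUCKETS; i++)
		hist[i] = (hist[i] >= aux[i]) ? hist[i] - aux[i] : 0;
}

/* Return the index of the highest non-empty bucket, or -1 if none. */
static int
toy_hist_highest_bucket(const uint64_t *hist)
{
	for (int i = TOY_HIST_BUCKETS - 1; i >= 0; i--) {
		if (hist[i] != 0)
			return (i);
	}
	return (-1);
}

/*
 * Estimate which bucket an unloaded metaslab could really allocate
 * from: start from the space map histogram and drop everything the
 * auxiliary histograms say is not yet allocatable.
 */
static int
toy_unloaded_weight_bucket(const uint64_t *sm_hist,
    const uint64_t *aux_hists[], int naux)
{
	uint64_t adjusted[TOY_HIST_BUCKETS];

	for (int i = 0; i < TOY_HIST_BUCKETS; i++)
		adjusted[i] = sm_hist[i];
	for (int a = 0; a < naux; a++)
		toy_hist_remove(adjusted, aux_hists[a]);
	return (toy_hist_highest_bucket(adjusted));
}
```

With one segment in bucket 10 that is still waiting in one of the auxiliary histograms, the sketch falls back to the next populated bucket instead of promising a segment that cannot be allocated yet.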

In addition and to make things deterministic, we always reset
the weight of unloaded metaslabs based on their space map
weight (instead of doing that on an as-needed basis). Thus, every
time a metaslab is loaded and its weight is reset again (from
the weight based on its space map to the one based on its
allocatable range tree) we expect (and assert) that this
change in weight can only get better if it doesn't stay the
same.
425d3237e Get rid of space_map_update() for ms_synced_length


Get rid of space_map_update() for ms_synced_length

Initially, metaslabs and space maps used to be the same thing
in ZFS. Later, we started differentiating them by referring
to the space map as the on-disk state of the metaslab, making
the metaslab a higher-level concept: metadata that deals
with space accounting. Today we've managed to split that code
further, with the space map being its own on-disk data
structure used in areas of ZFS besides metaslabs (e.g. the
vdev-wide space maps used for zpool checkpoint or vdev removal).

This patch refactors the space map code to further separate it
from the metaslab code. It does so by getting rid of the idea
that the space map can have a different in-core and on-disk
length (sm_length vs smp_length), which is something that only
the metaslab code uses but that other consumers of space maps
have had to deal with anyway. Instead, this patch moves the
old in-core length of the metaslab's space map into the
metaslab structure itself (see the ms_synced_length field)
while making the space map code only care about the actual
length of the space map on disk.

The result of this is that space map consumers no longer have
to deal with syncing two different lengths for the same
structure (e.g. space_map_update() goes away) while
metaslab-specific behavior stays within the metaslab code.
Specifically, the ms_synced_length field keeps track of the
amount of data metaslab_load() can read from the metaslab's
space map while working concurrently with metaslab_sync(),
which may be appending to that same space map.
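
The split described here can be sketched with a toy append-only log. Everything below is hypothetical (the `toy_` types, the fixed-size array, and the point at which the synced length is published are assumptions), and it leaves out locking entirely. The log itself tracks a single length, while the owning structure remembers how much of the log a loader is allowed to read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SM_MAX 16	/* hypothetical capacity */

/* The log knows only its own length, like the on-disk length. */
typedef struct toy_space_map {
	int64_t entries[TOY_SM_MAX];
	size_t length;
} toy_space_map_t;

/* The owner keeps the "safe to read" length, like ms_synced_length. */
typedef struct toy_sm_owner {
	toy_space_map_t sm;
	size_t synced_length;
} toy_sm_owner_t;

static void
toy_owner_init(toy_sm_owner_t *o)
{
	o->sm.length = 0;
	o->synced_length = 0;
}

/* Writer side: append an entry; readers do not see it yet. */
static void
toy_sync_append(toy_sm_owner_t *o, int64_t delta)
{
	assert(o->sm.length < TOY_SM_MAX);
	o->sm.entries[o->sm.length++] = delta;
}

/* Writer side: publish everything appended so far (assumed timing). */
static void
toy_sync_finish(toy_sm_owner_t *o)
{
	o->synced_length = o->sm.length;
}

/* Reader side: sum only the entries below the synced length. */
static int64_t
toy_load(const toy_sm_owner_t *o)
{
	int64_t sum = 0;

	for (size_t i = 0; i < o->synced_length; i++)
		sum += o->sm.entries[i];
	return (sum);
}
```

In this sketch a loader that runs between an append and the publish step simply ignores the newest entries, so the log code never needs a second length of its own.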

As a side note, the patch also adds a few comments around
the metaslab code documenting some assumptions and expected
behavior.
6c926f426 Simplify log vdev removal code


Simplify log vdev removal code

Get rid of the majority of the metaslab metadata when removing
log vdevs in spa_vdev_remove_log() with a call to metaslab_fini()
instead of duplicating a lot of that in vdev_remove_empty_log().

21e7cf5da zdb -L should skip leak detection altogether


zdb -L should skip leak detection altogether

Currently the point of the -L option in zdb is to disable leak
tracing and the loading of space maps because they are expensive,
yet still do leak detection in terms of space. Unfortunately,
there is a scenario where this is a lie. If we are using zdb -L
on a pool where a vdev is being removed, zdb_claim_removing()
will open the metaslab space maps of that device.

This patch makes it so zdb -L skips leak detection altogether
and ensures that no space maps are loaded.

df72b8beb Rename range_tree_verify to range_tree_verify_not_present


Rename range_tree_verify to range_tree_verify_not_present

The range_tree_verify function looks for a segment in a
range tree and panics if the segment is present in the
tree. This patch gives the function a more descriptive
name.
75058f330 Remove unused vdev_t fields


Remove unused vdev_t fields

The following fields from the vdev_t struct are not used anywhere.

Updated by Electric Monk about 5 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit 555d674d5d4b8191dc83723188349d28278b2431

commit 555d674d5d4b8191dc83723188349d28278b2431
Author: Serapheim Dimitropoulos <>
Date:   2019-04-20T17:44:53.000Z

    10592 misc. metaslab and vdev related ZoL bug fixes
    Portions contributed by: Jerry Jelinek <>
    Reviewed by: Brian Behlendorf <>
    Reviewed by: Giuseppe Di Natale <>
    Reviewed by: George Melikov <>
    Reviewed by: Paul Dagnelie <>
    Reviewed by: Matt Ahrens <>
    Reviewed by: Pavel Zakharov <>
    Reviewed by: Tony Hutter <>
    Reviewed by: Kody Kantor <>
    Approved by: Dan McDonald <>

