Feature #9238

ZFS Spacemap Encoding V2

Added by Serapheim Dimitropoulos over 1 year ago. Updated over 1 year ago.

Status: Closed
Priority: Normal
Category: -
Start date: 2018-03-05
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage

Description

The current space map encoding has the following disadvantages:
[1] Assuming a 512-byte sector size, each entry can represent at most 16MB
for a segment. This makes the encoding very inefficient for large regions
of space.
[2] As vdev-wide space maps have started to be used by new features (e.g.
device removal, zpool checkpoint), we've started imposing limits on the
vdevs that can be used with them, based on the maximum addressable offset
(currently 64PB for a top-level vdev).
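
Both figures follow from the field widths of the existing one-word entry (a 15-bit run and a 47-bit offset, per the layout in this description). A minimal sketch of the arithmetic, assuming 512-byte sectors (a shift of 9) and assuming each field can express up to 2^bits units. The `ex_` helper names are hypothetical and not part of ZFS:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes covered by a field that is `bits` wide and counted in units of
 * (1 << shift) bytes. Assumes the field can express up to 2^bits units.
 */
static uint64_t
ex_field_span_bytes(unsigned bits, unsigned shift)
{
	assert(bits + shift < 64);	/* result must fit in 64 bits */
	return (1ULL << (bits + shift));
}

/* 15-bit run, 512-byte sectors: 2^(15+9) = 2^24 bytes = 16MB per entry */
static uint64_t
ex_max_run_bytes(void)
{
	return (ex_field_span_bytes(15, 9));
}

/* 47-bit offset, 512-byte sectors: 2^(47+9) = 2^56 bytes = 64PB */
static uint64_t
ex_max_offset_bytes(void)
{
	return (ex_field_span_bytes(47, 9));
}
```

With a larger sector shift both limits grow accordingly, but the per-entry segment size stays small relative to the address range.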

The new spacemap encoding looks like this:


/*
 * debug entry
 *
 *     2     2        10                      50
 *  +-----+-----+------------+----------------------------------+
 *  | 1 0 | act |  syncpass  |        txg (lower bits)          |
 *  +-----+-----+------------+----------------------------------+
 *   63 62 61 60 59        50 49                               0
 *
 *
 * one-word entry
 *
 *    1                47                  1         15
 *  +---+------------------------------+------+--------------+
 *  | 0 |   offset (sm_shift units)    | type |     run      |
 *  +---+------------------------------+------+--------------+
 *   63  62                          16   15   14           0
 *
 *
 * two-word entry
 *
 *     2     2               36                      24
 *  +-----+-----+---------------------------+-------------------+
 *  | 1 1 | pad |            run            |       vdev        |
 *  +-----+-----+---------------------------+-------------------+
 *   63 62 61 60 59                       24 23                0
 *
 *     1                            63
 *  +------+----------------------------------------------------+
 *  | type |                      offset                        |
 *  +------+----------------------------------------------------+
 *     63   62                                                 0
 *
 * Note that a two-word entry will not straddle a block boundary.
 * If necessary, the last word of a block will be padded with a
 * debug entry (with act = syncpass = txg = 0).
 */
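
The layouts above can be read back with plain shifts and masks. The following is a sketch only: the `ex_` names are hypothetical and do not come from the ZFS sources, and the decoders return raw field values. Whether a stored field is biased (for example, run stored as run minus one) is not stated in this description, so no adjustment is applied here.

```c
#include <assert.h>
#include <stdint.h>

/* Extract `len` bits of `word` starting at bit `low` (bit 0 = least significant). */
static uint64_t
ex_bits(uint64_t word, unsigned low, unsigned len)
{
	assert(len >= 1 && len <= 63 && low + len <= 64);
	return ((word >> low) & ((1ULL << len) - 1));
}

enum ex_kind { EX_ONE_WORD, EX_DEBUG, EX_TWO_WORD };

/* Classify a word by its prefix: 0 = one-word, 10 = debug, 11 = two-word. */
static enum ex_kind
ex_classify(uint64_t word)
{
	if (ex_bits(word, 63, 1) == 0)
		return (EX_ONE_WORD);
	return (ex_bits(word, 62, 1) == 0 ? EX_DEBUG : EX_TWO_WORD);
}

/* Raw field values; no unit conversion or bias is applied. */
struct ex_entry {
	uint64_t offset;
	uint64_t run;
	uint64_t vdev;	/* only meaningful for two-word entries */
	uint64_t type;
};

/* One-word entry: offset in bits 62..16, type in bit 15, run in bits 14..0. */
static struct ex_entry
ex_decode_one_word(uint64_t w)
{
	struct ex_entry e = { 0 };

	e.offset = ex_bits(w, 16, 47);
	e.type = ex_bits(w, 15, 1);
	e.run = ex_bits(w, 0, 15);
	return (e);
}

/* Two-word entry: run and vdev in the first word, type and offset in the second. */
static struct ex_entry
ex_decode_two_word(uint64_t w0, uint64_t w1)
{
	struct ex_entry e = { 0 };

	e.run = ex_bits(w0, 24, 36);
	e.vdev = ex_bits(w0, 0, 24);
	e.type = ex_bits(w1, 63, 1);
	e.offset = ex_bits(w1, 0, 63);
	return (e);
}
```

A reader would call `ex_classify()` on each word first, since the prefix decides whether the next word belongs to the same entry.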

The new encoding remains backwards compatible with the old one. The introduced
two-word entry format, besides extending the limits imposed by the one-word
entry layout, also includes a vdev field and some extra padding after its prefix.
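
One way a writer could combine the two formats is to keep using the one-word form whenever the fields fit and fall back to the two-word form otherwise, inserting a padding debug entry when only one word is left in a block. This is a sketch under stated assumptions, not the actual implementation: the `ex_writer` structure and function names are hypothetical, the field values are written raw, and the caller is assumed not to need the vdev recorded when the one-word form is chosen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Debug entry with act = syncpass = txg = 0: only the "1 0" prefix is set. */
#define	EX_PAD_WORD	(2ULL << 62)

/* Hypothetical writer: a flat word array divided into fixed-size blocks. */
struct ex_writer {
	uint64_t *words;
	size_t capacity;	/* total words available */
	size_t used;		/* words written so far */
	size_t block_words;	/* words per block, at least 2 */
};

/* An entry fits the one-word form if offset fits 47 bits and run fits 15. */
static int
ex_fits_one_word(uint64_t offset, uint64_t run)
{
	return (offset < (1ULL << 47) && run < (1ULL << 15));
}

/* Append one entry; returns the number of words written, padding included. */
static size_t
ex_write_entry(struct ex_writer *w, uint64_t offset, uint64_t run,
    uint64_t type, uint64_t vdev)
{
	size_t start = w->used;

	assert(type <= 1 && w->block_words >= 2);
	if (ex_fits_one_word(offset, run)) {
		assert(w->used + 1 <= w->capacity);
		/* Bit 63 stays 0 because offset is below 2^47. */
		w->words[w->used++] = (offset << 16) | (type << 15) | run;
		return (w->used - start);
	}

	assert(offset < (1ULL << 63) && run < (1ULL << 36));
	assert(vdev < (1ULL << 24));
	/* Only one word left in this block: pad so the pair stays together. */
	if (w->used % w->block_words == w->block_words - 1) {
		assert(w->used + 1 <= w->capacity);
		w->words[w->used++] = EX_PAD_WORD;
	}
	assert(w->used + 2 <= w->capacity);
	w->words[w->used++] = (3ULL << 62) | (run << 24) | vdev;
	w->words[w->used++] = (type << 63) | offset;
	return (w->used - start);
}
```

Because one-word entries are emitted unchanged, a stream that never needs the larger fields looks the same as one written with the old encoding.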

The extra padding after the prefix is reserved for future use (e.g.
new prefixes for future encodings or new fields for flags). The new vdev field
not only makes the space maps more self-descriptive, but also opens the door
to pool-wide space maps.

One final important note is that the number of bits used for vdevs in blkptrs
is reduced to 24. We decided on this because we don't know of any setups that
use more than 16M vdevs for the time being, and we wanted to fit the vdev field
in the space map. In addition, that gives us some extra bits in dva_t.
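
The 16M figure is simply 2^24. A small sketch of the corresponding range check, with hypothetical `ex_` names that are not taken from the ZFS sources:

```c
#include <assert.h>
#include <stdint.h>

#define	EX_VDEV_BITS	24	/* width of the vdev field described above */

/* Number of distinct vdev ids a 24-bit field can hold: 2^24 = 16,777,216. */
static uint64_t
ex_vdev_id_count(void)
{
	return (1ULL << EX_VDEV_BITS);
}

/* True if `vdev` can be stored in the 24-bit field without truncation. */
static int
ex_vdev_fits(uint64_t vdev)
{
	return (vdev < ex_vdev_id_count());
}

/* Return the field value for `vdev`, refusing ids that would be truncated. */
static uint64_t
ex_vdev_encode(uint64_t vdev)
{
	assert(ex_vdev_fits(vdev));
	return (vdev & (ex_vdev_id_count() - 1));
}
```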

History

#1

Updated by Electric Monk over 1 year ago

  • % Done changed from 0 to 100
  • Status changed from New to Closed

git commit 17f11284b49b98353b5119463254074fd9bc0a28

commit  17f11284b49b98353b5119463254074fd9bc0a28
Author: Serapheim Dimitropoulos <serapheim@delphix.com>
Date:   2018-04-02T16:16:05.000Z

    9238 ZFS Spacemap Encoding V2
    Reviewed by: Matt Ahrens <mahrens@delphix.com>
    Reviewed by: George Wilson <gwilson@zfsmail.com>
    Approved by: Gordon Ross <gwr@nexenta.com>
