Feature #16203

closed

zfs: switch refcount tracking from lists to AVL-trees

Added by Andy Fiddaman 4 months ago. Updated about 2 months ago.

Status: Closed
Priority: Normal
Assignee:
Category: zfs - Zettabyte File System
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Bite-size
Tags:
Gerrit CR:
External Bug:

Description

OpenZFS switched its debug refcount tracking from lists to AVL trees, which makes running with refcount tracking enabled much faster:

d057807ede05ce809e9ba1e2b47b12ada0d3b2ed

Author: Alexander Motin <mav@FreeBSD.org>
Date:   Wed Jun 14 11:02:27 2023 -0400

    Switch refcount tracking from lists to AVL-trees.

    With large number of tracked references list searches under the lock
    become too expensive, creating enormous lock contention.

    On my tests with ZFS_DEBUG enabled this increases write throughput
    with 32KB blocks from ~1.2GB/s to ~7.5GB/s.
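
The idea in the commit message above can be sketched as follows: keep tracked references in a search tree keyed on the holder, so that releasing a reference is a keyed lookup rather than a list walk under the lock. This is only an illustration, not the code from the commit. It uses POSIX tsearch()/tfind()/tdelete() as a stand-in for the kernel AVL routines (POSIX does not require these trees to be balanced, so this shows the shape of the lookup, not the AVL guarantee), and every name in it (ref_t, tracked_refcount_t, tracked_add, tracked_remove, ref_dups) is hypothetical. Folding identical holds into a duplicate counter is a shortcut taken for this sketch.

```c
#include <assert.h>
#include <search.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record for one tracked hold: who took it and for how much. */
typedef struct ref {
	const void *ref_holder;
	uint64_t ref_number;
	uint64_t ref_dups;	/* extra identical holds (sketch-only shortcut) */
} ref_t;

typedef struct tracked_refcount {
	void *rc_tree;		/* root handle used by tsearch()/tfind() */
	uint64_t rc_count;
} tracked_refcount_t;

/* Total order on (holder, number) so the tree can be searched by key. */
static int
ref_compare(const void *a, const void *b)
{
	const ref_t *l = a;
	const ref_t *r = b;
	uintptr_t lh = (uintptr_t)l->ref_holder;
	uintptr_t rh = (uintptr_t)r->ref_holder;

	if (lh != rh)
		return (lh < rh ? -1 : 1);
	if (l->ref_number != r->ref_number)
		return (l->ref_number < r->ref_number ? -1 : 1);
	return (0);
}

/* Record a hold; returns 0 on success, -1 if allocation fails. */
static int
tracked_add(tracked_refcount_t *rc, const void *holder, uint64_t number)
{
	ref_t *ref = malloc(sizeof (*ref));
	void *node;

	assert(rc != NULL);
	if (ref == NULL)
		return (-1);
	ref->ref_holder = holder;
	ref->ref_number = number;
	ref->ref_dups = 0;

	node = tsearch(ref, &rc->rc_tree, ref_compare);
	if (node == NULL) {
		free(ref);
		return (-1);
	}
	if (*(ref_t **)node != ref) {
		/* Same key already present: count it instead of inserting. */
		(*(ref_t **)node)->ref_dups++;
		free(ref);
	}
	rc->rc_count += number;
	return (0);
}

/* Drop a hold; returns -1 if no matching hold was recorded. */
static int
tracked_remove(tracked_refcount_t *rc, const void *holder, uint64_t number)
{
	ref_t key = { holder, number, 0 };
	void *node;
	ref_t *ref;

	assert(rc != NULL);
	node = tfind(&key, &rc->rc_tree, ref_compare);
	if (node == NULL)
		return (-1);
	ref = *(ref_t **)node;
	if (ref->ref_dups > 0) {
		ref->ref_dups--;
	} else {
		(void) tdelete(&key, &rc->rc_tree, ref_compare);
		free(ref);
	}
	rc->rc_count -= number;
	return (0);
}
```

With a list, tracked_remove() would have to scan every outstanding hold to find the matching one; with an ordered tree the search follows the comparator instead, which is where the reduced time under the lock comes from.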

We should also pull in the following commit as a prerequisite:

e829a865bf6482a0d62913f528e516ce3b759d7b

Author: Alexander Motin <mav@FreeBSD.org>
Date:   Tue Aug 17 11:44:34 2021 -0400

    Use more atomics in refcounts

    Use atomic_load_64() for zfs_refcount_count() to prevent torn reads
    on 32-bit platforms.  On 64-bit ones it should not change anything.

    When built with ZFS_DEBUG but running without tracking enabled use
    atomics instead of mutexes same as for builds without ZFS_DEBUG.
    Since rc_tracked can't change live we can check it without lock.
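
A rough sketch of the pattern this prerequisite describes, under stated assumptions: C11 <stdatomic.h> and a pthread mutex stand in for the kernel's atomic and mutex primitives, and all names (sketch_refcount_t and the sketch_refcount_* functions) are hypothetical. The tracking flag is set once at creation, so it is read without the lock; the untracked path uses only atomics, and the count is always read with an atomic load.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct sketch_refcount {
	_Atomic uint64_t rc_count;
	pthread_mutex_t rc_mtx;	/* only taken when rc_tracked is true */
	bool rc_tracked;	/* fixed at creation, so readable unlocked */
} sketch_refcount_t;

static void
sketch_refcount_create(sketch_refcount_t *rc, bool tracked)
{
	atomic_init(&rc->rc_count, 0);
	(void) pthread_mutex_init(&rc->rc_mtx, NULL);
	rc->rc_tracked = tracked;
}

/* Atomic load: avoids a torn 64-bit read on 32-bit platforms. */
static uint64_t
sketch_refcount_count(sketch_refcount_t *rc)
{
	return (atomic_load(&rc->rc_count));
}

/* Take one reference and return the new count. */
static uint64_t
sketch_refcount_add(sketch_refcount_t *rc)
{
	uint64_t count;

	/* Untracked: no lock needed, a single atomic add is enough. */
	if (!rc->rc_tracked)
		return (atomic_fetch_add(&rc->rc_count, 1) + 1);

	(void) pthread_mutex_lock(&rc->rc_mtx);
	/* A tracked build would record the holder here (omitted). */
	count = atomic_fetch_add(&rc->rc_count, 1) + 1;
	(void) pthread_mutex_unlock(&rc->rc_mtx);
	return (count);
}

/* Drop one reference and return the new count. */
static uint64_t
sketch_refcount_remove(sketch_refcount_t *rc)
{
	uint64_t prev;

	if (!rc->rc_tracked) {
		prev = atomic_fetch_sub(&rc->rc_count, 1);
		assert(prev >= 1);
		return (prev - 1);
	}

	(void) pthread_mutex_lock(&rc->rc_mtx);
	/* A tracked build would look up and free the holder record here. */
	prev = atomic_fetch_sub(&rc->rc_count, 1);
	(void) pthread_mutex_unlock(&rc->rc_mtx);
	assert(prev >= 1);
	return (prev - 1);
}
```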


Related issues

Related to illumos gate - Bug #16436: zfs reference tracking panic during zfs_receive_raw_incremental test (Closed, Andy Fiddaman)

Actions #1

Updated by Andy Fiddaman about 2 months ago

  • Description updated (diff)
Actions #2

Updated by Electric Monk about 2 months ago

  • Gerrit CR set to 3396
Actions #3

Updated by Andy Fiddaman about 2 months ago

With this change in place (and with #16436 to fix a related bug), a full ZFS
testsuite run on a system with ZFS reference tracking enabled completes, and
takes around 5% longer than a DEBUG system on which it is not enabled.

Results Summary
PASS     1260
FAIL       8
SKIP      24
KILLED     1

Running Time:   04:39:41
Percent passed: 97.4%
Log directory:  /var/tmp/test_results/20240401T233017
Actions #4

Updated by Andy Fiddaman about 2 months ago

  • Related to Bug #16436: zfs reference tracking panic during zfs_receive_raw_incremental test added
Actions #5

Updated by Andy Fiddaman about 2 months ago

  • Gerrit CR deleted (3396)
Actions #6

Updated by Electric Monk about 2 months ago

  • Gerrit CR set to 3399
Actions #7

Updated by Andy Fiddaman about 2 months ago

As part of testing the fix for #16436, I ran the ZFS testsuite on the same VM as above but without this switch to AVL trees. It was significantly slower, taking roughly 2.4 times as long as the 04:39:41 run above:

Running Time:   11:16:57
Actions #8

Updated by Andy Fiddaman about 2 months ago

I have also tested that the updated mdb plugin can properly enumerate references. This is how I debugged #16436, which was discovered during testing of this change.

Actions #9

Updated by Electric Monk about 2 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

git commit 9a8c5287524e3b4b2a0ef402601f72286e97547f

commit  9a8c5287524e3b4b2a0ef402601f72286e97547f
Author: Alexander Motin <mav@FreeBSD.org>
Date:   2024-04-09T15:21:32.000Z

    16203 zfs: switch refcount tracking from lists to AVL-trees
    Reviewed by: Andy Fiddaman <illumos@fiddaman.net>
    Reviewed by: Bill Sommerfeld <sommerfeld@hamachi.org>
    Reviewed by: Toomas Soome <tsoome@me.com>
    Approved by: Dan McDonald <danmcd@mnx.io>
