Bug #3805

arc shouldn't cache freed blocks

Added by Christopher Siden over 3 years ago. Updated over 3 years ago.

Status: Closed
Start date: 2013-06-06
Priority: Normal
Due date:
Assignee: Christopher Siden
% Done: 100%
Category: zfs - Zettabyte File System
Target version: -
Difficulty: Medium
Tags: needs-triage

Description

From Matt Ahrens's bug report at Delphix:

ZFS should proactively evict freed blocks from the cache.

On dcenter, we saw that we were caching ~256GB of metadata, while the pool only
had <4GB of metadata on disk.  We were wasting about half the system's RAM
(252GB) on blocks that have been freed.

Even though these freed blocks will never be used again, and thus will
eventually be evicted, this causes us to use memory inefficiently for 2
reasons:

1. A block that is freed has no chance of being accessed again, but will be
kept in memory preferentially to a block that was accessed before it (and is
thus older) but has not been freed and thus has at least some chance of being
accessed again.

2. We partition the ARC into several buckets:
  • user data that has been accessed only once (MRU)
  • metadata that has been accessed only once (MRU)
  • user data that has been accessed more than once (MFU)
  • metadata that has been accessed more than once (MFU)

The user data vs metadata split is somewhat arbitrary, and the primary control
on how much memory is used to cache data vs metadata is to simply try to keep
the proportion the same as it has been in the past (each bucket "evicts
against" itself).  The secondary control is to evict data before evicting
metadata.

Because of this bucketing, we may end up with one bucket mostly containing
freed blocks that are very old, while another bucket has more recently
accessed, still-allocated blocks.  Data in the useful bucket (with
still-allocated blocks) may be evicted in preference to data in the useless
bucket (with old, freed blocks).
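
The bucketing effect described above can be illustrated with a toy model. This is only a sketch, not the ARC code: the `Bucket` class, the fixed per-bucket `target`, and the block names are all hypothetical, and the real ARC adapts its targets over time rather than keeping them fixed. It shows only how a bucket that evicts against itself can hold on to old freed blocks while a small neighbouring bucket churns through live ones.

```python
from collections import OrderedDict


class Bucket:
    """Hypothetical model of one cache bucket: an LRU list with its own size target."""

    def __init__(self, name, target):
        self.name = name
        self.target = target          # max number of blocks kept (fixed here for simplicity)
        self.blocks = OrderedDict()   # block id -> freed flag, oldest entry first
        self.evicted = []             # ids evicted from this bucket, in eviction order

    def insert(self, block_id, freed=False):
        self.blocks[block_id] = freed
        self.blocks.move_to_end(block_id)
        # "Evicts against itself": only this bucket's own oldest blocks are candidates.
        while len(self.blocks) > self.target:
            old_id, _ = self.blocks.popitem(last=False)
            self.evicted.append(old_id)

    def freed_count(self):
        return sum(1 for freed in self.blocks.values() if freed)


mru_meta = Bucket("mru_metadata", target=8)
mfu_meta = Bucket("mfu_metadata", target=2)

# The large bucket fills with blocks that have since been freed on disk.
for i in range(8):
    mru_meta.insert(f"old-{i}", freed=True)

# Live, frequently needed metadata cycles through the small bucket.
for i in range(5):
    mfu_meta.insert(f"live-{i}")

print(mru_meta.freed_count(), mru_meta.evicted)  # freed blocks still cached, none evicted
print(mfu_meta.evicted)                          # live blocks pushed out of the small bucket
```

In this model nothing ever pushes the freed blocks out of the large bucket, because pressure in the small bucket is only relieved by evicting from the small bucket.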

On dcenter, we saw that the MFU metadata bucket was 230MB, while the MFU data
bucket was 27GB and the MRU metadata bucket was 256GB.  However, the vast
majority of data in the MRU metadata bucket (256GB) was freed blocks, and thus
useless.  Meanwhile, the MFU metadata bucket (230MB) was constantly evicting
useful blocks that will soon be needed.

Cache segmentation is a larger problem that needs more investigation.
However, if we stop caching freed blocks, it should reduce the impact of
this more fundamental issue.
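
As a rough illustration of the proposed behavior, here is a minimal sketch of a single LRU cache with and without eviction on free. It is not the actual fix in the commit below: `ToyCache`, its `evict_on_free` flag, and the `run` helper are hypothetical names, and the real ARC is considerably more involved.

```python
from collections import OrderedDict


class ToyCache:
    """Hypothetical single-list LRU cache used to compare the two policies."""

    def __init__(self, capacity, evict_on_free):
        self.capacity = capacity
        self.evict_on_free = evict_on_free
        self.blocks = OrderedDict()  # block id -> freed flag, oldest entry first

    def access(self, block_id):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)   # refresh recency on a hit
        else:
            self.blocks[block_id] = False       # cache the block on a miss
        while len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)     # evict the least recently used block

    def free(self, block_id):
        if block_id not in self.blocks:
            return
        if self.evict_on_free:
            del self.blocks[block_id]           # proactively drop the freed block
        else:
            self.blocks[block_id] = True        # stays cached until it ages out

    def live_blocks(self):
        return [b for b, freed in self.blocks.items() if not freed]


def run(evict_on_free):
    cache = ToyCache(capacity=3, evict_on_free=evict_on_free)
    cache.access("A")   # oldest block, still allocated
    cache.access("B")
    cache.access("C")
    cache.free("B")     # B and C are freed after being cached
    cache.free("C")
    cache.access("D")   # a new block arrives while the cache is full
    return cache.live_blocks()


print(run(evict_on_free=False))  # A is evicted to make room; freed B and C stay
print(run(evict_on_free=True))   # B and C are dropped on free, so A survives
```

Without eviction on free, the older but still-allocated block A is pushed out in favour of the newer freed blocks, which is the first inefficiency described in the report. With eviction on free, the space goes back to blocks that can still be accessed.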

History

#1 Updated by Christopher Siden over 3 years ago

  • Status changed from In Progress to Closed

commit 6e6d586
Author: Matthew Ahrens <mahrens@delphix.com>
Date:   Fri Jun 7 21:29:06 2013

    3805 arc shouldn't cache freed blocks
    Reviewed by: George Wilson <george.wilson@delphix.com>
    Reviewed by: Christopher Siden <christopher.siden@delphix.com>
    Reviewed by: Richard Elling <richard.elling@dey-sys.com>
    Reviewed by: Will Andrews <will@firepipe.net>
    Approved by: Dan McDonald <danmcd@nexenta.com>
