Feature #6400

ZFS smart compression

Added by Sašo Kiselkov about 5 years ago. Updated almost 5 years ago.

Status: New
Priority: Normal
Category: zfs - Zettabyte File System
Start date: 2015-10-27
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:

Description

ZFS smart compression is a feature developed at Nexenta that automatically tracks per-file compression ratios and disables compression if a file consistently achieves too low a compression ratio to warrant further attempts at compressing it. The algorithm periodically rechecks its previous conclusions about the (in)compressibility of a file using a progressive weighting scheme. The more often compression succeeds, the more hesitant the algorithm is to stop attempting it when a short incompressible transient is encountered; conversely, the more often compression attempts fail, the less frequently the algorithm rechecks (up to some meaningful limits).
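The description above gives the behaviour but not the exact weighting, so the following is only a minimal sketch of one way such a scheme could work, not the code from the patch. The class name, the method names and all three constants are assumptions made for illustration: successes raise a capped "confidence" counter that absorbs short incompressible transients, and repeated failures double a capped skip interval between rechecks.

```python
from dataclasses import dataclass

# Illustrative constants; these are assumptions, not values from the patch.
MAX_CONFIDENCE = 8   # cap on failures tolerated after a run of successes
MAX_SKIP = 64        # cap on writes skipped between two rechecks


@dataclass
class SmartCompressionState:
    """Hypothetical per-file state for a progressive weighting scheme."""
    confidence: int = 1     # failures tolerated before compression is paused
    skip_interval: int = 1  # writes to skip the next time compression pauses
    skip_left: int = 0      # writes still to skip before the next recheck

    def should_compress(self) -> bool:
        # While paused, skip the write and count down to the next recheck.
        if self.skip_left > 0:
            self.skip_left -= 1
            return False
        return True

    def record(self, success: bool) -> None:
        if success:
            # Success: tolerate more future failures, recheck quickly again.
            self.confidence = min(self.confidence * 2, MAX_CONFIDENCE)
            self.skip_interval = 1
            return
        self.confidence -= 1
        if self.confidence <= 0:
            # Too many failures: pause, and pause longer next time.
            self.skip_left = self.skip_interval
            self.skip_interval = min(self.skip_interval * 2, MAX_SKIP)
            self.confidence = 1


def count_attempts(outcomes) -> int:
    """Count compression attempts for a sequence of per-block outcomes.

    Each outcome is True if the block would compress well, False otherwise.
    """
    state = SmartCompressionState()
    attempts = 0
    for compressible in outcomes:
        if state.should_compress():
            attempts += 1
            state.record(compressible)
    return attempts


print(count_attempts([False] * 100))                           # incompressible file
print(count_attempts([True] * 20 + [False] * 3 + [True] * 20)) # short transient
```

In this sketch a fully incompressible file of 100 blocks triggers only a handful of compression attempts, while a compressible file with a three-block incompressible transient is still compressed on every write.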

#1

Updated by Nikola M. almost 5 years ago

I would like to know how to enable or disable this option from an admin perspective in zfs dataset operations
and, if code is linked to the feature, how to build and test it in general, etc.

#2

Updated by Sašo Kiselkov almost 5 years ago

Nikola M. wrote:

I would like to know how to enable or disable this option from an admin perspective in zfs dataset operations
and, if code is linked to the feature, how to build and test it in general, etc.

This functionality is controlled on a per-filesystem basis using the "smartcompression" property (defaults to "on"). Please note, I said "filesystem", not "dataset"; this feature only works on filesystems (because it is per-file based). It does not introduce any on-disk format changes, so there are no zpool features that need to be turned on or off. A pool can be moved seamlessly between systems with smart compression and systems without.
To build the changes, see my code review with the patch included at https://reviews.csiden.org/r/266/ - this code can simply be applied on top of illumos master and you're good to go. Then just reboot into the new kernel and it's there.
To actually see a performance boost is a little bit more complicated. You need a heavy compression algorithm (say, gzip) that responds badly to incompressible data and a partially compressible or completely incompressible workload (e.g. best achieved through a heterogeneous mixture of files). Then the gains become most apparent.
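To get a feel for what a "partially compressible or completely incompressible workload" looks like, here is a rough userland sketch. It is not part of the patch and does not touch ZFS. It uses Python's zlib module (DEFLATE, the same algorithm family as gzip) to compare the ratio achieved on repetitive data against random data; the helper name and the sample payloads are assumptions chosen for illustration.

```python
import os
import zlib


def compression_ratio(data: bytes, level: int = 9) -> float:
    """Return uncompressed size divided by DEFLATE-compressed size."""
    return len(data) / len(zlib.compress(data, level))


# Highly repetitive payload: compresses very well.
compressible = b"illumos zfs smart compression " * 4096
# Random payload of the same size: effectively incompressible.
incompressible = os.urandom(len(compressible))

for name, payload in (("compressible", compressible),
                      ("incompressible", incompressible)):
    print(f"{name}: ratio {compression_ratio(payload):.2f}")
```

Writing a mixture of both kinds of files to a filesystem with gzip compression enabled should approximate the heterogeneous workload described above, since the time spent compressing the random files is wasted effort.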
