Feature #3137
L2ARC compression
Status: Closed (100% done)
Description
This issue ticket proposes adding transparent L2ARC compression to ZFS. The algorithm is hard-coded to be the new LZ4 compression algorithm, since L2ARC has very specific performance requirements (high decompression speed).
The feature is controlled via a new secondarycachecompress property on each dataset, allowing the user to selectively enable compression only on datasets which can use it.
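To illustrate the transparency described above, here is a minimal userland sketch of a write-side policy under which a buffer is kept compressed only if LZ4 actually shrinks it. The names and the stub compressor are hypothetical and not taken from the actual patch; the real code would call the LZ4 routines already integrated in illumos.

    #include <stdio.h>
    #include <string.h>

    /*
     * Hypothetical stand-in for the kernel's LZ4 compressor: returns the
     * compressed length, or the source length when the data did not shrink.
     */
    static size_t
    stub_lz4_compress(const void *src, void *dst, size_t srclen, size_t dstlen)
    {
        (void) src;
        (void) dst;
        (void) dstlen;
        /* Pretend the data is incompressible so the caller stores it raw. */
        return (srclen);
    }

    /*
     * Transparent write policy (sketch): keep the compressed copy only when
     * it is strictly smaller than the original; otherwise cache the buffer
     * uncompressed so reads never pay for a pointless decompression.
     */
    static size_t
    l2arc_pick_payload(const void *buf, size_t lsize, void *cbuf)
    {
        size_t psize = stub_lz4_compress(buf, cbuf, lsize, lsize);

        if (psize < lsize)
            return (psize);         /* store compressed, remember psize */
        memcpy(cbuf, buf, lsize);   /* store uncompressed */
        return (lsize);
    }

    int
    main(void)
    {
        char buf[4096] = { 0 }, cbuf[4096];

        printf("stored %zu of %zu bytes\n",
            l2arc_pick_payload(buf, sizeof (buf), cbuf), sizeof (buf));
        return (0);
    }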
Updated by Sašo Kiselkov almost 11 years ago
Webrev at http://cr.illumos.org/~webrev/skiselkov/3137/
Source can be pulled from hg staging repo at http://62.65.188.67:8001/
Updated by Sašo Kiselkov over 10 years ago
- File l2arc_compress_test.ods l2arc_compress_test.ods added
Attached benchmark data from a test on an HP MicroServer N36L with 60GB of L2ARC on a single OCZ Vertex 3 SSD.
Updated by Richard Elling over 10 years ago
I don't think the proposed secondarycachecompress options are consistent with the other compression options. The proposal includes: all | none | metadata
I think the following list is more representative of actual use and more consistent with the other compression options:
off -- no L2ARC compression is used
on -- lz4 compression
lz4 -- lz4 compression
To confirm my suspicion that metadata is generally not large, do we know the distribution of metadata sizes in the ARC?
Updated by Sašo Kiselkov over 10 years ago
The proposed option format is supposed to be more in line with the values for "primarycache" and "secondarycache" than with "compress". The reasons for the proposed values are as follows:
- none: no compression of l2arc
- all: all blocks of l2arc compressed
- data: only user data compressed, metadata uncompressed (achieves around 80-90% of the maximum compression ratio while keeping read latencies minimal on metadata-heavy operations, e.g. dedup).
- metadata: only metadata compressed. I'd be willing to drop this option, since it's there more for completeness' sake than as a specific performance tuning.
The l2arc is a special use-case of compression, and from what I've discussed with Brendan Gregg, we agree that the algorithm itself shouldn't be made selectable. We know the general performance requirements of the l2arc (speed over ratio), so we can pick the best algorithm for the user. Since the l2arc isn't persistent, the system can always just pick whichever algorithm is best at the moment (perhaps even switching on the fly depending on workload).
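To make the proposed semantics concrete, here is a minimal userland sketch of the per-buffer decision: the algorithm is not selectable, only whether a given buffer gets LZ4-compressed at all. The enum and function names are hypothetical and not taken from the webrev.

    #include <stdio.h>

    /* Hypothetical encoding of the proposed property values. */
    typedef enum {
        L2COMP_NONE,        /* none: nothing compressed */
        L2COMP_ALL,         /* all: every l2arc buffer compressed */
        L2COMP_DATA,        /* data: user data only, metadata stays raw */
        L2COMP_METADATA     /* metadata: metadata only */
    } l2comp_t;

    /* Sketch of the per-buffer decision based on property value and type. */
    static int
    l2arc_should_compress(l2comp_t prop, int is_metadata)
    {
        switch (prop) {
        case L2COMP_ALL:
            return (1);
        case L2COMP_DATA:
            return (!is_metadata);
        case L2COMP_METADATA:
            return (is_metadata);
        case L2COMP_NONE:
        default:
            return (0);
        }
    }

    int
    main(void)
    {
        /* "data": metadata reads keep their minimum latency. */
        printf("data buffer:     %d\n", l2arc_should_compress(L2COMP_DATA, 0));
        printf("metadata buffer: %d\n", l2arc_should_compress(L2COMP_DATA, 1));
        return (0);
    }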
Updated by Sašo Kiselkov over 10 years ago
- File l2arc_compress_bench.ods l2arc_compress_bench.ods added
Reworked webrev at http://cr.illumos.org/~webrev/skiselkov/3137/ to be based on the latest committed LZ4 patch and restructured feed thread locking to be non-blocking. Attached are benchmark numbers on bits spun off of this code.
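For context on what "non-blocking" means here: in the illumos kernel a non-blocking lock acquisition would use mutex_tryenter(), so a contended buffer is skipped rather than waited on. A rough userland analogue with pthreads, using hypothetical names and purely for illustration, might look like this:

    #include <pthread.h>
    #include <stdio.h>

    /* Hypothetical per-buffer state guarded by a lock. */
    typedef struct buf_entry {
        pthread_mutex_t be_lock;
        int             be_data;
    } buf_entry_t;

    /*
     * Non-blocking scan (sketch): if a buffer's lock is contended, skip it
     * and move on instead of sleeping, so the feed thread never stalls
     * behind a reader.  In the kernel the equivalent test would be
     * mutex_tryenter().
     */
    static int
    feed_scan(buf_entry_t *bufs, int nbufs)
    {
        int fed = 0;

        for (int i = 0; i < nbufs; i++) {
            if (pthread_mutex_trylock(&bufs[i].be_lock) != 0)
                continue;           /* busy: leave it for the next pass */
            bufs[i].be_data++;      /* "write" the buffer to the cache device */
            fed++;
            pthread_mutex_unlock(&bufs[i].be_lock);
        }
        return (fed);
    }

    int
    main(void)
    {
        buf_entry_t bufs[4];

        for (int i = 0; i < 4; i++) {
            pthread_mutex_init(&bufs[i].be_lock, NULL);
            bufs[i].be_data = 0;
        }
        printf("fed %d buffers\n", feed_scan(bufs, 4));
        return (0);
    }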
Updated by Richard PALO over 10 years ago
Sašo Kiselkov wrote:
Reworked webrev at http://cr.illumos.org/~webrev/skiselkov/3137/ to be based on the latest committed LZ4 patch and restructured feed thread locking to be non-blocking. Attached are benchmark numbers on bits spun off of this code.
I'm curious if/where you're constrained a bit by physical memory and when it comes into play.
You're running only 8GB and start with a dataset size of 40GB.
It would be nice to have a dimension where you'd be able to test the range from extremely limited to extremely generous memory availability... if your system is maxed out, perhaps 2/4/8GB, but ideally 8/16/64GB or even 256GB.
I note as well that your system is a dual-core... also, out of curiosity, what is the effect on your system load?
Updated by Sašo Kiselkov over 10 years ago
Richard PALO wrote:
Sašo Kiselkov wrote:
Reworked webrev at http://cr.illumos.org/~webrev/skiselkov/3137/ to be based on the latest committed LZ4 patch and restructured feed thread locking to be non-blocking. Attached are benchmark numbers on bits spun off of this code.
I'm curious if/where you're constrained a bit by physical memory and when it comes into play.
You're running only 8GB and start with a dataset size of 40GB.
Unfortunately I have no other machine to test this on, so I can't really experiment with more memory (though less is possible).
It would be nice to have a dimension where you'd be able to test the range from extremely limited to extremely generous memory availability... if your system is maxed out, perhaps 2/4/8GB, but ideally 8/16/64GB or even 256GB.
This test was specifically designed to test L2 cache effects and how compression and "virtual" expansion of the L2 cache can improve performance. As such, the dataset sizes were chosen because they were reasonable sweet spots compared to my L2 cache size. Having them as powers of two makes little to no difference - my L2 cache is 52GB (formatted capacity) anyway, and it's not the in-memory ARC I wanted to test.
I note as well that your system is a dual-core... also, out of curiosity, what is the effect on your system load?
As I wrote in my e-mail to the zfs mailing list, the lz4_compress 40GB and 80GB dataset tests were both pegged against maximum CPU performance (i.e. 100% busy). The remainder were proportionally lower. In essence, you can estimate the CPU load by taking 350MB/s as the maximum that lz4_decompress can deliver and calculating backwards from the actual data throughput rate (e.g. 200MB/s / 350MB/s ≈ 57% busy). The uncompressed tests did have lower CPU utilization, but I don't remember how high it was, and unfortunately I didn't record it.
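Expressed as a trivial back-of-the-envelope program (the 350MB/s decompression ceiling and the 200MB/s throughput are the figures quoted above; this is only a sketch of the estimate, not measured code):

    #include <stdio.h>

    int
    main(void)
    {
        const double max_mbps = 350.0;  /* assumed lz4_decompress ceiling */
        const double seen_mbps = 200.0; /* observed data throughput */

        /* Busy% is roughly observed throughput over the decompression ceiling. */
        printf("estimated CPU busy: %.0f%%\n", 100.0 * seen_mbps / max_mbps);
        return (0);
    }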
Perhaps I'll rerun the benchmarks later, but now I have code cleanup issues I need to focus on.
Updated by Richard PALO over 10 years ago
Sorry, I could have sworn I was seeing [L2]ARC...
The physical memory question naturally targets the environmental impact on the ARC rather than the L2ARC... although it does provide more complete info.
Looking forward to the integration of all this.
Updated by Sašo Kiselkov over 10 years ago
I've discovered a pretty huge bug in the way the l2arc reads back zero-buffers (buffers squashed by compression to zero length). This updated webrev fixes it; I'm running a full load test to check the changes. Also rebased on the latest #3035 mainline push, which significantly reduces the size of the webrev (the #3035 changes are no longer included; pull them from github).
http://cr.illumos.org/~webrev/skiselkov/3137/
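For the curious, a minimal sketch of the kind of zero-buffer handling involved on read-back, with hypothetical field names (the actual fix is in the webrev above): in this sketch a buffer whose stored physical size is zero is reconstructed by zero-filling the destination instead of issuing a device read and decompressing.

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical header for a cached buffer. */
    typedef struct l2hdr {
        size_t lh_lsize;    /* logical (uncompressed) size */
        size_t lh_psize;    /* physical size on the cache device */
    } l2hdr_t;

    /*
     * Read-back sketch: a zero-length physical buffer means the data was
     * entirely squashed by compression, so just zero-fill the destination;
     * otherwise the caller would read lh_psize bytes and decompress them.
     */
    static void
    l2arc_read_buf(const l2hdr_t *hdr, void *dst)
    {
        if (hdr->lh_psize == 0) {
            memset(dst, 0, hdr->lh_lsize);
            return;
        }
        /* ... device read of lh_psize bytes + LZ4 decompress elided ... */
    }

    int
    main(void)
    {
        char buf[16];
        l2hdr_t hdr = { .lh_lsize = sizeof (buf), .lh_psize = 0 };

        memset(buf, 0xff, sizeof (buf));
        l2arc_read_buf(&hdr, buf);
        printf("first byte after read-back: %d\n", buf[0]);
        return (0);
    }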
Updated by Dan McDonald over 9 years ago
- Status changed from New to Resolved
commit aad02571bc59671aa3103bb070ae365f531b0b62
Author: Saso Kiselkov <skiselkov@gmail.com>
Date: Wed Jun 5 11:57:05 2013 -0400
3137 L2ARC compression
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@nexenta.com>