Actions
Bug #13374
closedPort L2ARC Improvements from OpenZFS
Start date:
Due date:
% Done:
100%
Estimated time:
Difficulty:
Medium
Tags:
Gerrit CR:
External Bug:
Description
After the initial port of persistent L2ARC, there's been several improvements we should port:
From da60484db5960f2b94cf3302dd08fe7fb673eaf3 Mon Sep 17 00:00:00 2001 From: George Amanakis <gamanakis@gmail.com> Date: Sat, 1 Aug 2020 14:17:18 -0400 Subject: [PATCH] Fix logging in l2arc_rebuild() In case the L2ARC rebuild was canceled, do not log to spa history log as the pool may be in the process of being removed and a panic may occur: BUG: kernel NULL pointer dereference, address: 0000000000000018 RIP: 0010:spa_history_log_internal+0xb1/0x120 [zfs] Call Trace: l2arc_rebuild+0x464/0x7c0 [zfs] l2arc_dev_rebuild_start+0x2d/0x130 [zfs] ? l2arc_rebuild+0x7c0/0x7c0 [zfs] thread_generic_wrapper+0x78/0xb0 [spl] kthread+0xfb/0x130 ? IS_ERR+0x10/0x10 [spl] ? kthread_park+0x90/0x90 ret_from_fork+0x35/0x40 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #10659
From fc34dfba8e8238683e90e3fa83d16be3343886f6 Mon Sep 17 00:00:00 2001 From: Allan Jude <allanjude@freebsd.org> Date: Fri, 14 Aug 2020 02:31:20 -0400 Subject: [PATCH] Fix L2ARC reads when compressed ARC disabled When reading compressed blocks from the L2ARC, with compressed ARC disabled, arc_hdr_size() returns LSIZE rather than PSIZE, but the actual read is PSIZE. This causes l2arc_read_done() to compare the checksum against the wrong size, resulting in checksum failure. This manifests as an increase in the kstat l2_cksum_bad and the read being retried from the main pool, making the L2ARC ineffective. Add new L2ARC tests with Compressed ARC enabled/disabled Blocks are handled differently depending on the state of the zfs_compressed_arc_enabled tunable. If a block is compressed on-disk, and compressed_arc is enabled: - the block is read from disk - It is NOT decompressed - It is added to the ARC in its compressed form - l2arc_write_buffers() may write it to the L2ARC (as is) - l2arc_read_done() compares the checksum to the BP (compressed) However, if compressed_arc is disabled: - the block is read from disk - It is decompressed - It is added to the ARC (uncompressed) - l2arc_write_buffers() will use l2arc_apply_transforms() to recompress the block, before writing it to the L2ARC - l2arc_read_done() compares the checksum to the BP (compressed) - l2arc_read_done() will use l2arc_untransform() to uncompress it This test writes out a test file to a pool consisting of one disk and one cache device, then randomly reads from it. Since the arc_max in the tests is low, this will feed the L2ARC, and result in reads from the L2ARC. We compare the value of the kstat l2_cksum_bad before and after to determine if any blocks failed to survive the trip through the L2ARC. Sponsored-by: The FreeBSD Foundation Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Allan Jude <allanjude@freebsd.org> Closes #10693
From 523e1295fe8fb15d37b2459ce48a32b9454d448d Mon Sep 17 00:00:00 2001 From: Alexander Motin <mav@FreeBSD.org> Date: Tue, 25 Aug 2020 17:33:36 -0400 Subject: [PATCH] Introduce limit on size of L2ARC headers Since L2ARC buffers are not evicted on memory pressure, too large amount of headers on system with irrationally large L2ARC can render it slow or even unusable. This change limits L2ARC writes and rebuild if unevictable L2ARC-only headers reach dangerous level. While there, call arc_adapt() on L2ARC rebuild, so that it could properly grow arc_c, reflecting potentially significant ARC size increase and avoiding slow growth with hopeless eviction attempts later when "overflow" is detected. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reported-by: Richard Elling <Richard.Elling@RichardElling.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Closes #10765
From feb3a7eef114b6d23bc1aaf5c550b2abeffaca94 Mon Sep 17 00:00:00 2001 From: George Amanakis <gamanakis@gmail.com> Date: Tue, 8 Sep 2020 14:44:37 -0400 Subject: [PATCH] Introduce ZFS module parameter l2arc_mfuonly In certain workloads it may be beneficial to reduce wear of L2ARC devices by not caching MRU metadata and data into L2ARC. This commit introduces a new tunable l2arc_mfuonly for this purpose. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #10710
From 085321621e79a75bea41c2b6511da6ebfbf2ba0a Mon Sep 17 00:00:00 2001 From: George Amanakis <gamanakis@gmail.com> Date: Mon, 14 Sep 2020 13:10:44 -0400 Subject: [PATCH] Add L2ARC arcstats for MFU/MRU buffers and buffer content type Currently the ARC state (MFU/MRU) of cached L2ARC buffer and their content type is unknown. Knowing this information may prove beneficial in adjusting the L2ARC caching policy. This commit adds L2ARC arcstats that display the aligned size (in bytes) of L2ARC buffers according to their content type (data/metadata) and according to their ARC state (MRU/MFU or prefetch). It also expands the existing evict_l2_eligible arcstat to differentiate between MFU and MRU buffers. L2ARC caches buffers from the MRU and MFU lists of ARC. Upon caching a buffer, its ARC state (MRU/MFU) is stored in the L2 header (b_arcs_state). The l2_m{f,r}u_asize arcstats reflect the aligned size (in bytes) of L2ARC buffers according to their ARC state (based on b_arcs_state). We also account for the case where an L2ARC and ARC cached MRU or MRU_ghost buffer transitions to MFU. The l2_prefetch_asize reflects the alinged size (in bytes) of L2ARC buffers that were cached while they had the prefetch flag set in ARC. This is dynamically updated as the prefetch flag of L2ARC buffers changes. When buffers are evicted from ARC, if they are determined to be L2ARC eligible then their logical size is recorded in evict_l2_eligible_m{r,f}u arcstats according to their ARC state upon eviction. Persistent L2ARC: When committing an L2ARC buffer to a log block (L2ARC metadata) its b_arcs_state and prefetch flag is also stored. If the buffer changes its arcstate or prefetch flag this is reflected in the above arcstats. However, the L2ARC metadata cannot currently be updated to reflect this change. Example: L2ARC caches an MRU buffer. L2ARC metadata and arcstats count this as an MRU buffer. The buffer transitions to MFU. The arcstats are updated to reflect this. Upon pool re-import or on/offlining the L2ARC device the arcstats are cleared and the buffer will now be counted as an MRU buffer, as the L2ARC metadata were not updated. Bug fix: - If l2arc_noprefetch is set, arc_read_done clears the L2CACHE flag of an ARC buffer. However, prefetches may be issued in a way that arc_read_done() is bypassed. Instead, move the related code in l2arc_write_eligible() to account for those cases too. Also add a test and update manpages for l2arc_mfuonly module parameter, and update the manpages and code comments for l2arc_noprefetch. Move persist_l2arc tests to l2arc. Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org> Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #10743
From af20b97078af7bf4ba7552dff04cc40b643ab72c Mon Sep 17 00:00:00 2001 From: Matthew Macy <mmacy@freebsd.org> Date: Sun, 27 Sep 2020 17:08:38 -0700 Subject: [PATCH] zfetch: Don't issue new streams when old have not completed The current dmu_zfetch code implicitly assumes that I/Os complete within min_sec_reap seconds. With async dmu and a readonly workload (and thus no exponential backoff in operations from the "write throttle") such as L2ARC rebuild it is possible to saturate the drives with I/O requests. These are then effectively compounded with prefetch requests. This change reference counts streams and prevents them from being recycled after their min_sec_reap timeout if they still have outstanding I/Os. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Matt Macy <mmacy@FreeBSD.org> Closes #10900
From a76e4e676180acad9e12f5a1e03f5ff48ec85983 Mon Sep 17 00:00:00 2001 From: George Amanakis <gamanakis@gmail.com> Date: Mon, 5 Oct 2020 18:29:05 -0400 Subject: [PATCH] Make L2ARC tests more robust Instead of relying on arbitrary timers after pool export/import or cache device off/online rely on arcstats. This makes the L2ARC tests more robust. Also cleanup some functions related to persistent L2ARC. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Adam Moss <c@yotes.com> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #10983
commit 4aafab91c5033e35217209d121f4c2fb83a8f690 Author: GeLiXin <ge.lixin@zte.com.cn> Date: Tue Nov 1 07:04:01 2016 +0800 Fix coverity defects: CID 147509 CID 147509: Explicit null dereferenced - l2arc_sublist_lock is fragile as relied on caller too much. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: GeLiXin <ge.lixin@zte.com.cn> Closes #5319
Related issues
Updated by Jason King about 2 years ago
Just to note, while the L2ARC tests are improved, they still can fail on occasion (and require re-running). I've talked to George Amanakis about this, and it just seems to be due to some of the inherent non-determinism of even a test system (due to all the other background stuff). Rather than continue to hold this up any further, since it should still be an improvement it seemed worth getting it out to review.
Updated by Jason King about 2 years ago
For testing, I ran the zfs test suite. The only failures were known failures with existing tickets.
Updated by Electric Monk about 2 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit 9e3493cb8a0cfe96c9aef9b7da42c6c9b5c24b43
commit 9e3493cb8a0cfe96c9aef9b7da42c6c9b5c24b43 Author: Jason King <jason.brian.king@gmail.com> Date: 2021-05-13T16:49:05.000Z 13374 Port L2ARC Improvements from OpenZFS Portions contributed by: George Amanakis <gamanakis@gmail.com> Portions contributed by: Allan Jude <allanjude@freebsd.org> Portions contributed by: Alexander Motin <mav@FreeBSD.org> Portions contributed by: Matthew Macy <mmacy@freebsd.org> Portions contributed by: GeLiXin <ge.lixin@zte.com.cn> Portions contributed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Adam Moss <c@yotes.com> Reviewed by: Richard Elling <Richard.Elling@RichardElling.com> Reviewed by: Ryan Moeller <freqlabs@FreeBSD.org> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Andy Fiddaman <andy@omnios.org> Reviewed by: Gordon Ross <gordon.w.ross@gmail.com> Approved by: Dan McDonald <danmcd@joyent.com>
Updated by Marcel Telka over 1 year ago
- Related to Bug #14375: dmu_zfetch: don't leak unreferenced stream when zfetch is freed added
Actions