Bug #13389
After persistent L2ARC import, cache device has constant 8KB/sec load
Status: Closed
% Done: 100
Description
I can't say for certain which revision introduced this, but I'm seeing a constant 8KB/sec write load on the L2ARC device in my pool, regardless of how idle the pool is.
                             capacity     operations    bandwidth
pool                        alloc   free   read  write   read  write
-------------------------   -----  -----  -----  -----  -----  -----
rpool1                       107G   115G      0      0      0      0
  c8t5001B44F00970028d0s0    107G   115G      0      0      0      0
-------------------------   -----  -----  -----  -----  -----  -----
sqlpool                     29.8G   192G      0      0      0      0
  mirror                    9.98G  64.0G      0      0      0      0
    c8t50014EE6034B2CC7d0       -      -      0      0      0      0
    c8t50014EE0ABA3B6B9d0       -      -      0      0      0      0
  mirror                    9.97G  64.0G      0      0      0      0
    c5t4d0                      -      -      0      0      0      0
    c5t5d0                      -      -      0      0      0      0
  mirror                    9.88G  64.1G      0      0      0      0
    c5t1d0                      -      -      0      0      0      0
    c8t50014EE057470E4Bd0       -      -      0      0      0      0
logs                            -      -      -      -      -      -
  c15tE4D25CFBDE0F0100d0      40K  54.5G      0      0      0      0
cache                           -      -      -      -      -      -
  c5t2d0                    4.98G  88.1G      0      0      0  7.95K
-------------------------   -----  -----  -----  -----  -----  -----

                             capacity     operations    bandwidth
pool                        alloc   free   read  write   read  write
-------------------------   -----  -----  -----  -----  -----  -----
rpool1                       107G   115G      0      0      0      0
  c8t5001B44F00970028d0s0    107G   115G      0      0      0      0
-------------------------   -----  -----  -----  -----  -----  -----
sqlpool                     29.8G   192G      0      0      0      0
  mirror                    9.98G  64.0G      0      0      0      0
    c8t50014EE6034B2CC7d0       -      -      0      0      0      0
    c8t50014EE0ABA3B6B9d0       -      -      0      0      0      0
  mirror                    9.97G  64.0G      0      0      0      0
    c5t4d0                      -      -      0      0      0      0
    c5t5d0                      -      -      0      0      0      0
  mirror                    9.88G  64.1G      0      0      0      0
    c5t1d0                      -      -      0      0      0      0
    c8t50014EE057470E4Bd0       -      -      0      0      0      0
logs                            -      -      -      -      -      -
  c15tE4D25CFBDE0F0100d0      40K  54.5G      0      0      0      0
cache                           -      -      -      -      -      -
  c5t2d0                    4.98G  88.1G      0      0      0  7.96K
-------------------------   -----  -----  -----  -----  -----  -----
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    1.6    0.0  113.3  0.0  0.0    0.0    0.2   0   0 c15tE4D25CFBDE0F0100d0
    0.1    0.2    2.5    3.9  0.0  0.0    1.5    0.2   0   0 rpool1
    0.0    1.1    0.1   45.2  0.0  0.0    0.0    0.3   0   0 c5t2d0
    0.1    0.9    3.4   28.3  0.0  0.0    0.2    0.9   0   0 c5t4d0
    0.1    0.9    3.6   28.3  0.0  0.0    0.1    0.9   0   0 c5t5d0
    0.1    0.9    3.7   28.2  0.0  0.0    0.1    0.4   0   0 c5t1d0
    0.1    0.9    3.6   28.2  0.0  0.0    0.0    1.0   0   0 c8t50014EE057470E4Bd0
    0.1    0.9    3.4   28.2  0.0  0.0    0.0    1.0   0   0 c8t50014EE0ABA3B6B9d0
    0.1    0.9    3.9   28.2  0.0  0.0    0.0    0.4   0   0 c8t50014EE6034B2CC7d0
    0.1    0.2    2.5    3.9  0.0  0.0    0.0    0.2   0   0 c8t5001B44F00970028d0
    0.6    8.1   21.8  327.8  0.1  0.0   11.1    0.7   0   0 sqlpool
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    1.0    0.0    8.0  0.0  0.0    0.0    0.2   0   0 c5t2d0
    0.0    1.0    0.0    8.0  0.0  0.0    0.0    0.3   0   0 sqlpool
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    1.0    0.0    8.0  0.0  0.0    0.0    0.2   0   0 c5t2d0
    0.0    1.0    0.0    8.0  0.0  0.0    0.0    0.2   0   0 sqlpool
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    1.0    0.0    8.0  0.0  0.0    0.0    0.2   0   0 c5t2d0
    0.0    1.0    0.0    8.0  0.0  0.0    0.0    0.2   0   0 sqlpool
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    1.0    0.0    8.0  0.0  0.0    0.0    0.2   0   0 c5t2d0
    0.0    1.0    0.0    8.0  0.0  0.0    0.0    0.3   0   0 sqlpool
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    1.0    0.0    8.0  0.0  0.0    0.0    0.2   0   0 c5t2d0
    0.0    1.0    0.0    8.0  0.0  0.0    0.0    0.3   0   0 sqlpool
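For anyone wanting to reproduce the observation, output like the above can be captured with something along these lines (the 5-second interval is my assumption; the report doesn't state it):

    # Per-vdev pool statistics, repeated every 5 seconds
    zpool iostat -v 5

    # Extended per-device statistics; shows the steady ~1 write/sec of 8KB
    # hitting the cache device (c5t2d0) even when the pool is idle
    iostat -xn 5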
For the sake of not slowly killing the flash, it'd be nice to get to the bottom of this at some point (even if it is cheap TLC).
Updated by Adam Stylinski over 2 years ago
I should also add that this is what's in /etc/system:
set zfs:zfs_arc_max = 1073741824
set zfs:zfs_prefetch_disable = 1
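For anyone reproducing this, the live values of those tunables can be read back with mdb to confirm they took effect; this is my own sketch, not part of the original comment:

    # Read the tunables back from the running kernel (decimal output);
    # zfs_arc_max is a 64-bit value (/E), zfs_prefetch_disable is 32-bit (/D)
    echo "zfs_arc_max/E" | mdb -k
    echo "zfs_prefetch_disable/D" | mdb -k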
Updated by Gernot Strasser over 2 years ago
Updated by Adam Stylinski over 2 years ago
Yes, that seems like a safe bet. Can someone maybe merge that PR into illumos-gate?
Updated by Dan McDonald over 2 years ago
Analysis from OpenZFS is here: https://github.com/openzfs/zfs/pull/11537
Updated by Dan McDonald over 2 years ago
The bug filer has confirmed there are no more 8KB/sec writes into the L2ARC.
I'm running the ZFS test suite now to make sure there's no regression. Once that's confirmed, I'll RTI. The filer's confirmation is one of the two test notes.
Updated by Dan McDonald over 2 years ago
- Category set to zfs - Zettabyte File System
- Assignee set to Dan McDonald
Updated by Adam Stylinski over 2 years ago
Dan mentioned it already, but for posterity's sake: the patch removes the constant 8KB/sec write workload on the L2ARC for me.
Updated by Dan McDonald over 2 years ago
ZFS tests showed no regressions save one: persist_l2arc_008_pos.
It turns out that, with this bug fixed, my lightweight test environment didn't generate the load needed for that test to pass. Luckily for me, Jason King found a rig that could: https://gist.github.com/jasonbking/97ac660daae43b4ce2bf58ce32e477ee
Updated by Jason King over 2 years ago
The persist_l2arc_008_pos test was failing on Dan's test system. Looking at the failure, as well as conferring with George Amanakis (who wrote the tests and did a lot of the recent L2ARC work), we believe the problem is that the test does not always generate enough I/O to cause data to be pushed out to the L2ARC. When it doesn't, the test currently fails.
When I re-ran the persistent L2ARC tests on the test VM I use, they all passed, including persist_l2arc_008_pos. Since that one had failed, I also looked at the stdout from the test, and it all looks reasonable and as expected, as best as I can tell.
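For reference, on the OpenZFS side (where the linked analysis lives) a single case like this can be re-run through the test-suite wrapper. This is a sketch assuming a 2.0-era OpenZFS source tree; the path and layout are my assumption, not something taken from this issue:

    # Run only the one persistent-L2ARC test (path may differ between versions)
    ./scripts/zfs-tests.sh -t tests/zfs-tests/tests/functional/persist_l2arc/persist_l2arc_008_pos.ksh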
Updated by Jason King over 2 years ago
Just for reference, I've also attached the passing stdout output in case it's useful in the future.
Updated by Electric Monk over 2 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit b639505692797add0a13ee545fd6ca20d63f89fd
commit b639505692797add0a13ee545fd6ca20d63f89fd
Author: George Amanakis <gamanakis@gmail.com>
Date:   2021-02-23T00:42:23.000Z

13389 After persistent L2ARC import, cache device has constant 8KB/sec load
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Dan McDonald <danmcd@joyent.com>
Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: Jason King <jbk@joyent.com>
Approved by: Gordon Ross <gordon.w.ross@gmail.com>
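To inspect the change in a local illumos-gate clone (my note, not part of the Electric Monk update):

    # Show the full diff and commit message for the fix
    git show b639505692797add0a13ee545fd6ca20d63f89fd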