Bug #2669


L2ARC feed thread should use KM_PUSHPAGE for allocations

Added by Eric Schrock about 10 years ago. Updated about 10 years ago.



From the downstream Linux port:

"There is potential for deadlock in the l2arc_feed thread if KM_PUSHPAGE is not used for the allocations made in l2arc_write_buffers.
Specifically, if KM_PUSHPAGE is not used for these allocations, it is possible for reclaim to be triggered which can cause the l2arc_feed thread to deadlock itself on the ARC_mru mutex."

From the ZFS-wg discussion:

Garrett: "Sounds very very plausible, although I've not worked through this all. In general, for code this close to the memory manager (as ARC is), I think KM_PUSHPAGE is often the better choice.

For those reading but not familiar, KM_PUSHPAGE is intended for use by the block layer when allocating a page that may ultimately be required to satisfy a request originated by the VM subsystem. If I recall correctly, these allocations come from a separate pre-allocated cache that is supposed to have much less activity and a greater chance of being satisfied without blocking."

George: "This seems like a reasonable thing to do. I think a future enhancement in this area would be to disable the l2arc feed thread if we're getting into such low memory conditions. We don't really want to consume all the reserved memory to write to the l2arc."

Brian: "Actually, I don't think you need this upstream. Although it certainly wouldn't be harmful and it would be nice to keep the code the same.

The deadlock is possible under Linux because of other modifications we've made to the ARC to better integrate it with Linux's VM. In particular we've registered a shrinker function, arc_shrinker_func(), which gets called when the system is low on memory.

The callback may be run at any time under any sleeping memory allocation to reclaim memory. To avoid deadlocking here on the locks we're holding we pass KM_PUSHPAGE which maps to __GFP_NOFS on Linux. This has slightly different semantics under Linux than OpenSolaris but the gist is the same. In this case it disables the synchronous reclaim callback to avoid the deadlock."
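
For readers who want to see the shape of the problem, here is a minimal user-space sketch of the re-entrancy Brian describes. It is not the ARC code: the flag values, toy lock, mock_alloc(), mock_reclaim() and feed_alloc_would_deadlock() are all hypothetical stand-ins, and the sketch assumes every allocation without KM_PUSHPAGE hits memory pressure and runs the reclaim callback synchronously. Where the real callback would block on the lock, the mock only records that it would have.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the allocation flags; not the real kernel values. */
enum { KM_SLEEP = 0x0, KM_PUSHPAGE = 0x1 };

/* Toy non-recursive lock standing in for the ARC_mru mutex. */
static int mru_lock_held;
static int deadlock_detected;

static void toy_lock(void)
{
    assert(!mru_lock_held);     /* a real mutex would block here */
    mru_lock_held = 1;
}

static void toy_unlock(void)
{
    mru_lock_held = 0;
}

/* Stand-in for a reclaim callback such as arc_shrinker_func():
 * it needs the list lock in order to evict buffers. */
static void mock_reclaim(void)
{
    if (mru_lock_held) {
        /* The caller already holds the lock; real code would sleep forever. */
        deadlock_detected = 1;
        return;
    }
    toy_lock();
    /* ... eviction would happen here ... */
    toy_unlock();
}

/* Mock allocator. Assumption: the system is always low on memory, so any
 * allocation that permits synchronous reclaim triggers the callback. */
static void *mock_alloc(size_t size, int flags)
{
    if (!(flags & KM_PUSHPAGE))
        mock_reclaim();
    return malloc(size);
}

/* Models an allocation made while the list lock is held, as in the
 * l2arc_write_buffers path. Returns 1 if reclaim re-entered on the held lock. */
static int feed_alloc_would_deadlock(int flags)
{
    void *buf;

    deadlock_detected = 0;
    toy_lock();
    buf = mock_alloc(64, flags);
    toy_unlock();
    free(buf);
    return deadlock_detected;
}
```

Under those assumptions, feed_alloc_would_deadlock(KM_SLEEP) reports the self-deadlock, while feed_alloc_would_deadlock(KM_PUSHPAGE) does not, because the mock allocator never enters reclaim for that flag.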

Myself: "If it doesn't do harm and could be potentially useful, we could definitely use it upstream. We've wanted to make the illumos ZFS code more port-friendly in the past to keep it as the canonical source but need someone doing a port willing to help drive that effort. If George and Garrett are happy with it I can run it through the ZFS test suite (though I don't know whether it would really stress this path) and push it upstream."

