installctx() blocking allocate causes problems
Some callers of installctx() invoke it inside a kpreempt_disable()/kpreempt_enable() block, usually because the current thread must NOT be swapped out until ALL of the work related to installctx() is done.
The problem is that installctx() has two portions:
1.) SLEEPING/BLOCKING allocation of "struct ctxop".
2.) Installation of the allocated ctxop, which because of #12478 is encapsulated in a kpreempt_disable/kpreempt_enable block.
Some callers MUST install the context, perform additional state changes, and ONLY THEN allow preemption to resume. An example of this case is #13902, but there are other callers that also wrap a call to installctx() inside a kpreempt_disable/kpreempt_enable block. The problem with those other callers is that if part 1 of installctx() (the blocking allocation) actually blocks while preemption is disabled, the thread may lock up or trip some other assertion.
So if we have an installctx() caller that is not in a kpreempt_disable section, we must behave as we did before lest we break #12478. If we have an installctx() caller that, for whatever reason, MUST itself be in a kpreempt_disable block that includes an installctx() call, we must NOT BLOCK, and therefore must find a different way to allocate "struct ctxop".
While finding solutions to #13902, I've proposed a new call: installctx_preallocate(), which MUST be called from outside any kpreempt_disable() block. It lets callers of installctx() that must run with kernel preemption disabled allocate first, then disable preemption. The proposal also adds another parameter to installctx(), which is NULL if we want installctx() to allocate, or non-NULL if we want installctx() to take a buffer from installctx_preallocate().
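To make the intended calling pattern concrete, here is a rough userland sketch of the idea, not the kernel code from the webrev. Everything prefixed with mock_ is a hypothetical stand-in, as are the simplified struct ctxop, the reduced argument list, and the preempt_depth counter. The real installctx() takes more arguments, and the real allocation sleeps rather than calling calloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the kernel's struct ctxop. */
struct ctxop {
	void (*save_op)(void *);
	void *arg;
	struct ctxop *next;
};

/* Models the kpreempt_disable() nesting count (assumption). */
static int preempt_depth = 0;
/* Models the current thread's list of installed ctxops (assumption). */
static struct ctxop *thread_ctx = NULL;
/* Stands in for the caller's "additional state changes". */
static int extra_state = 0;

static void
mock_kpreempt_disable(void)
{
	preempt_depth++;
}

static void
mock_kpreempt_enable(void)
{
	assert(preempt_depth > 0);
	preempt_depth--;
}

/*
 * Models the sleeping allocation (part 1). It must never run with
 * preemption disabled, so the mock asserts that instead of sleeping.
 */
static struct ctxop *
mock_installctx_preallocate(void)
{
	struct ctxop *ctx;

	assert(preempt_depth == 0);
	ctx = calloc(1, sizeof (*ctx));
	assert(ctx != NULL);
	return (ctx);
}

/*
 * Models installctx() with the extra parameter: NULL means "allocate
 * for me" (the old behavior), non-NULL means "use this preallocated
 * buffer" and therefore never blocks.
 */
static void
mock_installctx(void (*save_op)(void *), void *arg, struct ctxop *ctx)
{
	if (ctx == NULL)
		ctx = mock_installctx_preallocate();

	/* Part 2: the installation itself runs with preemption disabled. */
	mock_kpreempt_disable();
	ctx->save_op = save_op;
	ctx->arg = arg;
	ctx->next = thread_ctx;
	thread_ctx = ctx;
	mock_kpreempt_enable();
}

static void
noop_save(void *arg)
{
	(void) arg;
}

/* A caller that must keep preemption off across install + extra work. */
static void
demo_caller(void)
{
	/* Allocate first, while blocking is still allowed. */
	struct ctxop *ctx = mock_installctx_preallocate();

	mock_kpreempt_disable();
	mock_installctx(noop_save, NULL, ctx);	/* cannot block now */
	extra_state = 1;			/* additional state changes */
	mock_kpreempt_enable();
}
```

In this model, passing NULL from inside the mock_kpreempt_disable() section would trip the assertion in mock_installctx_preallocate(), which is the failure mode described above. Callers outside any such section can keep passing NULL and see no change in behavior.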
The webrev https://kebe.com/~danmcd/webrevs/13902/webrev-prealloc/ has both a fix for #13902 that encapsulates kernel_fpu_begin()'s installctx() inside a kpreempt_disable() section ALONG WITH the above proposal for installctx()/installctx_preallocate(). It has been under test in a 13902-generating scenario with no failures thus far.
Updated by Electric Monk 4 months ago
- Status changed from Pending RTI to Closed
- % Done changed from 90 to 100
commit c21bd51d7acbaf77116c4cc3a23dfc6d16c637c2
Author: Dan McDonald <firstname.lastname@example.org>
Date: 2021-07-02T19:24:36.000Z
13902 Fix for 13717 may break 8-disk raidz2
13915 installctx() blocking allocate causes problems
Portions contributed by: Jerry Jelinek <email@example.com>
Reviewed by: Garrett D'Amore <firstname.lastname@example.org>
Reviewed by: Toomas Soome <email@example.com>
Reviewed by: Patrick Mooney <firstname.lastname@example.org>
Approved by: Joshua M. Clulow <email@example.com>