Bug #13915
closedinstallctx() blocking allocate causes problems
100%
Description
Some callers of installctx() call it while inside a kpreempt_disable()/kpreempt_enable() block. This often has to do with needing to NOT have the current thread swap out until ALL of the work related to installctx() is done.
The problem is: installctx() has two portions:
1.) SLEEPING/BLOCKING allocation of "struct ctxop".
2.) Installation of the allocated ctxop, which because of #12478 is encapsulated in a kpreempt_disable/kpreempt_enable block.
Some callers MUST install the context, perform additional state changes, and ONLY THEN allow preemption to resume. An example of this case is #13902, but there are other callers that also wrap a call to installctx() inside a kpreempt_disable/kpreempt_enable block. The problem with those other callers is that if part 1 of installctx (the blocking allocation) actually blocks, then the thread locks or some other assertion is tripped.
So if we have an installctx() caller that is not in a kpreempt_disable section, we must behave as we did before lest we break #12478. If we have an installctx() caller that, for whatever reason, MUST itself be in a kpreempt_disable block that includes an installctx() call, we must NOT BLOCK, and therefore must find a different way to allocate "struct ctxop".
While finding solutions to #13902, I've proposed a new call: installctx_preallocate(), which MUST be called from outside any kpreempt_disable() block, but allows callers of installctx() which must have kernel preemption to allocate, then disable preemption. It also adds another parameter to installctx(), which is NULL if we want installctx() to allocate, or non-NULL if we want installctx to take a buffer from installctx_preallocate.
The webrev https://kebe.com/~danmcd/webrevs/13902/webrev-prealloc/ has both a fix for #13902 that encapsulates kernel_fpu_begin()'s installctx inside a kpreempt_disable() section ALONG WITH a the above proposal for installctx/installctx_preallocate(). It has been under test in a 13902-generating scenario with no failures thus far.
Related issues
Updated by Dan McDonald over 1 year ago
- Category set to kernel
- Assignee set to Dan McDonald
Updated by Dan McDonald over 1 year ago
- Related to Bug #13902: Fix for 13717 may break 8-disk raidz2 added
Updated by Dan McDonald over 1 year ago
- Related to Feature #12478: installctx needs kpreempt_disable protection added
Updated by Dan McDonald over 1 year ago
A reminder - testing has been done alongside #13902 - with particular attention paid to kernel_fpu_begin().
Updated by Dan McDonald over 1 year ago
- Status changed from New to Pending RTI
- % Done changed from 0 to 90
Updated by Patrick Mooney over 1 year ago
- Related to Bug #13917: ctxops interfaces could be more ergonomic added
Updated by Electric Monk over 1 year ago
- Status changed from Pending RTI to Closed
- % Done changed from 90 to 100
git commit c21bd51d7acbaf77116c4cc3a23dfc6d16c637c2
commit c21bd51d7acbaf77116c4cc3a23dfc6d16c637c2 Author: Dan McDonald <danmcd@joyent.com> Date: 2021-07-02T19:24:36.000Z 13902 Fix for 13717 may break 8-disk raidz2 13915 installctx() blocking allocate causes problems Portions contributed by: Jerry Jelinek <gjelinek@gmail.com> Reviewed by: Garrett D'Amore <garrett@damore.org> Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: Patrick Mooney <pmooney@pfmooney.com> Approved by: Joshua M. Clulow <josh@sysmgr.org>