Project

General

Profile

Actions

Bug #13915

closed

installctx() blocking allocate causes problems

Added by Dan McDonald 4 months ago. Updated 4 months ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
kernel
Start date:
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
Gerrit CR:

Description

Some callers of installctx() call it while inside a kpreempt_disable()/kpreempt_enable() block. This often has to do with needing to NOT have the current thread swap out until ALL of the work related to installctx() is done.

The problem is: installctx() has two portions:

1.) SLEEPING/BLOCKING allocation of "struct ctxop".
2.) Installation of the allocated ctxop, which because of #12478 is encapsulated in a kpreempt_disable/kpreempt_enable block.

Some callers MUST install the context, perform additional state changes, and ONLY THEN allow preemption to resume. An example of this case is #13902, but there are other callers that also wrap a call to installctx() inside a kpreempt_disable/kpreempt_enable block. The problem with those other callers is that if part 1 of installctx (the blocking allocation) actually blocks, then the thread locks or some other assertion is tripped.

So if we have an installctx() caller that is not in a kpreempt_disable section, we must behave as we did before lest we break #12478. If we have an installctx() caller that, for whatever reason, MUST itself be in a kpreempt_disable block that includes an installctx() call, we must NOT BLOCK, and therefore must find a different way to allocate "struct ctxop".

While finding solutions to #13902, I've proposed a new call: installctx_preallocate(), which MUST be called from outside any kpreempt_disable() block, but allows callers of installctx() which must have kernel preemption to allocate, then disable preemption. It also adds another parameter to installctx(), which is NULL if we want installctx() to allocate, or non-NULL if we want installctx to take a buffer from installctx_preallocate.

The webrev https://kebe.com/~danmcd/webrevs/13902/webrev-prealloc/ has both a fix for #13902 that encapsulates kernel_fpu_begin()'s installctx inside a kpreempt_disable() section ALONG WITH a the above proposal for installctx/installctx_preallocate(). It has been under test in a 13902-generating scenario with no failures thus far.


Related issues

Related to illumos gate - Bug #13902: Fix for 13717 may break 8-disk raidz2ClosedDan McDonald

Actions
Related to illumos gate - Feature #12478: installctx needs kpreempt_disable protectionClosedPatrick Mooney

Actions
Related to illumos gate - Bug #13917: ctxops interfaces could be more ergonomicIn ProgressPatrick Mooney

Actions
Actions #1

Updated by Dan McDonald 4 months ago

  • Category set to kernel
  • Assignee set to Dan McDonald
Actions #2

Updated by Dan McDonald 4 months ago

  • Related to Bug #13902: Fix for 13717 may break 8-disk raidz2 added
Actions #3

Updated by Dan McDonald 4 months ago

  • Related to Feature #12478: installctx needs kpreempt_disable protection added
Actions #4

Updated by Electric Monk 4 months ago

  • Gerrit CR set to 1579
Actions #5

Updated by Dan McDonald 4 months ago

A reminder - testing has been done alongside #13902 - with particular attention paid to kernel_fpu_begin().

Actions #6

Updated by Dan McDonald 4 months ago

  • Status changed from New to Pending RTI
  • % Done changed from 0 to 90
Actions #7

Updated by Patrick Mooney 4 months ago

  • Related to Bug #13917: ctxops interfaces could be more ergonomic added
Actions #8

Updated by Electric Monk 4 months ago

  • Status changed from Pending RTI to Closed
  • % Done changed from 90 to 100

git commit c21bd51d7acbaf77116c4cc3a23dfc6d16c637c2

commit  c21bd51d7acbaf77116c4cc3a23dfc6d16c637c2
Author: Dan McDonald <danmcd@joyent.com>
Date:   2021-07-02T19:24:36.000Z

    13902 Fix for 13717 may break 8-disk raidz2
    13915 installctx() blocking allocate causes problems
    Portions contributed by: Jerry Jelinek <gjelinek@gmail.com>
    Reviewed by: Garrett D'Amore <garrett@damore.org>
    Reviewed by: Toomas Soome <tsoome@me.com>
    Reviewed by: Patrick Mooney <pmooney@pfmooney.com>
    Approved by: Joshua M. Clulow <josh@sysmgr.org>

Actions

Also available in: Atom PDF