Project

General

Profile

Actions

Bug #13917

closed

ctxops interfaces could be more ergonomic

Added by Patrick Mooney 11 months ago. Updated 4 months ago.

Status:
Closed
Priority:
Normal
Category:
kernel
Start date:
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
Gerrit CR:

Description

With the integration of #13915, the argument lists for installctx and removectx have continued to grow even more unwieldy. What's more, those interfaces are consumed by external drivers (KVM and VirtualBox), despite being outside the scope of the DDI. This means a flag day for those external drivers when making any meaningful changes to the ctxops interface. I propose several changes to the ctxops interface to address some of these challenges:

First, use ctxop_ prefixed names for functions consumed by outside logic. This would exclude kernel-internal bits like savectx and restorectx (called by swtch), but would cover installctx, removectx, and installctx_preallocate.

Just changing the function names is not enough to help with the argument soup problem, though. To deal with that, it should be switched to use template struct for configuring the functions associated with a ctxop (similar to struct dev_ops). This will help in a few ways: C99 struct initialization allows us to omit unused handlers. The template struct can bear a revision (again, like devo_rev) so that we can have better tools to handle updates without the need for a flag day.

The final change as part of the ergonomic overhaul has to do with ctxop allocation and freeing. In #13915, the ability to preallocate a struct ctxop was added. To expand on this for bhyve, which attaches/detaches ctxops handlers for every VM_RUN call, we'll add an interface to detach a ctxop struct without implicitly freeing it (ctxop_detach). This will allow consumers like bhyve, who wish to reuse pre-allocated ctxop storage, to operate in the manner they desire.


Related issues

Related to illumos gate - Bug #13915: installctx() blocking allocate causes problemsClosedDan McDonald

Actions
Actions #1

Updated by Electric Monk 11 months ago

  • Gerrit CR set to 1582
Actions #2

Updated by Patrick Mooney 11 months ago

  • Related to Bug #13915: installctx() blocking allocate causes problems added
Actions #3

Updated by Patrick Mooney 11 months ago

Like #13915 before, this represents a flag day for out-of-gate ctxop consumers, namely KVM and VirtualBox.

Actions #4

Updated by Andy Fiddaman 11 months ago

A (draft at time of writing) pull request for Virtualbox 6 is at https://github.com/omniosorg/omnios-extra/pull/921

Actions #5

Updated by Patrick Mooney 4 months ago

Booting a machine with this platform and using dtrace, I can see the segreg, (userspace) FPU, and schedctl ctxops being allocated/attached and detached/deallocated to threads as processes come and go. Similarly, when a bhyve instance is started, the expected ctxop attach/detach operations are observed as the vCPU threads enter/exit guest context in the kernel. The stack ordering of ctxop execution required by bhyve (implemented in #12477 ) was checked as well:

 11  33425                 restorectx:entry
 11  15701        lwp_segregs_restore:entry
 11  23875           schedctl_restore:entry
 11  86417             vmm_restorectx:entry
 11  87144             svm_restorectx:entry
...
 11  32111                    savectx:entry
 11  86415                vmm_savectx:entry
 11  87142                svm_savectx:entry
 11  86441        save_guest_fpustate:entry
 11  86308                    fpusave:entry
 11  18348                 fp_restore:entry
 11  23873              schedctl_save:entry
 11  20530           lwp_segregs_save:entry

Actions #6

Updated by Patrick Mooney 4 months ago

To exercise the zfs kfpu paths, I created a test pool featuring a raidz vdev and copied data to it, again dtracing the involved ctxop functions. The pool operated normally and scrubbed clean after some test transfers.

After performing all of these tests, the node was forced to dump to check for ::findleaks and ::kmem_verify. Those results were clean.

Actions #7

Updated by Andy Fiddaman 4 months ago

I've tested this change on OmniOS along with a patched illumos-kvm1 and a patched Virtualbox2.
Under both hypervisors, I was able to successfully boot a Debian Linux guest, log in and shut then down again.

[1] https://github.com/omniosorg/illumos-kvm/pull/18
[2] https://github.com/omniosorg/omnios-extra/pull/921

I've opened a draft PR for OpenIndiana to update Virtualbox there - https://github.com/OpenIndiana/oi-userland/pull/7643
and I'll also update illumos-kvm as part of the same PR once the upstream (Joyent) repository is patched.

Actions #8

Updated by Dan McDonald 4 months ago

Additionally, to prevent merge hell I added changes to LX that consume this interface. My one LX zone on my test machine appears to behave no differently than prior to this change.

Actions #9

Updated by Electric Monk 4 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

git commit 5a469116729183a46e77dc0620955bbde58d93f7

commit  5a469116729183a46e77dc0620955bbde58d93f7
Author: Patrick Mooney <pmooney@pfmooney.com>
Date:   2022-01-27T18:19:34.000Z

    13917 ctxops interfaces could be more ergonomic
    Reviewed by: Dan McDonald <danmcd@joyent.com>
    Reviewed by: Andy Fiddaman <andy@omnios.org>
    Reviewed by: Toomas Soome <tsoome@me.com>
    Reviewed by: Dan Cross <cross@oxidecomputer.com>
    Approved by: Robert Mustacchi <rm@fingolfin.org>

Actions

Also available in: Atom PDF