Feature #14024

closed

bhyve vm_suspend should be more flexible

Added by Patrick Mooney 3 months ago. Updated about 2 months ago.

Status:
Closed
Priority:
Normal
Category:
bhyve
Start date:
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
Gerrit CR:

Description

The bhyve kernel component features vm_suspend in its API: a way to indicate that a VM is to progress towards a shutdown point, either for reset or halt (including those induced by a triple fault). This signals all vCPUs to exit from guest context and to mark, in a bitfield in struct vm, that they have suspended execution. Once all active vCPUs have made a trip through vm_run to discover the requested suspend and set the appropriate bit, all the threads return with a VM_EXITCODE_SUSPENDED code. The VM_REINIT operation (used for clearing VM state to run it as if it had been reset) requires that all active vCPUs be in this suspended state before it will execute. If a vCPU thread is in userspace at the time vm_suspend is issued, it must make a trip through vm_run before the reinit can proceed. Any vCPU that attempts a vm_run after it has been successfully suspended (but not yet reinitialized) will receive an error. This complicates the coordination of vCPU threads with operations attempting to prepare the instance for reset.

It would not take a huge effort to make these interfaces easier to use:
1. When performing a vm_suspend, immediately mark as suspended any vCPUs which are not in guest context (or which are sleeping in a HLT).
2. Allow already-suspended vCPUs to call back into vm_run. If all vCPUs are suspended at that time, then VM_EXITCODE_SUSPENDED will be re-emitted for that vCPU.
3. Provide a wait flag for vm_suspend, so the ioctl can block until all active vCPUs are marked as suspended.

This should give userspace the tools to more easily navigate the VM and its vCPU threads into a suspended state for halt or reinitialization.
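
As a rough illustration of items 1 and 2 above, the following is a toy userspace model, not the bhyve kernel code. Every name in it (model_vm, model_vm_suspend, model_vm_run, the MODEL_EXIT_* codes) is hypothetical, and the real implementation blocks and takes locks where this sketch simply returns a "waiting" value. Item 3 (the wait flag) is left out because it only matters with real threads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_VCPUS 32

/* Hypothetical per-vCPU location; not the real bhyve vCPU state enum. */
enum model_vcpu_loc { LOC_USERSPACE, LOC_GUEST, LOC_HLT };

/* Hypothetical exit codes for the model only. */
enum model_exit { MODEL_EXIT_RUNNING, MODEL_EXIT_WAITING, MODEL_EXIT_SUSPENDED };

struct model_vm {
	uint32_t active;		/* bit N set: vCPU N is active */
	uint32_t suspended;		/* bit N set: vCPU N is marked suspended */
	bool suspend_requested;
	enum model_vcpu_loc loc[MODEL_MAX_VCPUS];
};

/* True once a suspend was requested and every active vCPU is marked. */
static bool
model_all_suspended(const struct model_vm *vm)
{
	return (vm->suspend_requested &&
	    (vm->suspended & vm->active) == vm->active);
}

/*
 * Item 1: on suspend, immediately mark every active vCPU that is not
 * executing guest code (in userspace or sleeping in HLT) as suspended.
 */
static void
model_vm_suspend(struct model_vm *vm)
{
	vm->suspend_requested = true;
	for (int i = 0; i < MODEL_MAX_VCPUS; i++) {
		uint32_t bit = UINT32_C(1) << i;

		if ((vm->active & bit) != 0 && vm->loc[i] != LOC_GUEST)
			vm->suspended |= bit;
	}
}

/*
 * Item 2: an already-suspended vCPU may enter again without an error.
 * If all active vCPUs are suspended, the suspended exit is re-emitted.
 */
static enum model_exit
model_vm_run(struct model_vm *vm, int vcpu)
{
	uint32_t bit;

	assert(vcpu >= 0 && vcpu < MODEL_MAX_VCPUS);
	bit = UINT32_C(1) << vcpu;

	if (!vm->suspend_requested) {
		vm->loc[vcpu] = LOC_GUEST;
		return (MODEL_EXIT_RUNNING);
	}

	/* Setting the bit is idempotent, so repeat entries are harmless. */
	vm->suspended |= bit;
	vm->loc[vcpu] = LOC_USERSPACE;
	return (model_all_suspended(vm) ?
	    MODEL_EXIT_SUSPENDED : MODEL_EXIT_WAITING);
}
```

In this model, a vCPU sitting in userspace when the suspend arrives no longer needs a trip through the run path before the VM counts as fully suspended; only vCPUs that were in guest context still have to check in.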

Actions #1

Updated by Patrick Mooney 3 months ago

My thoughts about how to address the challenges of bhyve VM reinitialization have shifted a bit while testing proposed fixes for this issue. The logic for marking vCPUs suspended as soon as possible (for those not running in guest context) seemed to work well. The wait flag for suspend, however, did not seem as valuable. A simpler approach was to make the VM_REINIT ioctl itself capable of marking straggler vCPUs as suspended. Since VM_REINIT acquires a write-lock on the VM, which involves locking all the vCPUs in the FROZEN state, this after-the-fact marking of suspended is safe and easy. Adding a flags field to the VM_REINIT ioctl was the more natural fit.

Actions #2

Updated by Joshua M. Clulow 2 months ago

  • Gerrit CR set to 1660
Actions #3

Updated by Joshua M. Clulow 2 months ago

  • Description updated (diff)
Actions #4

Updated by Joshua M. Clulow 2 months ago

  • Description updated (diff)
Actions #5

Updated by Patrick Mooney 2 months ago

The proposed fix for this adds a VM_REINIT_F_FORCE_SUSPEND flag, available for use in VM_REINIT ioctl calls. There are currently no consumers of this in the gate, but propolis will use it immediately for driving instance resets.
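
To make the effect of the flag concrete, here is a small model of the check it relaxes. This is a sketch under stated assumptions, not the code from the change: the struct, the function, the placeholder flag value, and the choice of EBUSY for the non-forced failure are all hypothetical stand-ins. Only the behavior (reinit requires all active vCPUs to be suspended unless forced) comes from the discussion in this issue.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder value for the model; the real flag is VM_REINIT_F_FORCE_SUSPEND. */
#define MODEL_REINIT_F_FORCE_SUSPEND	(UINT64_C(1) << 0)

/* Hypothetical, heavily reduced stand-in for the VM state. */
struct model_vm_state {
	uint32_t active;	/* bit N set: vCPU N is active */
	uint32_t suspended;	/* bit N set: vCPU N is marked suspended */
};

static int
model_vm_reinit(struct model_vm_state *vm, uint64_t flags)
{
	if ((vm->suspended & vm->active) != vm->active) {
		/* Stragglers exist: refuse unless the caller asked to force. */
		if ((flags & MODEL_REINIT_F_FORCE_SUSPEND) == 0)
			return (EBUSY);	/* assumed error code */

		/*
		 * In the kernel this is safe because reinit holds the VM
		 * write lock, which keeps every vCPU frozen. The model has
		 * no threads, so it just marks the stragglers.
		 */
		vm->suspended |= vm->active;
	}

	/* Reset state: a reinitialized VM has no activated vCPUs. */
	vm->active = 0;
	vm->suspended = 0;
	return (0);
}
```

Without the flag, the model keeps the original requirement that every active vCPU be suspended first; with it, the reinit itself finishes the job.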

Actions #6

Updated by Patrick Mooney 2 months ago

I ran all of the usual-suspects guests on a platform featuring this change. Each one successfully booted, and after initiating a reboot from inside the guest, successfully booted again after a reset. The same held true for an externally triggered "hard" reset using bhyvectl to issue the vm_suspend call. Additionally, a propolis instance (running a Linux guest) went through those same boot/reboot and boot/hard-reset trials, navigating them successfully.

Actions #7

Updated by Patrick Mooney 2 months ago

Although bhyve(1) does not use the forced-reinit, and propolis is configured to quiesce its vCPU threads prior to a reinit (which happens to use the FORCE flag), I also wanted to test the behavior of a userspace consumer which is running vCPU threads when the forced-reinit occurs. To do this, I ran a normal bhyve instance (sans viona, since the ioport hooks preclude reinit) and issued a manual ioctl(VM_REINIT) with the FORCE flag set. All of the vCPU threads in that instance were promptly booted out to userspace, hitting the (expected) EINVAL when attempting re-entry, since the now-reinitialized VM has no VM_ACTIVATE-ed vCPUs.
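
For a consumer that does keep vCPU threads running across a forced reinit, one possible way to handle the EINVAL described above is sketched here. It is illustrative only: model_run_fn is a hypothetical stand-in for whatever call enters the guest (returning 0 on a normal exit, or -1 with errno set), and model_stub_run exists purely to exercise the loop. No real bhyve or propolis interface is used.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the guest-entry call: 0 on exit, -1 + errno. */
typedef int (*model_run_fn)(void *arg);

/*
 * Enter the guest repeatedly until entry fails. Returns the number of
 * successful entries and sets *reinit_seen if the loop ended on EINVAL,
 * which this sketch treats as "the VM was reinitialized underneath us
 * and this vCPU is no longer activated".
 */
static int
model_vcpu_loop(model_run_fn run, void *arg, bool *reinit_seen)
{
	int entries = 0;

	*reinit_seen = false;
	for (;;) {
		if (run(arg) == 0) {
			entries++;
			continue;
		}
		if (errno == EINVAL)
			*reinit_seen = true;
		return (entries);
	}
}

/* Test stub: succeeds *arg times, then fails with EINVAL. */
static int
model_stub_run(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return (0);
	}
	errno = EINVAL;
	return (-1);
}
```

A thread that sees reinit_seen set would then wait for whoever issued the reinit to re-activate its vCPU before trying to enter the guest again.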

Actions #8

Updated by Electric Monk about 2 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

git commit 52fac30e3e977464254b44b1dfb4717fb8d2fbde

commit  52fac30e3e977464254b44b1dfb4717fb8d2fbde
Author: Patrick Mooney <pmooney@pfmooney.com>
Date:   2021-09-28T16:17:31.000Z

    14024 bhyve vm_suspend should be more flexible
    Reviewed by: Dan Cross <cross@oxidecomputer.com>
    Reviewed by: Luqman Aden <luqman@oxide.computer>
    Reviewed by: Joshua M. Clulow <josh@sysmgr.org>
    Approved by: Dan McDonald <danmcd@joyent.com>
