Project

General

Profile

Actions

Bug #13132

closed

VMX event injection can race in bhyve

Added by Patrick Mooney 11 months ago. Updated 8 months ago.

Status:
Closed
Priority:
Normal
Category:
bhyve
Start date:
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
bhyve
Gerrit CR:

Description

As part of porting bhyve to illumos, all of its mutex usage was switched to using adaptive mutexes rather than spinlocks. The latter were used in FreeBSD due to OS constraints around sleeping inside critical_enter()/critical_exit() regions. No such limitation was present in illumos, primarily due to the fact that we had to use ctxops as extra protection inside those regions (implemented with kpreempt_disable/kpreempt_enable instead), since we could not guarantee an absence of operations causing lock-related sleep/swtch activity. As a result of the locks being converted to adaptive, VMX exception/interrupt injection was moved from a region where interrupts were disabled, to one where they remained enabled. This is because certain exception/interrupt code paths required lock activity to completed. The FreeBSD code was depending on the host vCPU doing this injection to have interrupts disabled, since external notifications (IPIs) for new events would be queued, causing an immediate VM exit once it attempted entry into guest context. Any injection done outside of this interrupts-disabled guard was at risk of taking the notification, processing the effectively-noop IPI, and then continuing into VM entry without knowing an event was pending. The delay on this event would then depend on how long the CPU spent in guest context before reevaluating the events available for injections.

This has not been a reported (in the wild) problem so far, probably due to the relative prevalence of APICv support among the machines running illumos-bhyve. That's because APICv event injection is performed without locks, and is therefore done after host CPU interrupts are disabled, safe from the potential race. Only events such as NMIs or ExtINT (source from the AT-PIC or IOAPIC) would be caught by this race, given the current codebase. Certain restructuring for #13007 could have exacerbated it by changing how/when APIC interrupts were injected.

The fix for this, while perhaps a bit involved, should be relatively straightforward. The existing NMI/ExtINT event injection can remain outside the protection of having interrupts disabled on the host CPU. Once interrupts are disabled, however, we must re-check for NMI/ExtINT events which may have fallen into that race window. If found, we can take another trip through the event injection logic.


Related issues

Related to illumos gate - Bug #13133: SVM event injection can race in bhyveDuplicatePatrick Mooney

Actions
Related to illumos gate - Bug #13259: SVM event injection can race in bhyveClosedPatrick Mooney

Actions
Related to illumos gate - Bug #13007: bhyve vlapic should set TMR on intr acceptClosedPatrick Mooney

Actions
Actions #1

Updated by Patrick Mooney 11 months ago

  • Description updated (diff)
Actions #2

Updated by Patrick Mooney 11 months ago

  • Related to Bug #13133: SVM event injection can race in bhyve added
Actions #3

Updated by Electric Monk 11 months ago

  • Gerrit CR set to 835
Actions #4

Updated by Patrick Mooney 9 months ago

  • Related to Bug #13259: SVM event injection can race in bhyve added
Actions #5

Updated by Patrick Mooney 8 months ago

As noted in the description, instances of this race causing a problem have not been observed in the wild or during testing, for that matter. It was discovered while overhauling the event injection system for #13007. Testing to verify that injection still function as before is covered under that ticket.

Actions #6

Updated by Patrick Mooney 8 months ago

  • Related to Bug #13007: bhyve vlapic should set TMR on intr accept added
Actions #7

Updated by Electric Monk 8 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

git commit c74a40a584c9d875009f725565896fd7e8ee38d6

commit  c74a40a584c9d875009f725565896fd7e8ee38d6
Author: Patrick Mooney <pmooney@pfmooney.com>
Date:   2020-11-24T20:30:25.000Z

    13007 bhyve vlapic should set TMR on intr accept
    13106 clarify PPR transitions in bhyve vLAPIC
    13132 VMX event injection can race in bhyve
    13259 SVM event injection can race in bhyve
    Reviewed by: Robert Mustacchi <rm@fingolfin.org>
    Reviewed by: Toomas Soome <tsoome@me.com>
    Approved by: Dan McDonald <danmcd@joyent.com>

Actions

Also available in: Atom PDF