Bug #14664

closed

bhyve missing triple-fault handling for VMX

Added by Patrick Mooney 5 months ago. Updated 5 months ago.

Status: Closed
Priority: Normal
Category: bhyve
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:
External Bug:

Description

When running the Windows kernel debugger under propolis, we tripped over an unhandled VMX exit on Intel hardware:

thread 'vcpu-0' panicked at 'VMX error: VmxDetail { status: 0, exit_reason: 2, exit_qualification: 0, inst_type: 0, inst_error: 0 }', propolis/src/lib.rs:133:17

Exit reason 2 corresponds to a triple fault. A look at the relevant logic in vmx_exit_process shows that we lack a handler for it. It should be translated into a vm-suspend of type triple-fault.
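
As a rough illustration of the translation described above, here is a minimal, self-contained sketch. It is not the actual bhyve code: every `sketch_`/`SKETCH_` name is hypothetical and invented for this example. The only facts taken from outside this ticket are that the basic exit reason occupies the low 16 bits of the VMX exit-reason field and that basic reason 2 means triple fault.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Basic VMX exit reason 2 is "triple fault". */
#define	SKETCH_EXIT_REASON_TRIPLE_FAULT	2u

/* Hypothetical stand-ins for the real suspend types and VM state. */
typedef enum {
	SKETCH_SUSPEND_NONE = 0,
	SKETCH_SUSPEND_TRIPLEFAULT
} sketch_suspend_how_t;

typedef enum {
	SKETCH_UNHANDLED = 0,
	SKETCH_HANDLED
} sketch_disposition_t;

typedef struct {
	sketch_suspend_how_t suspend_how;
} sketch_vm_t;

/*
 * Hypothetical exit dispatcher: a triple-fault exit is turned into a
 * suspend request instead of being reported as an unexpected VMX exit.
 */
static sketch_disposition_t
sketch_exit_process(sketch_vm_t *vm, uint32_t exit_reason_field)
{
	/* The basic exit reason lives in bits 15:0 of the field. */
	uint32_t basic = exit_reason_field & 0xffffu;

	assert(vm != NULL);

	switch (basic) {
	case SKETCH_EXIT_REASON_TRIPLE_FAULT:
		vm->suspend_how = SKETCH_SUSPEND_TRIPLEFAULT;
		return (SKETCH_HANDLED);
	default:
		/* Everything else is left for other handlers. */
		return (SKETCH_UNHANDLED);
	}
}
```

With this shape, the consumer sees a suspend with a triple-fault reason, as it would on AMD, rather than a raw exit it has no way to interpret.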

Actions #1

Updated by Patrick Mooney 5 months ago

After whipping up a little unit test for this, which runs clean on AMD, I can see the same problematic behavior on Intel:

Unexpected VMX exit:
        %rip: 800010
        status: 0
        reason: 2
        qualification: 0
        inst_type: 0
        inst_error: 0
FAIL triple_fault

Here, a ud2a executed with an otherwise empty IDT causes an unhandled VMX exit, rather than a vm-suspend (with how = triple-fault).
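
For readers wondering why a lone ud2a ends in a triple fault, the following is a simplified model of x86 exception escalation. It is not the test's source. It assumes that every delivery attempt through the empty IDT itself fails with #GP, and it uses a reduced version of the benign/contributory/page-fault classification. The function and type names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { VEC_UD = 6, VEC_DF = 8, VEC_GP = 13, VEC_PF = 14 };

typedef enum {
	CLASS_BENIGN,
	CLASS_CONTRIBUTORY,
	CLASS_PAGE_FAULT
} exc_class_t;

/* Reduced classification: #DE, #TS, #NP, #SS, #GP are contributory. */
static exc_class_t
classify(int vec)
{
	switch (vec) {
	case 0: case 10: case 11: case 12: case 13:
		return (CLASS_CONTRIBUTORY);
	case VEC_PF:
		return (CLASS_PAGE_FAULT);
	default:
		return (CLASS_BENIGN);
	}
}

/*
 * Model a guest whose IDT is empty, so each delivery attempt raises
 * #GP. Stores the number of failed delivery attempts in *attempts and
 * returns true once a failed #DF delivery ends in a triple fault.
 */
static bool
escalates_to_triple_fault(int first_vec, int *attempts)
{
	int pending = first_vec;

	assert(attempts != NULL);
	*attempts = 0;

	for (;;) {
		(*attempts)++;
		if (pending == VEC_DF) {
			/* A fault while delivering #DF: shutdown. */
			return (true);
		}
		if (classify(pending) == CLASS_BENIGN) {
			/* Benign first fault: the #GP is handled serially. */
			pending = VEC_GP;
		} else {
			/* Contributory or page fault followed by #GP: #DF. */
			pending = VEC_DF;
		}
	}
}
```

In this model, starting from #UD (vector 6), the pending exception goes from #UD to #GP to #DF, so the third failed delivery is the one that ends in a triple fault.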

Actions #2

Updated by Patrick Mooney 5 months ago

With the proposed fix applied, the new test passes, along with the rest:

Test: /opt/bhyve-tests/tests/mevent/vnode_zvol (run as root)      [00:02] [PASS]
Test: /opt/bhyve-tests/tests/inst_emul/rdmsr (run as root)        [00:00] [PASS]
Test: /opt/bhyve-tests/tests/inst_emul/wrmsr (run as root)        [00:00] [PASS]
Test: /opt/bhyve-tests/tests/kdev/vatpit_freq (run as root)       [00:00] [PASS]
Test: /opt/bhyve-tests/tests/kdev/vhpet_freq (run as root)        [00:00] [PASS]
Test: /opt/bhyve-tests/tests/kdev/vlapic_freq (run as root)       [00:00] [PASS]
Test: /opt/bhyve-tests/tests/kdev/vlapic_freq_periodic (run as root) [00:00] [PASS]
Test: /opt/bhyve-tests/tests/kdev/vlapic_mmio_access (run as root) [00:00] [PASS]
Test: /opt/bhyve-tests/tests/kdev/vlapic_msr_access (run as root) [00:00] [PASS]
Test: /opt/bhyve-tests/tests/kdev/vpmtmr_freq (run as root)       [00:00] [PASS]
Test: /opt/bhyve-tests/tests/mevent/lists_delete (run as root)    [00:00] [PASS]
Test: /opt/bhyve-tests/tests/mevent/read_disable (run as root)    [00:00] [PASS]
Test: /opt/bhyve-tests/tests/mevent/read_pause (run as root)      [00:00] [PASS]
Test: /opt/bhyve-tests/tests/mevent/read_requeue (run as root)    [00:00] [PASS]
Test: /opt/bhyve-tests/tests/mevent/vnode_file (run as root)      [00:09] [PASS]
Test: /opt/bhyve-tests/tests/vmm/fpu_getset (run as root)         [00:00] [PASS]
Test: /opt/bhyve-tests/tests/vmm/mem_devmem (run as root)         [00:00] [PASS]
Test: /opt/bhyve-tests/tests/vmm/mem_partial (run as root)        [00:00] [PASS]
Test: /opt/bhyve-tests/tests/vmm/mem_seg_map (run as root)        [00:00] [PASS]

Results Summary
PASS      19

Running Time:   00:00:14
Percent passed: 100.0%
Log directory:  /var/tmp/test_results/20220430T044518

Actions #3

Updated by Electric Monk 5 months ago

  • Gerrit CR set to 2132

Actions #4

Updated by Patrick Mooney 5 months ago

With the default runfile updated to include the triple_fault test, I ran the whole suite on both AMD and Intel gear, and it passed on both:

Test: /opt/bhyve-tests/tests/mevent/vnode_zvol (run as root)      [00:02] [PASS]
Test: /opt/bhyve-tests/tests/inst_emul/rdmsr (run as root)        [00:00] [PASS]
Test: /opt/bhyve-tests/tests/inst_emul/wrmsr (run as root)        [00:00] [PASS]
Test: /opt/bhyve-tests/tests/inst_emul/triple_fault (run as root) [00:00] [PASS]
Test: /opt/bhyve-tests/tests/kdev/vatpit_freq (run as root)       [00:00] [PASS]
Test: /opt/bhyve-tests/tests/kdev/vhpet_freq (run as root)        [00:00] [PASS]
Test: /opt/bhyve-tests/tests/kdev/vlapic_freq (run as root)       [00:00] [PASS]
Test: /opt/bhyve-tests/tests/kdev/vlapic_freq_periodic (run as root) [00:00] [PASS]
Test: /opt/bhyve-tests/tests/kdev/vlapic_mmio_access (run as root) [00:00] [PASS]
Test: /opt/bhyve-tests/tests/kdev/vlapic_msr_access (run as root) [00:00] [PASS]
Test: /opt/bhyve-tests/tests/kdev/vpmtmr_freq (run as root)       [00:00] [PASS]
Test: /opt/bhyve-tests/tests/mevent/lists_delete (run as root)    [00:00] [PASS]
Test: /opt/bhyve-tests/tests/mevent/read_disable (run as root)    [00:00] [PASS]
Test: /opt/bhyve-tests/tests/mevent/read_pause (run as root)      [00:00] [PASS]
Test: /opt/bhyve-tests/tests/mevent/read_requeue (run as root)    [00:00] [PASS]
Test: /opt/bhyve-tests/tests/mevent/vnode_file (run as root)      [00:09] [PASS]
Test: /opt/bhyve-tests/tests/vmm/fpu_getset (run as root)         [00:00] [PASS]
Test: /opt/bhyve-tests/tests/vmm/interface_version (run as root)  [00:00] [PASS]
Test: /opt/bhyve-tests/tests/vmm/mem_devmem (run as root)         [00:00] [PASS]
Test: /opt/bhyve-tests/tests/vmm/mem_partial (run as root)        [00:00] [PASS]
Test: /opt/bhyve-tests/tests/vmm/mem_seg_map (run as root)        [00:00] [PASS]

Results Summary
PASS      21

Running Time:   00:00:13
Percent passed: 100.0%
Log directory:  /var/tmp/test_results/20220503T160529

In addition, I smoke-tested a full guest to make sure there weren't any glaring regressions there. It all seemed fine.

Actions #5

Updated by Electric Monk 5 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

git commit 83b49c54d9c0766e810b6c8ff849dfb6693fc68a

commit  83b49c54d9c0766e810b6c8ff849dfb6693fc68a
Author: Patrick Mooney <pmooney@pfmooney.com>
Date:   2022-05-03T20:23:50.000Z

    14664 bhyve missing triple-fault handling for VMX
    Reviewed by: Luqman Aden <luqman@oxide.computer>
    Reviewed by: Andy Fiddaman <andy@omnios.org>
    Approved by: Dan McDonald <danmcd@mnx.io>
