Project

General

Profile

Bug #13253

bhyve stumbles on APIC ICRLO

Added by Patrick Mooney 6 months ago. Updated 6 months ago.

Status:
Closed
Priority:
Normal
Category:
bhyve
Start date:
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
bhyve
Gerrit CR:

Description

While testing an experimental bootrom inside a bhyve VM, I observed a panic:

panic[cpu15]/thread=fffffe59e7f3c460:
bad pending MMIO state

fffffe007b721560 vmm:vmm_zsd_key+37dddca1 ()
fffffe007b7215d0 vmm:vm_handle_mmio_emul+150 ()
fffffe007b721690 vmm:vm_run+43a ()
fffffe007b721b40 vmm:vmmdev_do_ioctl+1173 ()
fffffe007b721cb0 vmm:vmm_ioctl+17c ()
fffffe007b721cf0 genunix:cdev_ioctl+2b ()
fffffe007b721d40 specfs:spec_ioctl+45 ()
fffffe007b721dd0 genunix:fop_ioctl+5b ()
fffffe007b721ef0 genunix:ioctl+153 ()
fffffe007b721f00 unix:brand_sys_syscall+304 ()

This was due to blowing the assert in vie_exitinfo:

void
vie_exitinfo(const struct vie *vie, struct vm_exit *vme)
{
        if (vie->status & VIES_MMIO) {
                vme->exitcode = VM_EXITCODE_MMIO;
                if (vie->mmio_req_read.state == VR_PENDING) {
                        vme->u.mmio.gpa = vie->mmio_req_read.gpa;
                        vme->u.mmio.data = 0;
                        vme->u.mmio.bytes = vie->mmio_req_read.bytes;
                        vme->u.mmio.read = 1;
                } else if (vie->mmio_req_write.state == VR_PENDING) {
                        vme->u.mmio.gpa = vie->mmio_req_write.gpa;
                        vme->u.mmio.data = vie->mmio_req_write.data &
                            vie_size2mask(vie->mmio_req_write.bytes);
                        vme->u.mmio.bytes = vie->mmio_req_write.bytes;
                        vme->u.mmio.read = 0;
                } else {
                        panic("bad pending MMIO state");
                }

Further investigation into the dump showed that the vCPU was writing 0xc4500 to the ICRLO register on its APIC. This corresponds to an INIT IPI using the destination shorthand of all-excluding-self. The existing ICRLO logic does not handle that, and indicates a return-to-userspace via a negative return code. The instruction emulation logic propagates this value until the vm_handle_mmio_emul receives it and attempts to populate exitinfo as if the instruction emulation required an MMIO exit. While it's true that the existing bhyve kernel logic cannot handle such a condition today, the result of a panic is not ideal. The instruction emulation logic should be updated to properly handle return-to-userspace results from MMIO (and inout) handlers, turning them into VM_EXITCODE_INST_EMUL exits.

Separately, the vLAPIC handling for IPIs should be improved. This is true both for destination shorthand and the various INIT/SIPI aspects.


Related issues

Related to illumos gate - Bug #13030: remove 'retu' pattern from bhyveClosedPatrick Mooney

Actions
#1

Updated by Patrick Mooney 6 months ago

  • Related to Bug #13030: remove 'retu' pattern from bhyve added
#2

Updated by Electric Monk 6 months ago

  • Gerrit CR set to 1016
#3

Updated by Patrick Mooney 6 months ago

Testing with the proposed patch, I can confirm that the previously fatal workload now results in a INST_EMUL vm exit. Further attempts to drive the instance forward (without clearing the instruction emulation context) re-emits the INST_EMUL exit, as expected.

#4

Updated by Patrick Mooney 6 months ago

As a verification, I also ran the normal suite of test guests (Windows, OpenBSD, FreeBSD, Linux, OmniOS) to confirm that this change did not modify their behavior. To no surprise, all booted and ran normally.

#5

Updated by Patrick Mooney 6 months ago

Mike Zeller successfully tested this (along with a few other bhyve patches) under SmartOS on an AMD Epyc machine.

#6

Updated by Electric Monk 6 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

git commit 06524cf4e4f6c18e557fb244d42a8e47dba3b1b6

commit  06524cf4e4f6c18e557fb244d42a8e47dba3b1b6
Author: Patrick Mooney <pmooney@pfmooney.com>
Date:   2020-10-29T22:08:40.000Z

    13253 bhyve stumbles on APIC ICRLO
    Reviewed by: Toomas Soome <tsoome@me.com>
    Reviewed by: Andy Fiddaman <andy@omniosce.org>
    Approved by: Robert Mustacchi <rm@fingolfin.org>

Also available in: Atom PDF