Bug #13253
closedbhyve stumbles on APIC ICRLO
100%
Description
While testing an experimental bootrom inside a bhyve VM, I observed a panic:
panic[cpu15]/thread=fffffe59e7f3c460: bad pending MMIO state fffffe007b721560 vmm:vmm_zsd_key+37dddca1 () fffffe007b7215d0 vmm:vm_handle_mmio_emul+150 () fffffe007b721690 vmm:vm_run+43a () fffffe007b721b40 vmm:vmmdev_do_ioctl+1173 () fffffe007b721cb0 vmm:vmm_ioctl+17c () fffffe007b721cf0 genunix:cdev_ioctl+2b () fffffe007b721d40 specfs:spec_ioctl+45 () fffffe007b721dd0 genunix:fop_ioctl+5b () fffffe007b721ef0 genunix:ioctl+153 () fffffe007b721f00 unix:brand_sys_syscall+304 ()
This was due to blowing the assert in vie_exitinfo
:
void vie_exitinfo(const struct vie *vie, struct vm_exit *vme) { if (vie->status & VIES_MMIO) { vme->exitcode = VM_EXITCODE_MMIO; if (vie->mmio_req_read.state == VR_PENDING) { vme->u.mmio.gpa = vie->mmio_req_read.gpa; vme->u.mmio.data = 0; vme->u.mmio.bytes = vie->mmio_req_read.bytes; vme->u.mmio.read = 1; } else if (vie->mmio_req_write.state == VR_PENDING) { vme->u.mmio.gpa = vie->mmio_req_write.gpa; vme->u.mmio.data = vie->mmio_req_write.data & vie_size2mask(vie->mmio_req_write.bytes); vme->u.mmio.bytes = vie->mmio_req_write.bytes; vme->u.mmio.read = 0; } else { panic("bad pending MMIO state"); }
Further investigation into the dump showed that the vCPU was writing 0xc4500
to the ICRLO
register on its APIC. This corresponds to an INIT IPI using the destination shorthand of all-excluding-self. The existing ICRLO logic does not handle that, and indicates a return-to-userspace via a negative return code. The instruction emulation logic propagates this value until the vm_handle_mmio_emul
receives it and attempts to populate exitinfo as if the instruction emulation required an MMIO exit. While it's true that the existing bhyve kernel logic cannot handle such a condition today, the result of a panic is not ideal. The instruction emulation logic should be updated to properly handle return-to-userspace results from MMIO (and inout) handlers, turning them into VM_EXITCODE_INST_EMUL
exits.
Separately, the vLAPIC handling for IPIs should be improved. This is true both for destination shorthand and the various INIT/SIPI aspects.
Related issues
Updated by Patrick Mooney almost 3 years ago
- Related to Bug #13030: remove 'retu' pattern from bhyve added
Updated by Patrick Mooney almost 3 years ago
Testing with the proposed patch, I can confirm that the previously fatal workload now results in a INST_EMUL
vm exit. Further attempts to drive the instance forward (without clearing the instruction emulation context) re-emits the INST_EMUL
exit, as expected.
Updated by Patrick Mooney almost 3 years ago
As a verification, I also ran the normal suite of test guests (Windows, OpenBSD, FreeBSD, Linux, OmniOS) to confirm that this change did not modify their behavior. To no surprise, all booted and ran normally.
Updated by Patrick Mooney almost 3 years ago
Mike Zeller successfully tested this (along with a few other bhyve patches) under SmartOS on an AMD Epyc machine.
Updated by Electric Monk almost 3 years ago
- Status changed from In Progress to Closed
- % Done changed from 0 to 100
git commit 06524cf4e4f6c18e557fb244d42a8e47dba3b1b6
commit 06524cf4e4f6c18e557fb244d42a8e47dba3b1b6 Author: Patrick Mooney <pmooney@pfmooney.com> Date: 2020-10-29T22:08:40.000Z 13253 bhyve stumbles on APIC ICRLO Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: Andy Fiddaman <andy@omniosce.org> Approved by: Robert Mustacchi <rm@fingolfin.org>