bhyve apicv logic could take more care
While scrutinizing consumers of the bhyve VM shim (vmspace, pmap, etc.), my attention was drawn to how the VMX logic handles changes in APICv state during an instance's lifetime. Of particular concern is the transition from xAPIC to x2APIC mode, where access to the MMIO range is cut off and the x2APIC MSRs are "enabled". Currently, bhyve both disables the APIC MMIO access control (PROCBASED2_VIRTUALIZE_APIC_ACCESSES) and removes the MMIO mapping for the APIC access page from the EPT tables. With those done, the x2APIC MSRs are enabled in the MSR bitmap, and the PROCBASED2_VIRTUALIZE_X2APIC_MODE control is set to activate the x2APIC acceleration bits. One thing to note is that the unmapping of the APIC access page and the modification of the MSR bitmap are VM-wide operations (the vmspace and the bitmap apply to the whole guest). The former could cause issues if any other vCPUs, which may still be in xAPIC mode, attempt to access the APIC via MMIO, and the latter is explicitly prohibited by the SDM:
Certain fields in the VMCS point to external data structures (for example: the MSR bitmap, the I/O bitmaps). If a logical processor is in VMX non-root operation, none of the external structures referenced by that logical processor's current VMCS should be modified by any logical processor or DMA. Before updating one of these structures, the VMM must ensure that no logical processor whose current VMCS references the structure is in VMX non-root operation.
Other VMMs have common solutions to both of these issues:
1. Leave the APIC access page mapped for the life of the VM. Disabling the VIRTUALIZE_APIC_ACCESSES control on a per-vCPU basis is adequate to prevent APIC MMIO accesses from being processed normally. It's true that MMIO reads/writes would then land on that physical page of memory, but the SDM notes that with x2APIC enabled the MMIO interface would show "Behavior identical to xAPIC in globally disabled state". This is not a corner of the architecture we expect guests to explore.
2. Use a per-vCPU MSR bitmap. The x2APIC state can then be set individually on each vCPU as it makes those transitions, and the MSR bitmap will never be altered while the associated VMCS is being used to run that vCPU.
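
As a rough illustration of the second approach, here is a minimal, self-contained C sketch of a per-vCPU MSR bitmap. It is not the bhyve implementation: struct vcpu_msr_bitmap and the msr_bitmap_* helpers are hypothetical names invented for this example. The layout follows the SDM (one 4KB page, with read intercepts for MSRs 0x0-0x1FFF at offset 0 and write intercepts for the same range at offset 2048; a set bit causes a VM exit). For brevity it passes through the entire x2APIC MSR range (0x800-0x8FF), whereas a real VMM would likely choose individual registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define	MSR_BITMAP_SIZE		4096
#define	MSR_READ_LOW_OFF	0	/* read intercepts, MSRs 0x0-0x1FFF */
#define	MSR_WRITE_LOW_OFF	2048	/* write intercepts, MSRs 0x0-0x1FFF */
#define	MSR_LOW_MAX		0x1FFFu
#define	X2APIC_MSR_FIRST	0x800u
#define	X2APIC_MSR_LAST		0x8FFu

/*
 * Hypothetical per-vCPU bitmap.  In a real VMM this would be a
 * page-aligned allocation whose physical address is written into
 * that vCPU's VMCS, rather than one page shared by the whole VM.
 */
struct vcpu_msr_bitmap {
	uint8_t bits[MSR_BITMAP_SIZE];
};

static_assert(sizeof (struct vcpu_msr_bitmap) == MSR_BITMAP_SIZE,
    "MSR bitmap must be exactly one 4KB page");

/* Start with every MSR access intercepted (all bits set). */
static void
msr_bitmap_init(struct vcpu_msr_bitmap *bm)
{
	memset(bm->bits, 0xff, sizeof (bm->bits));
}

/* Set or clear the read and write intercept bits for one low MSR. */
static void
msr_bitmap_set(struct vcpu_msr_bitmap *bm, uint32_t msr, bool intercept)
{
	const uint8_t mask = (uint8_t)(1u << (msr % 8));
	const uint32_t idx = msr / 8;

	if (msr > MSR_LOW_MAX)
		return;		/* high MSR range omitted from this sketch */
	if (intercept) {
		bm->bits[MSR_READ_LOW_OFF + idx] |= mask;
		bm->bits[MSR_WRITE_LOW_OFF + idx] |= mask;
	} else {
		bm->bits[MSR_READ_LOW_OFF + idx] &= (uint8_t)~mask;
		bm->bits[MSR_WRITE_LOW_OFF + idx] &= (uint8_t)~mask;
	}
}

/*
 * Toggle x2APIC MSR pass-through for a single vCPU.  The caller is
 * assumed to be that vCPU's own thread, outside VMX non-root operation,
 * so the SDM rule about modifying referenced structures is respected.
 */
static void
msr_bitmap_x2apic(struct vcpu_msr_bitmap *bm, bool passthru)
{
	for (uint32_t msr = X2APIC_MSR_FIRST; msr <= X2APIC_MSR_LAST; msr++)
		msr_bitmap_set(bm, msr, !passthru);
}

/* Query whether a read (or write) of a low MSR would cause a VM exit. */
static bool
msr_bitmap_intercepts(const struct vcpu_msr_bitmap *bm, uint32_t msr,
    bool is_write)
{
	const uint32_t off = is_write ? MSR_WRITE_LOW_OFF : MSR_READ_LOW_OFF;

	if (msr > MSR_LOW_MAX)
		return (true);
	return ((bm->bits[off + msr / 8] & (1u << (msr % 8))) != 0);
}
```

Because each vCPU owns its bitmap, switching one vCPU to x2APIC mode leaves the intercept state of every other vCPU untouched, and the update can be made while only that vCPU is known to be out of guest context.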
It should be noted that booting bhyve guests with x2APIC enabled (using the -x flag) does not appear to work today. This is clearly an area of the software which has not received much scrutiny, so functional expectations are low.