bhyve should shadow %cr0 on AMD
While testing new versions of the uefi-edk2 bootrom under bhyve, folks have noted that it runs very slowly during boot. After transferring control to the OS, the guest appears to run without any performance impact. When testing the same ROM on an Intel system, there is no such slowness. Tracing VM exits did not yield any interesting differences between the two systems. Tracing with the profile- probe to sample the guest %rip showed it spending quite a bit of time (15 seconds on an Epyc machine) in the LZMA decompression step of boot-up. After much digging, a culprit was found: the bootrom was setting %cr0.CD, disabling the cache on the CPU. This is not a problem on Intel because the CD (and NW) bits are ignored in the virtual %cr0. It turns out our emulation for AMD is too accurate, and guests expect CD to be a no-op. A further survey of other hypervisors (KVM, VirtualBox, Xen) shows that they also treat it as one.
As a fix, we should perform %cr0 shadowing, similar to how VMX does it. SVM offers some functionality to avoid exits when toggling often-accessed bits like TS. By masking CD (and NW) from the "real" value used by the CPU in guest context, those newer ROMs and any other code which assumes cache-disable to be a no-op should function as expected, rather than performing like a CPU from the 90s.
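The shadowing idea described above can be sketched roughly as follows. This is only an illustration, not the code from the actual change: the struct, the function names, and CR0_SHADOW_MASK are hypothetical, and only the bit positions of CD and NW are taken from the x86 architecture definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural %cr0 bit positions. */
#define CR0_TS (1ULL << 3)   /* task switched */
#define CR0_NW (1ULL << 29)  /* not write-through */
#define CR0_CD (1ULL << 30)  /* cache disable */

/* Hypothetical name: bits hidden from the CPU while in guest context. */
#define CR0_SHADOW_MASK (CR0_CD | CR0_NW)

/* Hypothetical per-vCPU state for this sketch. */
struct cr0_state {
    uint64_t guest_view; /* what the guest believes %cr0 contains */
    uint64_t hw_value;   /* what the CPU actually runs the guest with */
};

/* Record a guest write, keeping CD/NW out of the "real" value. */
static void cr0_shadow_write(struct cr0_state *s, uint64_t val)
{
    assert(s != NULL);
    s->guest_view = val;
    s->hw_value = val & ~CR0_SHADOW_MASK;
}

/* A guest read must return the value it wrote, shadowed bits included. */
static uint64_t cr0_shadow_read(const struct cr0_state *s)
{
    assert(s != NULL);
    return s->guest_view;
}

/* Usage: the guest sets CD, sees it set, but the cache stays enabled. */
static void cr0_shadow_example(void)
{
    struct cr0_state s = { 0, 0 };

    cr0_shadow_write(&s, CR0_CD | CR0_TS);
    assert(cr0_shadow_read(&s) == (CR0_CD | CR0_TS));
    assert(s.hw_value == CR0_TS);
}
```

In this sketch the guest's own view stays intact, so code that sets CD and later reads it back sees what it expects, while the value in effect for the CPU never has caching disabled.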
Updated by Patrick Mooney 5 months ago
I've booted the typical battery of guest OSes (Linux, OpenBSD, Windows, NetBSD, illumos) on an AMD machine featuring this change. While booting, I traced svm_set_cr0 using dtrace to observe changes in the %cr0 state of those guests. In those traces, I saw the expected %cr0 transitions, like when the shadowed bits were set or cleared, or when any other bits were changed when the shadowed bits were set (since it's in those instances when all changes must be tracked). The guests booted and operated normally. #13338 describes how I checked CLTS behavior as well.
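To make the "when must all changes be tracked" condition concrete, here is a rough sketch of the decision. It is not taken from the patch: the enum and function names are hypothetical, and it assumes that the exit-avoiding SVM mode only skips exits for writes that change nothing but TS or MP, which is my reading of the selective %cr0 write intercept.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural %cr0 bit positions. */
#define CR0_MP (1ULL << 1)
#define CR0_TS (1ULL << 3)
#define CR0_NW (1ULL << 29)
#define CR0_CD (1ULL << 30)

/* Hypothetical name: bits kept out of the value the CPU runs with. */
#define CR0_SHADOW_MASK (CR0_CD | CR0_NW)

/* Hypothetical intercept modes for this sketch. */
enum cr0_intercept {
    CR0_INTERCEPT_SELECTIVE, /* cheap: TS/MP-only writes do not exit */
    CR0_INTERCEPT_ALL        /* every %cr0 access exits to the hypervisor */
};

/*
 * Once a shadowed bit is set, the guest's view and the real value
 * differ, so every access has to be tracked to keep them consistent.
 */
static enum cr0_intercept cr0_intercept_mode(uint64_t guest_view)
{
    if ((guest_view & CR0_SHADOW_MASK) != 0)
        return CR0_INTERCEPT_ALL;
    return CR0_INTERCEPT_SELECTIVE;
}

/* Would the hypervisor observe a write taking old_view to new_view? */
static bool cr0_write_is_tracked(uint64_t old_view, uint64_t new_view)
{
    if (cr0_intercept_mode(old_view) == CR0_INTERCEPT_ALL)
        return true;
    /* Assumed selective behavior: only changes outside TS/MP exit. */
    return ((old_view ^ new_view) & ~(CR0_TS | CR0_MP)) != 0;
}

/* Usage: TS toggles are free until the guest sets CD. */
static void cr0_intercept_example(void)
{
    assert(!cr0_write_is_tracked(0, CR0_TS));
    assert(cr0_write_is_tracked(0, CR0_CD));
    assert(cr0_write_is_tracked(CR0_CD, CR0_CD | CR0_TS));
}
```

Under these assumptions, the traced transitions would be exactly the writes for which cr0_write_is_tracked() returns true.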
Additionally, I've been using this patch along with the Rust-based bhyve userspace during development, since %cr0 shadowing is necessary for performance when booting with a stock OVMF ROM (which sets CR0_CD for reasons unknown).
Updated by Electric Monk 5 months ago
- Status changed from In Progress to Closed
- % Done changed from 0 to 100
commit 7db0d1931e7f4e135600dcbe0f4c5b10c732181e
Author: Patrick Mooney <email@example.com>
Date: 2021-03-04T21:54:38.000Z

13256 bhyve should shadow %cr0 on AMD
13338 bhyve should be able to emulate CLTS
Reviewed by: Toomas Soome <firstname.lastname@example.org>
Reviewed by: Joshua M. Clulow <email@example.com>
Reviewed by: Andy Fiddaman <firstname.lastname@example.org>
Approved by: Richard Lowe <email@example.com>