Bug #13361
Closed
bhyve should mask RDT cpuid info
100%
Description
While bhyve currently masks the RDT_A bit (Resource Director Technology) from leaf 0x7 EBX, it has no such filter for the detailed information in leaf 0x10. An OS which ignores the lack of RDT_A could still find seemingly actionable capabilities advertised in leaf 0x10 and then try to act on them. I believe this is the cause of some dmesg noise seen on CentOS 8 running on newer (Rome) AMD CPUs under bhyve:
[ 1.122722] unchecked MSR access error: WRMSR to 0xc8f (tried to write 0x0000000000000000) at rIP: 0xffffffffa9c64f74 (native_write_msr+0x4/0x20)
[ 1.125304] Call Trace:
[ 1.125810] clear_closid_rmid.isra.4+0x32/0x40
[ 1.126740] resctrl_online_cpu+0xcd/0x4c0
[ 1.127574] ? __switch_to_asm+0x41/0x70
[ 1.128387] ? __switch_to_asm+0x35/0x70
[ 1.129207] ? __switch_to_asm+0x41/0x70
[ 1.130024] ? cat_wrmsr+0x60/0x60
[ 1.130710] ? sort_range+0x20/0x20
[ 1.131427] cpuhp_invoke_callback+0x8d/0x500
[ 1.132318] ? sort_range+0x20/0x20
[ 1.133047] cpuhp_thread_fun+0xb0/0x110
[ 1.133841] smpboot_thread_fn+0xc5/0x160
[ 1.134669] kthread+0x112/0x130
[ 1.135340] ? kthread_flush_work_fn+0x10/0x10
[ 1.136256] ret_from_fork+0x35/0x40
[ 1.137612] resctrl: L3 monitoring detected
MSR 0xc8f is IA32_PQR_ASSOC, used for some of those resource partitioning tasks.
Since we are not going to give guests access to influence host cache allocation, the related capability bits in cpuid should be hidden.
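The masking described above can be sketched roughly as follows. This is only an illustration, not the code from the actual fix: the names cpuid_regs and mask_rdt_cpuid are hypothetical, and zeroing all of leaf 0x10 is an assumption about what "hiding" the capability bits would look like. The RDT_A position (leaf 0x7, subleaf 0, EBX bit 15) follows the Intel SDM definition.

```c
#include <assert.h>
#include <stdint.h>

/* CPUID.(EAX=0x7,ECX=0):EBX[15] is RDT_A (allocation capability). */
#define	CPUID_7_EBX_RDT_A	(1u << 15)

/* Hypothetical container for the four registers returned by CPUID. */
struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/*
 * Hypothetical filter applied to host CPUID results before they are
 * shown to a guest.  Leaves unrelated to RDT allocation pass through.
 */
static void
mask_rdt_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_regs *regs)
{
	switch (leaf) {
	case 0x7:
		/* Hide the top-level RDT_A feature bit. */
		if (subleaf == 0)
			regs->ebx &= ~CPUID_7_EBX_RDT_A;
		break;
	case 0x10:
		/*
		 * Assumption: report nothing at all for the RDT
		 * allocation enumeration leaf, for every subleaf.
		 */
		regs->eax = regs->ebx = regs->ecx = regs->edx = 0;
		break;
	default:
		break;
	}
}

/* Usage example: a host advertising RDT_A and L3 CAT. */
static void
example_masked_guest_view(void)
{
	struct cpuid_regs l7 = { 0, CPUID_7_EBX_RDT_A | 0x1u, 0, 0 };
	struct cpuid_regs l10 = { 0, 0x2u, 0, 0 };

	mask_rdt_cpuid(0x7, 0, &l7);
	mask_rdt_cpuid(0x10, 0, &l10);

	assert((l7.ebx & CPUID_7_EBX_RDT_A) == 0);
	assert(l10.ebx == 0);
}
```

Filtering both leaves matters because a guest that skips the RDT_A check would otherwise still see resource bits in leaf 0x10 and try to program the related MSRs.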
Related issues
Updated by Patrick Mooney over 1 year ago
Linux also lists the RDT-related flags in cpuinfo on that system (prior to a fix): cat_l3 cdp_l3
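As a rough illustration of how those flags can show up without rdt_a, here is a sketch that derives the flag names from raw CPUID register values. The bit positions are the ones I understand Linux and the Intel SDM to use (rdt_a: leaf 0x7 subleaf 0 EBX bit 15; cat_l3: leaf 0x10 subleaf 0 EBX bit 1; cdp_l3: leaf 0x10 subleaf 1 ECX bit 2), and it assumes the AMD implementation enumerates them the same way. The function name rdt_flag_names is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define	RDT_A_BIT	(1u << 15)	/* CPUID.(EAX=0x7,ECX=0):EBX[15] */
#define	CAT_L3_BIT	(1u << 1)	/* CPUID.(EAX=0x10,ECX=0):EBX[1] */
#define	CDP_L3_BIT	(1u << 2)	/* CPUID.(EAX=0x10,ECX=1):ECX[2] */

/*
 * Hypothetical helper: build a space-separated list of RDT flag
 * names from the three registers where those bits are enumerated.
 * Each flag is evaluated independently of the others.
 */
static void
rdt_flag_names(uint32_t leaf7_ebx, uint32_t leaf10_0_ebx,
    uint32_t leaf10_1_ecx, char *buf, size_t len)
{
	(void) snprintf(buf, len, "%s%s%s",
	    (leaf7_ebx & RDT_A_BIT) ? "rdt_a " : "",
	    (leaf10_0_ebx & CAT_L3_BIT) ? "cat_l3 " : "",
	    (leaf10_1_ecx & CDP_L3_BIT) ? "cdp_l3 " : "");
}

/* Usage example: RDT_A masked in leaf 0x7, leaf 0x10 left unfiltered. */
static void
example_unfiltered_leaf10(void)
{
	char buf[32];

	rdt_flag_names(0, CAT_L3_BIT, CDP_L3_BIT, buf, sizeof (buf));
	assert(strcmp(buf, "cat_l3 cdp_l3 ") == 0);
}
```

With RDT_A cleared but leaf 0x10 passed through, a consumer that evaluates each bit on its own ends up with cat_l3 and cdp_l3 but no rdt_a, matching the pre-fix observation.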
Updated by Patrick Mooney over 1 year ago
- Subject changed from bhyve should mask RDT leaf to bhyve should mask RDT cpuid info
Updated by Patrick Mooney over 1 year ago
An initial test of the fix shows the guest no longer attempting to perform the wrmsr(IA32_PQR_ASSOC), and dmesg is free of the aforementioned error.
Updated by Patrick Mooney over 1 year ago
My most modern Intel lab machine (Ivy Bridge) is too old to possess the features in question. Its CPUID leaves stop at 0xd. All the same, I booted a few guests there to make sure there wasn't something horribly wrong.
On my Rome machine, where the RDT pieces are present, the fix prevented Linux from attempting unhandled access to the IA32_PQR_ASSOC MSR. Additionally, the RDT-related capabilities (rdt_a, cat_l3, cdp_l3) were all now absent from /proc/cpuinfo in a Linux guest. The other standard smoke-test guests ran without issues. (It's possible that only Linux was reaching for those RDT features while they were erroneously exposed.)
Updated by Patrick Mooney over 1 year ago
- Related to Bug #13369: bhyve should mask PQoS bits from CPUID added
Updated by Electric Monk over 1 year ago
- Status changed from In Progress to Closed
- % Done changed from 0 to 100
git commit 1a5f1879be09d3de900b2510692dd12003784d84
commit 1a5f1879be09d3de900b2510692dd12003784d84
Author: Patrick Mooney <pmooney@pfmooney.com>
Date: 2020-12-16T20:02:23.000Z

    13361 bhyve should mask RDT cpuid info
    Reviewed by: Andy Fiddaman <andy@omnios.org>
    Reviewed by: Toomas Soome <tsoome@me.com>
    Approved by: Robert Mustacchi <rm@fingolfin.org>