Bug #7574

boot slowness followed by panic while booting on KVM when no cpu tag is specified in virsh XML

Added by Daniel Kimmel almost 6 years ago. Updated over 5 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: driver - device drivers
Start date: 2016-11-11
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:

Description

When booting on kvm2 or kvm4, a DE that has no cpu tag takes a very long time to actually boot, and then panics once it comes up. The cpu tag is used to specify the feature set of the virtual processor presented to the VM. Some setting on the virtual processor is likely different than what we expect.


In cpuid.c we have:

    switch (cpi->cpi_vendor) {
    case X86_VENDOR_Intel:
        if (IS_NEW_F6(cpi) || cpi->cpi_family >= 0xf)
            xcpuid++;
        break;

And the following defines:

    #define IS_LEGACY_P6(cpi) (         \
        cpi->cpi_family == 6 &&         \
        (cpi->cpi_model == 1 ||         \
        cpi->cpi_model == 3 ||          \
        cpi->cpi_model == 5 ||          \
        cpi->cpi_model == 6 ||          \
        cpi->cpi_model == 7 ||          \
        cpi->cpi_model == 8 ||          \
        cpi->cpi_model == 0xA ||        \
        cpi->cpi_model == 0xB)          \
    )

    /* A "new F6" is everything with family 6 that's not the above */
    #define IS_NEW_F6(cpi) ((cpi->cpi_family == 6) && !IS_LEGACY_P6(cpi))

The processor presented to us reports family 6 and model 6, so it matches IS_LEGACY_P6 rather than IS_NEW_F6, and xcpuid is never incremented.
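To make the classification concrete, here is a minimal standalone sketch of the check. The struct is a hypothetical two-field stand-in for the kernel's cpuid info, the macros are parenthesized copies of the ones quoted in this ticket, and intel_xcpuid() is an illustrative helper that assumes the Intel condition ends after the family check. It is not the kernel code itself.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's cpuid info (two fields only). */
struct cpuid_info {
    uint32_t cpi_family;
    uint32_t cpi_model;
};

/* Family 6 models that are treated as legacy P6 parts. */
#define IS_LEGACY_P6(cpi) (                 \
    (cpi)->cpi_family == 6 &&               \
    ((cpi)->cpi_model == 1 ||               \
    (cpi)->cpi_model == 3 ||                \
    (cpi)->cpi_model == 5 ||                \
    (cpi)->cpi_model == 6 ||                \
    (cpi)->cpi_model == 7 ||                \
    (cpi)->cpi_model == 8 ||                \
    (cpi)->cpi_model == 0xA ||              \
    (cpi)->cpi_model == 0xB)                \
)

/* A "new F6" is everything with family 6 that's not the above. */
#define IS_NEW_F6(cpi) (((cpi)->cpi_family == 6) && !IS_LEGACY_P6(cpi))

/*
 * Illustrative helper: returns the value xcpuid would have after the
 * Intel case, assuming the condition is only the two checks shown.
 */
static int
intel_xcpuid(const struct cpuid_info *cpi)
{
    int xcpuid = 0;

    if (IS_NEW_F6(cpi) || cpi->cpi_family >= 0xf)
        xcpuid++;
    return (xcpuid);
}
```

Calling intel_xcpuid() on a struct holding family 6 and model 6 leaves xcpuid at 0, while a family 6 model outside the legacy list yields 1.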
Later in cpuid_pass1 we hit the following code:

    if (xcpuid) {
        cp = &cpi->cpi_extd0;
        cp->cp_eax = 0x80000000;
        cpi->cpi_xmaxeax = __cpuid_insn(cp);
    }

Since xcpuid is 0, cpi_xmaxeax is never set, so the code that would normally set the number of physical and virtual address bits is never executed.
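For context, the address sizes are reported in EAX of extended leaf 0x80000008 (bits 7:0 physical, bits 15:8 virtual), and that leaf can only be consulted when the maximum extended leaf is known to reach it. The sketch below shows that dependency. The struct, constant, and function names are hypothetical and the raw register values are passed in as plain integers; it is not the code in cpuid.c.

```c
#include <stdint.h>

/* Extended leaf that reports address sizes. */
#define XCPUID_ADDR_SIZES 0x80000008u

/* Hypothetical result type for this sketch. */
struct addr_bits {
    unsigned pabits;    /* physical address bits */
    unsigned vabits;    /* virtual (linear) address bits */
    int valid;          /* 0 if the leaf could not be consulted */
};

/*
 * xmaxeax:   maximum extended leaf, or 0 if it was never probed
 * leaf8_eax: EAX as returned by leaf 0x80000008
 */
static struct addr_bits
decode_addr_bits(uint32_t xmaxeax, uint32_t leaf8_eax)
{
    struct addr_bits ab = { 0, 0, 0 };

    /* With xmaxeax left at 0 this branch is taken and nothing is set. */
    if (xmaxeax < XCPUID_ADDR_SIZES)
        return (ab);

    ab.pabits = leaf8_eax & 0xff;
    ab.vabits = (leaf8_eax >> 8) & 0xff;
    ab.valid = 1;
    return (ab);
}
```

With xmaxeax at 0 the result is marked invalid regardless of what the hardware would have reported, which is the situation described above.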


Root cause: the code does not allow the extended cpuid passes to examine the processor properly. The fix checks whether we are on KVM and, if so, increments xcpuid anyway when the processor reports family 6 and model 6, as it does in this configuration. I'll also entertain the notion that we should always increment xcpuid on KVM, but better safe than too permissive, I think. This change allows cpuid passes 1-4 to identify the processor information correctly. Once the information is correct, the VA gap is in the right place, so boot_mapin's argument aligns with the gap, and the system is no longer slow.
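The shape of the fix can be sketched as a single predicate. This is only an illustration of the logic described above: the on_kvm flag stands in for whatever hypervisor detection the kernel actually uses, and should_probe_xcpuid() and is_legacy_p6() are hypothetical names, not functions from cpuid.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Family 6 models treated as legacy P6, per the list in this ticket. */
static bool
is_legacy_p6(uint32_t family, uint32_t model)
{
    if (family != 6)
        return (false);

    switch (model) {
    case 1: case 3: case 5: case 6:
    case 7: case 8: case 0xA: case 0xB:
        return (true);
    default:
        return (false);
    }
}

/*
 * Decide whether the extended cpuid leaves should be probed on an
 * Intel-vendor processor. on_kvm is an assumed input for this sketch.
 */
static bool
should_probe_xcpuid(uint32_t family, uint32_t model, bool on_kvm)
{
    /* Existing rule: new family 6 parts and family 0xf or later. */
    if (family >= 0xf || (family == 6 && !is_legacy_p6(family, model)))
        return (true);

    /* Narrow exception: family 6, model 6, but only under KVM. */
    return (on_kvm && family == 6 && model == 6);
}
```

Keeping the exception limited to family 6, model 6 means other legacy P6 models are still skipped under KVM, which matches the "better safe than too permissive" choice.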
