Bug #12755

Double fault when booting under Amazon EC2

Added by Andrew Stormont 5 months ago. Updated 2 months ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

gcpu_init_ident_intc tries to enable the PPIN and assumes that the enable succeeds. Other platforms double-check that it actually took effect. The subsequent attempt to read the PPIN causes things to blow up.


Files

panic1.png (499 KB) Backtrace, Andrew Stormont, 2020-05-19 08:04 PM
#1

Updated by Robert Mustacchi 5 months ago

Can you share which of the MSRs it is or isn't emulating as expected there? I would have expected them not to emulate any of the MSRs and to behave closer to QEMU. I assume this is on a Nitro-specific instance?

#2

Updated by Andrew Stormont 5 months ago

The issue is that it advertises support for the PPIN and shows it as unlocked, but attempts to enable it don't stick. This is not specific to Nitro (I've not tested Nitro).
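The failure mode described above can be sketched in user space as follows. This is only an illustration under stated assumptions, not the code from the fix: `struct fake_cpu`, `fake_rdmsr_ppin_ctl`, `fake_wrmsr_ppin_ctl`, and `read_ppin_checked` are hypothetical names, and the bit values assume the commonly documented Intel layout of the PPIN control MSR (bit 0 lock, bit 1 enable). The point is the re-read after the write: if the enable bit is not observed, the PPIN register is never touched.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of the PPIN control MSR: bit 0 locks it, bit 1 enables PPIN. */
#define	PPIN_CTL_LOCKED		0x1ULL
#define	PPIN_CTL_ENABLED	0x2ULL

/* Hypothetical model of the guest-visible PPIN registers. */
struct fake_cpu {
	uint64_t ppin_ctl;	/* current contents of the control MSR */
	bool writes_stick;	/* false models a hypervisor that drops writes */
	uint64_t ppin;		/* value readable once PPIN is enabled */
};

static uint64_t
fake_rdmsr_ppin_ctl(const struct fake_cpu *cpu)
{
	return (cpu->ppin_ctl);
}

static void
fake_wrmsr_ppin_ctl(struct fake_cpu *cpu, uint64_t val)
{
	/* A write that is silently ignored leaves the register unchanged. */
	if (cpu->writes_stick)
		cpu->ppin_ctl = val;
}

/*
 * Returns true and stores the PPIN in *out only when the enable bit is
 * observed set; otherwise returns false without reading the PPIN.
 */
static bool
read_ppin_checked(struct fake_cpu *cpu, uint64_t *out)
{
	uint64_t ctl = fake_rdmsr_ppin_ctl(cpu);

	if ((ctl & PPIN_CTL_ENABLED) == 0) {
		/* Locked and disabled: enabling is not possible. */
		if ((ctl & PPIN_CTL_LOCKED) != 0)
			return (false);

		fake_wrmsr_ppin_ctl(cpu, PPIN_CTL_ENABLED);

		/* Re-read instead of assuming the write took effect. */
		ctl = fake_rdmsr_ppin_ctl(cpu);
		if ((ctl & PPIN_CTL_ENABLED) == 0)
			return (false);
	}

	*out = cpu->ppin;
	return (true);
}
```

With `writes_stick` set to false and `ppin_ctl` initially 0, the model matches the reported behavior (advertised, unlocked, enable does not stick) and `read_ppin_checked` returns false rather than going on to read the PPIN.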

#3

Updated by Robert Mustacchi 5 months ago

OK. I'd be curious to understand more, as the on_fault/no_fault logic in the CMI MSR code should cause us to return with no value in that case, but clearly something's going wrong. If you have more details to share, I'd appreciate it, as then I can potentially prototype a fix.

#4

Updated by Andrew Stormont 5 months ago

  • Description updated (diff)
#5

Updated by Andrew Stormont 5 months ago

Here is the fix that we've been using: https://code.illumos.org/c/illumos-gate/+/679

#6

Updated by Andrew Stormont 5 months ago

#7

Updated by Robert Mustacchi 5 months ago

So I agree that double-checking things is useful, and in the face of a bunch of hypervisor games it is something we should do anyway. But we do want to figure out why the on_fault/no_fault isn't working, because the way this should work is that the rdmsr should be checked and eventually indicate that the cmi operation should fail, which would return NULL.
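The expected behavior described here, where a faulting MSR read is caught and turned into an error return, can be approximated in user space with `setjmp`/`longjmp`. This is only an analogy, not the kernel's on_fault/no_fault implementation: `fault_handler`, `raise_fault`, `fake_rdmsr`, `checked_rdmsr`, and the `FAKE_MSR_*` numbers are all hypothetical names invented for this sketch.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical MSR numbers used only by this sketch. */
#define	FAKE_MSR_OK		0x10U
#define	FAKE_MSR_UNIMPLEMENTED	0x20U

/* Non-NULL while a protected region is active. */
static jmp_buf *fault_handler;

/* Stand-in for a fault taken on rdmsr: unwind to the protected region. */
static void
raise_fault(void)
{
	if (fault_handler != NULL)
		longjmp(*fault_handler, 1);
	abort();	/* a fault outside any protected region is fatal */
}

static uint64_t
fake_rdmsr(uint32_t msr)
{
	if (msr == FAKE_MSR_UNIMPLEMENTED)
		raise_fault();
	return (0xabcdULL);
}

/*
 * Returns 0 and stores the value in *valp on success, or -1 if the read
 * faulted. The caller is expected to give up (for example, return NULL)
 * on failure instead of using *valp.
 */
static int
checked_rdmsr(uint32_t msr, uint64_t *valp)
{
	jmp_buf env;
	uint64_t val;

	if (setjmp(env) != 0) {
		/* Reached via longjmp: the read faulted. */
		fault_handler = NULL;
		return (-1);
	}

	fault_handler = &env;		/* arm the protected region */
	val = fake_rdmsr(msr);
	fault_handler = NULL;		/* disarm it again */

	*valp = val;
	return (0);
}
```

A caller would check the return value of `checked_rdmsr` before using the result; when the read of `FAKE_MSR_UNIMPLEMENTED` fails, `*valp` is left untouched.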

#8

Updated by Electric Monk 4 months ago

  • Gerrit CR set to 679
#9

Updated by Andrew Stormont 3 months ago

This has been submitted for integration.

#10

Updated by Electric Monk 2 months ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

commit  b445c7c6f2f09c2296534b5ccda2c05321c474b1
Author: Andrew Stormont <astormont@racktopsystems.com>
Date:   2020-08-17T14:33:49.000Z

    12755 Double fault when booting under Amazon EC2
    Reviewed by: Robert Mustacchi <rm@fingolfin.org>
    Reviewed by: Toomas Soome <tsoome@me.com>
    Reviewed by: Patrick Mooney <pmooney@pfmooney.com>
    Approved by: Dan McDonald <danmcd@joyent.com>
