Support #9663

No illumos distro will boot

Added by LarBob N 5 months ago. Updated 5 months ago.

Status: New
Start date: 2018-07-18
Priority: High
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -
Tags: needs-triage

Description

I have an MSI H110i Pro motherboard that has a fully updated UEFI/BIOS. The CPU I'm using is a Celeron G3900 (Skylake).
Any illumos distro hangs on the SunOS Release screen and never boots. When booting with -v, it gets as far as:
ramdisk0 at root
ramdisk0 is /ramdisk
root on /ramdisk:a fstype ufs

The issue seems similar to this:
[[https://github.com/joyent/smartos-live/issues/727]]
However, no good solution seems to have been provided there.

Thanks.

History

#1 Updated by Jason King 5 months ago

I've been troubleshooting a similar early boot hang on SmartOS (though all illumos distributions are likely impacted). While there are no guarantees the issue I'm working on is the same issue, I have SmartOS test images (i.e. these aren't an official release) with the likely fix for my issue available, if you'd like to give them a try:

https://us-east.manta.joyent.com/jbk/public/OS-7079/platform-20180719T001516Z.iso
https://us-east.manta.joyent.com/jbk/public/OS-7079/platform-20180719T001516Z.usb.bz2

One note -- these are set to enter KMDB upon NMI (the SmartOS default is to panic on NMI, however since the hang I've been troubleshooting happens before a dump device is configured, that doesn't help much). It's sufficient if the installer starts up -- there is no need to actually install SmartOS if not desired. My system (a Supermicro 5028D-TN4T) is able to trip this hang pretty easily, though others seem less likely to.

I'm still testing this out on various platforms to verify the proposed fix doesn't introduce any regressions on other systems (after which it'll get integrated into SmartOS and then upstreamed into illumos-gate so other distributions can benefit from the fix). If you do try it out, please let me know the results one way or another.

#2 Updated by LarBob N 5 months ago

Same issue persists. Tested with a USB flash drive and a SATA optical drive.

#3 Updated by Jason King 5 months ago

Drat :(

Depending on how interested you are in debugging this, you could try booting with KMDB (add '-k' to the boot arguments at the grub prompt) and see if pressing F1-A drops you into KMDB after the system hangs. If it does, that might allow some investigation into what the system is hung on, and I can see about coming up with some commands to try.
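
As a rough sketch only, assuming a GRUB legacy menu entry of the usual illumos form: highlight the entry at the GRUB menu, press 'e' to edit it, and append the flags to the kernel line. The kernel path shown here is an assumption and differs between distros (SmartOS images use their own path and -B properties), so keep whatever is already on the line and only add the flags at the end.

```
kernel$ /platform/i86pc/kernel/$ISADIR/unix -k -v
```

Here -k loads kmdb at boot and -v enables verbose boot messages. Using -kd instead of -k additionally stops at the kmdb prompt before the kernel starts.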

#4 Updated by LarBob N 5 months ago

That won't work as the keyboard doesn't even get initialized so I won't be able to drop into KMDB.

#5 Updated by LarBob N 5 months ago

I should say reinitialized. Of course the keyboard is working before the SunOS Release screen, but then it turns off.

#6 Updated by LarBob N 5 months ago

Alright, so I got the keyboard to stay on by enabling Windows 7 Installer in the BIOS, but even with KMDB + verbose I can't drop into KMDB. Pressing F1+A, ESC+B, and Ctrl+Pause all do nothing.

#7 Updated by Robert Mustacchi 5 months ago

We've spent some time root-causing at least the class of hangs that we're seeing on that smartos-live issue. With luck, yours will be the same class of ACPI-related issue. Probably the most helpful way to tell is to change moddebug to 0x40000000 at the initial kmdb prompt. There are some other ways to tell, but unfortunately those require a modified image. With that value set, we always saw that we had tried, but failed, to load an ACPI table (which is expected); seeing the same thing will likely suggest that you're on a similar path / case.
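
A minimal sketch of that step, assuming the system was booted with -kd so that it stops at the initial kmdb prompt. The prompt text is illustrative. `/W` is the usual mdb/kmdb syntax for writing a 32-bit value to a symbol, and `:c` continues the boot:

```
[0]> moddebug/W 0x40000000
[0]> :c
```

With verbose boot (-v) also enabled, any module load failures should then be reported on the console as the boot proceeds.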

#8 Updated by LarBob N 5 months ago

I can't load KMDB as the keyboard gets uninitialized. If I load KMDB with -kd then I can't exit it because you can't unload boot-loaded kmdb (or at least, that's what it tells me).
