Bug #13059
Dell R630 - X2APIC enabled causes boot hang
Description
After purchasing two R630s, I've discovered that this server is non-bootable with illumos.
The error appears to occur when X2APIC is enabled in the BIOS. The error is "No SOF interrupts have been received this usb ehci hostcontroller is unusable", and then the system just hangs.
If I turn off X2APIC, the system boots, but I then lose HyperThreading capabilities, which is bearable but not optimal.
I am seeing the same symptoms when trying to boot into FreeBSD and variants.
Linux is the only OS that boots successfully, and I am not a big fan of it. If there is anything I can do to assist, I'm happy to do so, including lending the hardware if required.
All BIOS/Firmware are up to date.
ISO: omniosce-r151034l.iso
Updated by Robert Mustacchi 7 months ago
Apologies for not seeing this earlier. Something that might be useful here is to boot with kmdb enabled and enter it on an NMI. There are a couple of odd things that we should take apart.
In general, hyper-threading and the x2apic historically haven't been related. At least on other Dell RX30 systems we haven't seen that combination happen. In this case, we should probably focus on the x2apic hang first. What would be useful is to figure out whether we can drop into the kernel debugger when this happens. There are two approaches to try. The first is to use the IPMI serial console and try to inject a break (usually ~b), or to inject an NMI via something like ipmitool chassis power diag. This generally requires using loader to change the kmdb options.
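As a rough sketch of the break and NMI injection mentioned above, the following helper wraps the two ipmitool invocations. The host name, user name, the DRY_RUN switch and the function names are placeholders made up for illustration, not values from this report. It assumes the BMC is reachable over the lanplus interface and that the password is supplied in the IPMI_PASSWORD environment variable (read by ipmitool's -E option). It only covers the IPMI side; the kmdb options still have to be set from loader separately.

```shell
#!/bin/sh
# Hypothetical helper for the IPMI side of the debugging steps.
# BMC_HOST and BMC_USER are placeholders; override them in the environment.
BMC_HOST="${BMC_HOST:-bmc.example.com}"
BMC_USER="${BMC_USER:-root}"
# DRY_RUN=1 (the default here) only prints the command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

ipmi() {
    # -I lanplus: IPMI v2.0 over LAN; -E: take the password from IPMI_PASSWORD.
    if [ "$DRY_RUN" = "1" ]; then
        echo "ipmitool -I lanplus -H $BMC_HOST -U $BMC_USER -E $*"
    else
        ipmitool -I lanplus -H "$BMC_HOST" -U "$BMC_USER" -E "$@"
    fi
}

# Attach to the serial over LAN console. Once attached, a break is sent
# with a tilde escape (type ~? in the session to list the escapes).
open_console() {
    ipmi sol activate
}

# Ask the BMC to pulse a diagnostic interrupt (NMI) to the host.
inject_nmi() {
    ipmi chassis power diag
}
```

With the file sourced, running inject_nmi prints the command that would be sent; setting DRY_RUN=0 and exporting IPMI_PASSWORD runs it against the BMC for real.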
Updated by Robert Mustacchi 7 months ago
OK, that's good to hear. I'd recommend that you try to do this over the Dell Serial over LAN console if possible, as I've found that smoother than the Dell VGA console. It shouldn't matter which way you inject the NMI, though.
Updated by Jürgen Bereuter 6 months ago
The error appears to be when X2APIC is enabled in BIOS; the error: No SOF interrupts have been received this usb ehci hostcontroller is unusable and then just hangs.
I am experiencing nearly the same behaviour on an HPE DL20 Gen10 server. It does not matter whether X2APIC or HPET are disabled or not. The internal RAID controller (E208i-a SR Gen10) shows no disks during the installation of OmniOS (bloody). I tried another HBA in the PCIe slot. With it, the 2 disks are visible, but after installation the system won't boot, and the SOF warnings/errors are in the log. The internal USB port is not recognised when there is a USB stick in it.
The server never freezes; only the SOF messages appear on the display (and there is no boot, or disk).
Updated by Gary Mills 6 months ago
I reported a similar bug in 2017. It's still open. The bug report is here:
https://www.illumos.org/issues/8684
Here's my workaround:
I found a workaround that enabled me to boot and run the hipster BE that I upgraded last month. The BIOS of this system contained an item called `HPET Support'. It was enabled. I disabled it. After that change, the SOF error messages did not appear. The USB keyboard and USB mouse worked normally.
Updated by Jürgen Bereuter 6 months ago
Unfortunately, switching HPET off in the BIOS makes no difference on the DL20: the USB memory stick in the internal USB port is still not recognised. The keyboard and mouse are working (with HPET on); they are attached to the front USB ports.
Updated by Peter Kelm 3 months ago
Not sure whether this helps, but we see the same here on our new Microservers (Gen 10+).