Bug #14526

open

illumos guest hangs on reboot under QEMU 6.0.0

Added by Joshua M. Clulow 6 months ago. Updated 11 days ago.

Status: New
Priority: Normal
Assignee: -
Category: kernel
Start date:
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

I installed an OpenIndiana guest in QEMU on my workstation, which runs Ubuntu 21.10. The guest is doing BIOS boot. Every time I try to reboot the guest, it hangs hard. Asking QEMU to force a reset of the guest appears to do nothing. This is probably at least partly a QEMU or guest firmware bug, but I thought I would file it anyway.

I'm not sure what package versions on Ubuntu are relevant, but perhaps these at least:

$ apt list | grep -Ei '(seabios|vgabios|qemu-efi|qemu-system-x86)/'
qemu-efi/impish,impish 2021.08~rc0-2 all
qemu-system-x86/impish-updates,now 1:6.0+dfsg-2expubuntu1.1 amd64 [installed,automatic]
seabios/impish,impish,now 1.14.0-2 all [installed,automatic]
vgabios/impish,impish 0.7b+ds-1 all
$ /usr/bin/qemu-system-x86_64 --version
QEMU emulator version 6.0.0 (Debian 1:6.0+dfsg-2expubuntu1.1)
Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers

Actions #1

Updated by Joshua M. Clulow 6 months ago

Here is what I have done to inspect the guest so far. I first edited the libvirt domain configuration XML:

# virsh edit oitest

If the <domain> node does not have the qemu XML namespace defined, add it so that it looks like:

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>

Then, append QEMU arguments to start a GDB server:

...
  <qemu:commandline>
    <qemu:arg value='-gdb'/>
    <qemu:arg value='tcp::54541'/>
  </qemu:commandline>
</domain>
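
The same edit can be scripted instead of made by hand in virsh edit. The following is a minimal sketch, assuming the domain XML is already available as a string (for example from virsh dumpxml). The helper name add_gdb_server and the tiny example domain are hypothetical and used only for illustration. The namespace URI, the -gdb argument and the port are the ones shown above.

```python
import xml.etree.ElementTree as ET

# Namespace URI as used in the <domain> element above.
QEMU_NS = "http://libvirt.org/schemas/domain/qemu/1.0"


def add_gdb_server(domain_xml: str, port: int = 54541) -> str:
    """Return domain XML with <qemu:commandline> args that start a GDB server.

    Hypothetical helper: it reuses an existing <qemu:commandline> element if
    one is present, and does not check whether -gdb was already added.
    """
    # Make the serializer emit the "qemu" prefix rather than ns0.
    ET.register_namespace("qemu", QEMU_NS)
    root = ET.fromstring(domain_xml)
    cmdline = root.find(f"{{{QEMU_NS}}}commandline")
    if cmdline is None:
        cmdline = ET.SubElement(root, f"{{{QEMU_NS}}}commandline")
    for value in ("-gdb", f"tcp::{port}"):
        ET.SubElement(cmdline, f"{{{QEMU_NS}}}arg", {"value": value})
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily trimmed domain definition, for illustration only.
example = "<domain type='kvm'><name>oitest</name></domain>"
patched = add_gdb_server(example)
print(patched)
```

ElementTree adds the xmlns:qemu declaration to the root element on output because the namespace is in use, so the namespace step and the argument step collapse into one. The result would still need to be loaded back into libvirt, which this sketch does not attempt.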

One can then attach GDB and look at the registers at least:

(gdb) target remote localhost:54541
Remote debugging using localhost:54541
warning: No executable has been specified and target does not support
determining executable automatically.  Try using the "file" command.
0xfffffffffb859f32 in ?? ()
(gdb) thread 1
[Switching to thread 1 (Thread 1.1)]
#0  0xfffffffffb859f32 in ?? ()
(gdb) info registers rip
rip            0xfffffffffb859f32  0xfffffffffb859f32
(gdb) thread 2
[Switching to thread 2 (Thread 1.2)]
#0  0xfffffffffb884658 in ?? ()
(gdb) info registers rip
rip            0xfffffffffb884658  0xfffffffffb884658

While we do not have symbols easily in this context, our program text addresses are often fairly predictable, at least for things in unix or genunix, etc. After a reboot, I asked MDB:

> 0xfffffffffb859f32::dis
mach_cpu_pause+0x20:            pause
mach_cpu_pause+0x22:            movzbl (%rbx),%eax
mach_cpu_pause+0x25:            testb  %al,%al

> 0xfffffffffb884658::dis
efi_reset+4:                    lidt   (%rsp)
efi_reset+8:                    int    $0x0
efi_reset+0xa:                  cli
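
The address-to-symbol step that MDB performs here can be illustrated with a small lookup. This is only a sketch: the two symbol start addresses are not taken from a real symbol table, they are derived from the output above by subtracting the reported offset from each sampled %rip, and the function name resolve is hypothetical.

```python
import bisect

# Symbol start addresses derived from the MDB output above
# (sampled %rip minus the offset MDB reported). Illustrative only,
# not a real kernel symbol table.
SYMBOLS = sorted([
    (0xfffffffffb859f32 - 0x20, "mach_cpu_pause"),
    (0xfffffffffb884658 - 0x4, "efi_reset"),
])
STARTS = [addr for addr, _ in SYMBOLS]


def resolve(addr: int) -> str:
    """Return 'symbol+0xoff' for the nearest symbol starting at or below addr."""
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0:
        # Address is below every known symbol; print it raw.
        return hex(addr)
    base, name = SYMBOLS[i]
    off = addr - base
    return f"{name}+{off:#x}" if off else name


for rip in (0xfffffffffb859f32, 0xfffffffffb884658):
    print(f"{rip:#x} -> {resolve(rip)}")
```

Because the table carries no symbol sizes, any address above a symbol's start resolves to that symbol, so this only gives sensible answers for addresses that really fall inside the listed functions.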

This seems pretty plausible. From the source for efi_reset():

4:
        /
        / port 0xcf9 failed also.  Last-ditch effort is to
        / triple-fault the CPU.
        / Also, use triple fault for EFI firmware
        /
        ENTRY(efi_reset)
        pushq   $0x0
        pushq   $0x0            / IDT base of 0, limit of 0 + 2 unused bytes
        lidt    (%rsp)
        int     $0x0            / Trigger interrupt, generate triple-fault

        cli
        hlt                     / Wait forever
        /*NOTREACHED*/

Ostensibly we have tried to triple fault the guest, and it has become stuck, which feels like it is on QEMU. On the other hand, we should probably be doing something else that actually works before we get down here.

Actions #2

Updated by Joshua M. Clulow 6 months ago

  • Description updated

Actions #3

Updated by Joshua M. Clulow 6 months ago

Unlike reboot, poweroff does at least seem to stop the VM.

Actions #4

Updated by Garrett D'Amore about 2 months ago

FYI: I have rebooted under KVM / qemu on CentOS 8 numerous times in the past week, no problem.

Actions #5

Updated by Joshua M. Clulow 11 days ago

What QEMU version ships on your CentOS system? Also, are you using BIOS or UEFI boot in the guest? This still appears to occur on my updated Ubuntu system:

$ qemu-system-x86_64 --version
QEMU emulator version 6.2.0 (Debian 1:6.2+dfsg-2ubuntu6.3)
Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers

In this case it's a legacy/BIOS boot VM.

Actions #6

Updated by Joshua M. Clulow 11 days ago

This also seems to affect new droplets (guests) created at DigitalOcean. It's hard to know what they're doing exactly as it's all a proprietary service, but it seems to be BIOS boot (not UEFI) and from the SMBIOS data:

...
ID    SIZE TYPE
768   40   SMB_TYPE_CHASSIS (type 3) (system enclosure or chassis)

  Manufacturer: QEMU
  Version: pc-i440fx-6.0

Which seems like it might be QEMU 6.0.X?
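
For reference, the version embedded in that SMBIOS string can be pulled out mechanically. This is a rough sketch with a hypothetical helper name, and it only recognises the pc-i440fx and pc-q35 naming patterns. Note the caveat that a versioned machine type is a compatibility label: a newer QEMU can also present an older machine type, so the number is best read as a hint about the minimum QEMU version rather than proof of the exact one.

```python
import re


def machine_type_version(smbios_version: str):
    """Extract (major, minor) from a versioned QEMU machine type name.

    Hypothetical helper; returns None if the string does not look like
    'pc-i440fx-X.Y' or 'pc-q35-X.Y'.
    """
    m = re.fullmatch(r"pc-(?:i440fx|q35)-(\d+)\.(\d+)", smbios_version.strip())
    return (int(m.group(1)), int(m.group(2))) if m else None


# Version string taken from the SMBIOS chassis record above.
print(machine_type_version("pc-i440fx-6.0"))
```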
