Bug #16402

open

Linode VM panics on boot

Added by Joshua M. Clulow 2 months ago. Updated 2 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Start date:
Due date:
% Done:

0%

Estimated time:
Difficulty:
Medium
Tags:
Gerrit CR:
External Bug:

Description

Using OmniOS r151046, in a 1 CPU, 1GB RAM virtual machine on Linode, we panic in early boot. The serial console web thing is not fantastic but I've been able to lift out some particulars:

> ::msgbuf
...
mem = 1048048K (0x3ff7c000)
TSC calibrated using PIT; freq is 2299 MHz
...
Using default device instance data
WARNING: Request for too much kernel memory (808337408 bytes), will hang forever

panic[cpu0]/thread=fffffffffbca07a0:
mutex_enter: bad mutex, lp=c0 owner=f000ff53f000ff50 thread=fffffffffbca07a0

Warning - stack not written to the dump buffer
fffffffffbca5ec0 unix:mutex_panic+54 ()
fffffffffbca5f30 unix:mutex_vector_enter+40f ()
fffffffffbca5fc0 genunix:timeout_generic+72 ()
fffffffffbca5ff0 genunix:timeout_default+54 ()
fffffffffbca6030 genunix:delay_common+a2 ()
fffffffffbca6080 genunix:delay+40 ()
fffffffffbca6160 unix:page_create_va+82 ()
fffffffffbca61f0 unix:segkmem_page_create+a3 ()
fffffffffbca6290 unix:segkmem_xalloc+150 ()
fffffffffbca6300 unix:segkmem_alloc_vn+3c ()
fffffffffbca6330 unix:segkmem_alloc+20 ()
fffffffffbca6450 genunix:vmem_xalloc+4e9 ()
fffffffffbca64c0 genunix:vmem_alloc+139 ()
fffffffffbca6500 genunix:kmem_alloc+153 ()
fffffffffbca6520 unix:smb_alloc+1a ()
fffffffffbca65b0 unix:smbios_open+2ae ()
fffffffffbca6640 unix:startup_modules+10a ()
fffffffffbca6650 unix:startup+5a ()
fffffffffbca6690 genunix:main+36 ()
fffffffffbca66a0 unix:_locore_start+88 ()
[0]>
Actions #1

Updated by Joshua M. Clulow 2 months ago

The panic stack with arguments:

[0]> $C
fffffffffbc913a0 kmdb_enter+0xb()
fffffffffbc913d0 debug_enter+0x75(fffffffffb942888)
fffffffffbc914c0 panicsys+0x616(fffffffffb96737b, fffffffffbca5e48, fffffffffbc914d0, 1)
fffffffffbca5e30 vpanic+0x15c()
fffffffffbca5ea0 0xfffffffffb8ab251()
fffffffffbca5ec0 mutex_panic+0x54(fffffffffb96868d, c0)
fffffffffbca5f30 mutex_vector_enter+0x40f(c0)
fffffffffbca5fc0 timeout_generic+0x72(1, fffffffffb991460, fffffffffbca07a0, 0, f4240, 0)
fffffffffbca5ff0 timeout_default+0x54(fffffffffb991460, fffffffffbca07a0, 3b9aca00)
fffffffffbca6030 delay_common+0xa2(3b9aca00)
fffffffffbca6080 delay+0x40(3b9aca00)
fffffffffbca6160 page_create_va+0x82(fffffffffbcda880, fffffe01e63a3000, 302e4000, 13, fffffffffbca6170, fffffe01e63a3000)
fffffffffbca61f0 segkmem_page_create+0xa3(fffffe01e63a3000, 302e4000, 0, fffffffffbcda880)
fffffffffbca6290 segkmem_xalloc+0x150(fffffe01e4207000, 0, 302e4000, 0, 0, fffffffffb8af520, fffffffffbcda880)
fffffffffbca6300 segkmem_alloc_vn+0x3c(fffffe01e4207000, 302e4000, 0, fffffffffbcda880)
fffffffffbca6330 segkmem_alloc+0x20(fffffe01e4207000, 302e4000, 0)
fffffffffbca6450 vmem_xalloc+0x4e9(fffffe01e420d000, 302e3320, 1000, 0, 0, 0, 0, fffffe0100000000)
fffffffffbca64c0 vmem_alloc+0x139(fffffe01e420d000, 302e3320, 0)
fffffffffbca6500 kmem_alloc+0x153(302e3320, 0)
fffffffffbca6520 smb_alloc+0x1a(302e3320)
fffffffffbca65b0 smbios_open+0x2ae(0, 306, 0, 0)
fffffffffbca6640 startup_modules+0x10a()
fffffffffbca6650 startup+0x5a()
fffffffffbca6690 main+0x36()          
fffffffffbca66a0 _locore_start+0x88()

Note that the number complained about (808337408 bytes) is 0x302e4000 in hex, which we see in the segkmem_alloc() arguments in the stack.
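The arithmetic can be checked with a short sketch using only values from the output above. The 4 KiB page size is an assumption, taken from the 0x1000 alignment argument visible in the vmem_xalloc() frame, and the variable names are purely illustrative:

```python
# Values copied from the ::msgbuf output and the $C stack above.
PAGE_SIZE = 0x1000             # assumption: 4 KiB pages (the 1000 in the vmem_xalloc() frame)
requested = 0x302e3320         # size passed to smb_alloc() / kmem_alloc()
warned = 808337408             # byte count from the "too much kernel memory" warning
mem_bytes = 1048048 * 1024     # "mem = 1048048K" from the message buffer


def round_up(n: int, align: int) -> int:
    """Round n up to the next multiple of align (align must be a power of two)."""
    return (n + align - 1) & ~(align - 1)


page_rounded = round_up(requested, PAGE_SIZE)
fraction = warned / mem_bytes

print(hex(page_rounded))       # size after rounding up to a whole page
print(hex(mem_bytes))          # total memory reported at boot
print(f"{fraction:.1%}")       # share of total memory the single allocation wants
```

If the assumptions hold, the page-rounded request equals the number in the warning, and that single allocation is roughly three quarters of the VM's 1GB of memory.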

I have nopped out smbios_open():

[0]> smbios_open/v 31 c0 c3           
smbios_open:    0x55    =       0x31
smbios_open+1:  0x48    =       0xc0
smbios_open+2:  0x89    =       0xc3
[0]> smbios_open::dis
smbios_open:                    xorl   %eax,%eax
smbios_open+2:                  ret    
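For readers unfamiliar with the byte values, here is a minimal sketch of what the three patched bytes encode. It is a hand-rolled decoder that handles only the two opcodes used in this patch (0x31 with a register-to-register ModRM byte, and 0xc3), not a general disassembler; the function name is hypothetical:

```python
# Register names indexed by the 3-bit register fields of the ModRM byte.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_patch(code: bytes) -> list[str]:
    """Decode a byte string consisting only of 'xor r/m32, r32' and 'ret'."""
    out = []
    i = 0
    while i < len(code):
        op = code[i]
        if op == 0x31:                      # XOR r/m32, r32
            modrm = code[i + 1]
            mod = modrm >> 6                # 3 means register-direct operand
            reg = (modrm >> 3) & 7          # source register
            rm = modrm & 7                  # destination register
            if mod != 3:
                raise ValueError("only register-to-register forms are handled")
            out.append(f"xorl %{REG32[reg]},%{REG32[rm]}")
            i += 2
        elif op == 0xC3:                    # near return
            out.append("ret")
            i += 1
        else:
            raise ValueError(f"unhandled opcode {op:#x}")
    return out


patch = bytes.fromhex("31 c0 c3")           # the bytes written with /v above
decoded = decode_patch(patch)
print(decoded)
```

The decoded result matches the ::dis output: the function now zeroes its return register and returns immediately, so callers see a NULL handle instead of reaching the oversized allocation.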

We then get stuck trying to locate the root disk:

NOTICE: Cannot read the pool label from '/pseudo/lofi@1:b'
NOTICE: spa_import_rootpool: error 5
Cannot mount root on /pseudo/lofi@1:b fstype zfs

panic[cpu0]/thread=fffffffffbca07a0: 
vfs_mountroot: cannot mount root

Warning - stack not written to the dump buffer
fffffffffbca6650 fffffffffbaf3567 ()
fffffffffbca6690 genunix:main+137 ()
fffffffffbca66a0 unix:_locore_start+88 ()

This is probably because it's exposing a Virtio SCSI device and I may not have put that driver in this image:

fffffe01e5c4dc28 i86pc (driver name: rootnex)
    fffffe01e6307e28 scsi_vhci, instance #0 (driver name: scsi_vhci)
    fffffe01e5c4d958 ramdisk, instance #0 (driver name: ramdisk)
    fffffe01e5c4d688 pci, instance #0 (driver name: pci)
        fffffe01e5c4d3b8 pci8086,29c0, instance #0 (driver name: agptarget)
        fffffe01e5c4d0e8 pciclass,030000, instance #0 (driver name: vgatext)
        fffffe01e5c4ce18 pci1af4,8 (driver not attached) -------------------------- vioscsi should go here
        fffffe01e5c4cb48 pci1af4,8 (driver not attached)
        fffffe01e5c4c878 pci1af4,1, instance #0 (driver name: vioif)
        fffffe01e5c4c5a8 pciclass,060100, instance #0 (driver name: isa)
            fffffe01e5c465b0 i8042, instance #0 (driver name: i8042)
                fffffe01e5c462e0 pnpPNP,303 (driver not attached)
                fffffe01e5c46010 pnpPNP,f03 (driver not attached)
            fffffe01e6308c38 asy, instance #0 (driver name: asy)
            fffffe01e6300c48 pit_beep, instance #0 (driver name: pit_beep)
        fffffe01e5c4c2d8 pciclass,010601, instance #0 (driver name: ahci)
        fffffe01e5c4c008 pci1af4,1100 (driver not attached)

I switched the "configuration profile" to use "Full virtualisation" instead of "Paravirtualisation", which appears to tell the hypervisor to expose the disks as ancient IDE devices. After nopping out smbios_open() again, this at least booted to a login prompt!

Actions #2

Updated by Jorge Schrauwen 2 months ago

I hit this too (it was my prod VM, so I had to move it to FreeBSD).

It seems they did recently bump their QEMU-KVM and Google SeaBIOS versions, but they were unwilling to provide more info, aside from saying that it's not exactly clean vanilla upstream.

OmniOS bloody does have the vioscsi driver, but it's not in any of the install images.

When I originally installed, I did it via Full Virt, then installed the vioscsi driver, touched the reconfigure file, switched to para virt, and booted again.
