Bug #6919

VirtualBox guest start crashes OpenIndiana host (/hipster release)

Added by Predrag Zečević almost 4 years ago. Updated over 2 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date: 2016-04-18
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags: needs-triage

Description

Hi all,

Current Illumos version:

$ uname -rosv
SunOS 5.11 illumos-258e862 Solaris

runs VB guests without problems. As soon as I upgrade to something released later (e.g. SunOS Release 5.11 Version illumos-a526879 64-bit) and start a VB guest, the whole system freezes and reboots (it runs fine as long as no VB guest is running).
Syslog shows:
[2016-04-15 15:58:39] solarix savecore: [ID 570001 auth.error] reboot after panic: BAD TRAP: type=e (#pf Page fault) rp=ffffff0012579940 addr=fffffd7ffb2ff660 occurred in module "<unknown>" due to an illegal access to a user address
[2016-04-15 15:58:39] solarix savecore: [ID 105003 auth.error] Saving compressed system crash dump in /export/tmp/crash/vmdump.1
[2016-04-15 15:58:39] solarix genunix: [ID 864859 kern.notice] NOTICE: fmd[987]: missing privilege "sys_mount" (euid = 0, syscall = 255) needed at secpolicy_fs_owner+0x2e#012
[2016-04-15 15:58:39] solarix genunix: [ID 864859 kern.notice] NOTICE: hald-addon-stora[1171]: missing privilege "sys_mount" (euid = 0, syscall = 255) needed at secpolicy_fs_owner+0x2e#012
[2016-04-15 15:58:39] solarix mac: [ID 435574 kern.info] NOTICE: rge0 link up, 1000 Mbps, full duplex
[2016-04-15 15:58:40] solarix /sbin/dhcpagent[124]: [ID 778557 daemon.warning] configure_v4_lease: no IP broadcast specified for rge0, making best guess
[2016-04-15 15:58:40] solarix in.routed[885]: [ID 847162 daemon.error] 4 bytes of routing message left over
[2016-04-15 15:58:40] solarix in.routed[885]: [ID 606928 daemon.warning] sendto(rge0, 224.0.0.2): Network is unreachable
[2016-04-15 15:58:41] solarix ipf: [ID 774698 kern.info] IP Filter: v4.1.9, running.
[2016-04-15 15:58:41] solarix /sbin/dhcpagent[124]: [ID 732872 daemon.error] dhcp_bound_complete: cannot add default router 192.168.222.1 on rge0: File exists
[2016-04-15 15:58:43] solarix savecore: [ID 606186 auth.error] Decompress the crash dump with #012'savecore -vf /export/tmp/crash/vmdump.1'
[2016-04-15 15:58:43] solarix fmd: [ID 377184 daemon.error] SUNW-MSG-ID: SUNOS-8000-KL, TYPE: Defect, VER: 1, SEVERITY: Major#012EVENT-TIME: Fri Apr 15 15:58:43 CEST 2016#012PLATFORM: To-Be-Filled-By-O.E.M., CSN: -, HOSTNAME: solarix#012SOURCE: software-diagnosis, REV: 0.1#012EVENT-ID: e701b777-689f-6be0-952a-9e382a0c23e5#012DESC: The system has rebooted after a kernel panic.  Refer to http://illumos.org/msg/SUNOS-8000-KL for more information.#012AUTO-RESPONSE: The failed system image was dumped to the dump device.  If savecore is enabled (see dumpadm(1M)) a copy of the dump will be written to the savecore directory /export/tmp/crash.#012IMPACT: There may be some performance impact while the panic is copied to the savecore directory.  Disk space usage by panics can be substantial.#012REC-ACTION: If savecore is not enabled then please take steps to preserve the crash image.#012Use 'fmdump -Vp -u e701b777-689f-6be0-952a-9e382a0c23e5' to view more panic detail.  Please refer to the knowledge article for additional information.

So I ran:

$ savecore -vf /export/tmp/crash/vmdump.1
savecore: System dump time: Fri Apr 15 15:57:11 2016

savecore: saving system crash dump in /export/tmp/crash/{unix,vmcore}.1
Constructing namelist /export/tmp/crash/unix.1
Constructing corefile /export/tmp/crash/vmcore.1
 0:02 100% done: 249563 of 249563 pages saved
5814 (2%) zero pages were not written
0:02 dump decompress is done

And tried to analyze it with the debugger:
$ mdb -k unix.1 vmcore.1
mdb: warning: dump is from SunOS 5.11 illumos-a526879; dcmds and macros may not match kernel implementation
Loading modules: [ unix genunix specfs dtrace mac cpu.generic uppc apix scsi_vhci zfs sata sd ip hook neti sockfs arp usba uhci fctl stmf stmf_sbd idm cpc crypto md fcip fcp random lofs ufs logindmux nsmb ptm smbsrv nfs sppp ipc ]

> ::status
debugging crash dump vmcore.1 (64-bit) from solarix
operating system: 5.11 illumos-a526879 (i86pc)
image uuid: e701b777-689f-6be0-952a-9e382a0c23e5
panic message: BAD TRAP: type=e (#pf Page fault) rp=ffffff0012579940 addr=fffffd7ffb2ff660 occurred in module "<unknown>" due to an illegal access to a user address
dump content: kernel pages only

> ::stack
0xfffffffff8d0a225()
0xfffffffff8d08bf8()
0xfffffffff8d933d7()
0xfffffffff8d091ea()
supdrvIOCtl+0x17fd()
VBoxDrvSolarisIOCtl+0x36c()
cdev_ioctl+0x39(11f00000002, c0185687, fffffd7ffcdfed10, 202003, ffffff02f36321c8, ffffff0012579ea8)
spec_ioctl+0x60(ffffff02f6798940, c0185687, fffffd7ffcdfed10, 202003, ffffff02f36321c8, ffffff0012579ea8)
fop_ioctl+0x55(ffffff02f6798940, c0185687, fffffd7ffcdfed10, 202003, ffffff02f36321c8, ffffff0012579ea8)
ioctl+0x9b(11, ffffffffc0185687, fffffd7ffcdfed10)
sys_syscall+0x17a()

> ::msgbuf
MESSAGE                                                               
NOTICE: dbus-daemon[2319]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c

NOTICE: dbus-daemon[2299]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c

NOTICE: dbus-daemon[2319]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c

NOTICE: dbus-daemon[2299]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c

NOTICE: dbus-daemon[2319]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c

NOTICE: dbus-daemon[2299]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c

NOTICE: dbus-daemon[2299]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c

NOTICE: metacity[2314]: missing privilege "proc_owner" (euid = 50, syscall = 5) needed at secpolicy_proc_access+0x20

NOTICE: metacity[2314]: missing privilege "proc_owner" (euid = 50, syscall = 5) needed at secpolicy_proc_access+0x20

NOTICE: dbus-daemon[2299]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c

NOTICE: dbus-daemon[2299]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c

NOTICE: metacity[2314]: missing privilege "proc_owner" (euid = 50, syscall = 5) needed at secpolicy_proc_access+0x20

NOTICE: metacity[2314]: missing privilege "proc_owner" (euid = 50, syscall = 5) needed at secpolicy_proc_access+0x20

NOTICE: dbus-daemon[2299]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c

NOTICE: e1000g0 unregistered
pseudo-device: devinfo0
devinfo0 is /pseudo/devinfo@0
NOTICE: dbus-daemon[2299]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c

NOTICE: Xorg[2073]: missing privilege "ALL" (euid = 2903, syscall = 6) needed at zfs_zaccess+0x169

NOTICE: Xorg[2073]: missing privilege "ALL" (euid = 2903, syscall = 6) needed at zfs_zaccess+0x169

NOTICE: Xorg[2073]: missing privilege "file_owner" (euid = 2903, syscall = 6) for "modify file times" needed at secpolicy_vnode_utime_modify+0x24

NOTICE: Xorg[2073]: missing privilege "file_dac_search" (euid = 2903, syscall = 88) needed at zfs_zaccess+0x169

NOTICE: e1000g0 registered
NOTICE: softmac1000 unregistered
cpu_t::cpu_runrun @ 0xd8 (216)
cpu_t::cpu_kprunrun @ 0xd9 (217)
kthread_t::t_preempt @ 0x2a (42)
kthread_t::t_did @ 0x110 (272)
kthread_t::t_intr @ 0x100 (256)
kthread_t::t_lockp @ 0x1f8 (504)
kthread_t::t_procp @ 0x190 (400)
tsc::mode Invariant @ tentative 3191770797 Hz
pseudo-device: vboxdrv0
vboxdrv0 is /pseudo/vboxdrv@0
vboxnet0: vboxnet: type "ether" mac address 08:00:27:7e:95:f2
pseudo-device: vboxnet0
vboxnet0 is /pseudo/vboxnet@0
pseudo-device: vboxnet0
vboxnet0 is /pseudo/vboxnet@0
NOTICE: softmac1007 registered
NOTICE: softmac1007 link up, 1000 Mbps, full duplex
pseudo-device: vboxflt0               
vboxflt0 is /pseudo/vboxflt@0
pseudo-device: vboxusbmon0
vboxusbmon0 is /pseudo/vboxusbmon@0
ISA-device: ecpp0
ecpp0 is /pci@0,0/isa@1f/lp@1,378
ISA-device: asy0
asy0 is /pci@0,0/isa@1f/asy@1,3f8
pseudo-device: dcpc0
dcpc0 is /pseudo/dcpc@0
pseudo-device: fbt0
fbt0 is /pseudo/fbt@0
pseudo-device: fcp0
fcp0 is /pseudo/fcp@0
pseudo-device: fcsm0
fcsm0 is /pseudo/fcsm@0
pseudo-device: fct0
fct0 is /pseudo/fct@0
pseudo-device: llc10
llc10 is /pseudo/llc1@0
pseudo-device: lockstat0
lockstat0 is /pseudo/lockstat@0
pseudo-device: lofi0
lofi0 is /pseudo/lofi@0
pseudo-device: profile0
profile0 is /pseudo/profile@0
pseudo-device: ramdisk1024
ramdisk1024 is /pseudo/ramdisk@1024
pseudo-device: sdt0
sdt0 is /pseudo/sdt@0
pseudo-device: stmf0
stmf0 is /pseudo/stmf@0
pseudo-device: systrace0
systrace0 is /pseudo/systrace@0
pseudo-device: ucode0
ucode0 is /pseudo/ucode@0
pseudo-device: bpf0
bpf0 is /pseudo/bpf@0
pseudo-device: fssnap0
fssnap0 is /pseudo/fssnap@0
IP Filter: v4.1.9, running.
pseudo-device: nsmb0
nsmb0 is /pseudo/nsmb@0
pseudo-device: pm0
pm0 is /pseudo/pm@0
pseudo-device: winlock0
winlock0 is /pseudo/winlock@0
pseudo-device: signalfd0
signalfd0 is /pseudo/signalfd@0
pseudo-device: timerfd0
timerfd0 is /pseudo/timerfd@0
vboxdrv: fffffffff8ce6020 VMMR0.r0
vboxdrv: fffffffff848a020 VBoxDDR0.r0
vboxdrv: fffffffff8a86020 VBoxDD2R0.r0

panic[cpu0]/thread=ffffff02d5860040: 
BAD TRAP: type=e (#pf Page fault) rp=ffffff0012579940 addr=fffffd7ffb2ff660 occurred in module "<unknown>" due to an illegal access to a user address

VirtualBox: 
#pf Page fault
Bad kernel fault at addr=0xfffffd7ffb2ff660
pid=3391, pc=0xfffffffff8d0a225, sp=0xffffff0012579a30, eflags=0x10202
cr0: 80050033<pg,wp,ne,et,mp,pe> cr4: 3406f8<smap,smep,osxsav,xmme,fxsr,pge,mce,pae,pse,de>
cr2: fffffd7ffb2ff660
cr3: 18dc04000
cr8: 0

        rdi: fffffd7ffb2ff320 rsi:                1 rdx:               34
        rcx: ffffff0012579a4f  r8:              340  r9:               34
        rax:               1a rbx:                0 rbp: ffffff0012579a60
        r10:                0 r11:               1a r12: ffffff02fc11f000
        r13: ffffff02fc11f000 r14:                0 r15:                0
        fsb: fffffd7fff106a40 gsb: fffffffffbc32620  ds:               4b
         es:               4b  fs:                0  gs:                0
        trp:                e err:                1 rip: fffffffff8d0a225
         cs:               30 rfl:            10202 rsp: ffffff0012579a30
         ss:               38

ffffff0012579820 unix:die+df ()
ffffff0012579930 unix:trap+1490 ()
ffffff0012579940 unix:_cmntrap+e6 ()
ffffff0012579a60 fffffffff8d0a225 ()
ffffff0012579ab0 fffffffff8d08bf8 ()
ffffff0012579ad0 fffffffff8d933d7 ()
ffffff0012579b10 fffffffff8d091ea ()
ffffff0012579c10 vboxdrv:supdrvIOCtl+17fd ()
ffffff0012579cc0 vboxdrv:VBoxDrvSolarisIOCtl+36c ()
ffffff0012579d00 genunix:cdev_ioctl+39 ()
ffffff0012579d50 specfs:spec_ioctl+60 ()
ffffff0012579de0 genunix:fop_ioctl+55 ()
ffffff0012579f00 genunix:ioctl+9b ()
ffffff0012579f10 unix:brand_sys_syscall+1f5 ()

syncing file systems...
 done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
NOTICE: ahci0: ahci_tran_reset_dport port 0 reset port

> ::cpuinfo -v
 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  0 fffffffffbc3cf20  1b    0    0  59   no    no t-1    ffffff02d5860040 VirtualBox
                       |    
            RUNNING <--+    
              READY         
             EXISTS         
             ENABLE         

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  1 ffffff02d4b3b580  1f    0    0  -1   no    no t-0    ffffff000f4f2c40 (idle)
                       |    
            RUNNING <--+    
              READY         
           QUIESCED         
             EXISTS         
             ENABLE         

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  2 ffffff02d4b35500  1f    0    0  -1   no    no t-0    ffffff000f61ec40 (idle)
                       |    
            RUNNING <--+    
              READY         
           QUIESCED         
             EXISTS         
             ENABLE         

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  3 ffffff02d4b34000  1f    1    0  -1   no    no t-0    ffffff000f6dcc40 (idle)
                       |    |
            RUNNING <--+    +-->  PRI THREAD           PROC
              READY                60 ffffff00104e7c40 sched
           QUIESCED         
             EXISTS         
             ENABLE

> ::panicinfo
             cpu                0
          thread ffffff02d5860040
         message BAD TRAP: type=e (#pf Page fault) rp=ffffff0012579940 addr=fffffd7ffb2ff660 occurred in module "<unknown>" due to an illegal access to a user address
             rdi fffffd7ffb2ff320
             rsi                1
             rdx               34
             rcx ffffff0012579a4f
              r8              340
              r9               34
             rax               1a
             rbx                0
             rbp ffffff0012579a60
             r10                0
             r11               1a
             r12 ffffff02fc11f000
             r13 ffffff02fc11f000
             r14                0
             r15                0
          fsbase fffffd7fff106a40
          gsbase fffffffffbc32620
              ds               4b
              es               4b
              fs                0
              gs                0
          trapno                e
             err                1
             rip fffffffff8d0a225
              cs               30
          rflags            10202
             rsp ffffff0012579a30
              ss               38
          gdt_hi                0
          gdt_lo         e00001ef
          idt_hi                0
          idt_lo         d0000fff
             ldt                0
            task               70
             cr0         80050033
             cr2 fffffd7ffb2ff660
             cr3        18dc04000
             cr4           3406f8

The system is brand new (I had to add extra graphics, network and USB 2.0 PCIe cards to get OI working). Here are extracts from the 'dmidecode' utility:

BIOS Information
        Vendor: American Megatrends Inc.
        Version: P1.20
        Release Date: 09/03/2015
        Characteristics:
                PCI is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                BIOS ROM is socketed
                EDD is supported
                5.25"/1.2 MB floppy services are supported (int 13h)
                3.5"/720 kB floppy services are supported (int 13h)
                3.5"/2.88 MB floppy services are supported (int 13h)
                Print screen service is supported (int 5h)
                8042 keyboard services are supported (int 9h)
                Serial services are supported (int 14h)
                Printer services are supported (int 17h)
                ACPI is supported
                USB legacy is supported
                BIOS boot specification is supported
                Targeted content distribution is supported
                UEFI is supported

Base Board Information
        Manufacturer: ASRock
        Product Name: B150M Pro4S

Processor Information
        Socket Designation: CPUSocket
        Type: Central Processor
        Family: Core i5
        Manufacturer: Intel(R) Corporation
        ID: E3 06 05 00 FF FB EB BF
        Signature: Type 0, Family 6, Model 94, Stepping 3
        Flags:
                FPU (Floating-point unit on-chip)
                VME (Virtual mode extension)
                DE (Debugging extension)
                PSE (Page size extension)
                TSC (Time stamp counter)
                MSR (Model specific registers)
                PAE (Physical address extension)
                MCE (Machine check exception)
                CX8 (CMPXCHG8 instruction supported)
                APIC (On-chip APIC hardware supported)
                SEP (Fast system call)
                MTRR (Memory type range registers)
                PGE (Page global enable)
                MCA (Machine check architecture)
                CMOV (Conditional move instruction supported)
                PAT (Page attribute table)
                PSE-36 (36-bit page size extension)
                CLFSH (CLFLUSH instruction supported)
                DS (Debug store)
                ACPI (ACPI supported)
                MMX (MMX technology supported)
                FXSR (FXSAVE and FXSTOR instructions supported)
                SSE (Streaming SIMD extensions)
                SSE2 (Streaming SIMD extensions 2)
                SS (Self-snoop)
                HTT (Multi-threading)
                TM (Thermal monitor supported)
                PBE (Pending break enabled)
        Version: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz

Memory Device
        Array Handle: 0x0010
        Error Information Handle: Not Provided
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 8192 MB
        Form Factor: DIMM
        Set: None
        Locator: ChannelA-DIMM1
        Bank Locator: BANK 1
        Type: DDR4
        Type Detail: Synchronous
        Speed: 2133 MHz
        Manufacturer: 1315 

OI /hipster is installed on the SSD, whilst the HD is used for user HOME directories:
AVAILABLE DISK SELECTIONS:
       0. c2t0d0 <Samsung-SSD 850 EVO 250GB-EMT01B6Q cyl 28078 alt 2 hd 224 sec 56>
          /pci@0,0/pci1849,a102@17/disk@0,0
       1. c2t1d0 <ATA-ST1000DX001-1NS1-CC41 cyl 60797 alt 2 hd 255 sec 126>
          /pci@0,0/pci1849,a102@17/disk@1,0

I have attached a screenshot from the "device driver utility".

I know that this HW is not covered by any HCL, but I want to stress: it worked like a charm (very fast, and supporting several VMs at the same time) UNTIL some changes were introduced in recent kernel versions or in the OI structure.
To me it is obvious that some change causes the crash when the VB guest tries to load a module (most suspicious is the appearance of e1000g, which does not show up in the current kernel version). This is from the currently running (and well-operating) version:

$ scanpci 

pci bus 0x0000 cardnum 0x00 function 0x00: vendor 0x8086 device 0x191f
 Intel Corporation Sky Lake Host Bridge/DRAM Registers

pci bus 0x0000 cardnum 0x01 function 0x00: vendor 0x8086 device 0x1901
 Intel Corporation Sky Lake PCIe Controller (x16)

pci bus 0x0001 cardnum 0x00 function 0x00: vendor 0x10de device 0x0605
 NVIDIA Corporation G92 [GeForce 9800 GT]

pci bus 0x0000 cardnum 0x14 function 0x00: vendor 0x8086 device 0xa12f
 Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller

pci bus 0x0000 cardnum 0x14 function 0x02: vendor 0x8086 device 0xa131
 Intel Corporation Sunrise Point-H Thermal subsystem

pci bus 0x0000 cardnum 0x16 function 0x00: vendor 0x8086 device 0xa13a
 Intel Corporation Sunrise Point-H CSME HECI #1

pci bus 0x0000 cardnum 0x17 function 0x00: vendor 0x8086 device 0xa102
 Intel Corporation Device unknown

pci bus 0x0000 cardnum 0x1c function 0x00: vendor 0x8086 device 0xa114
 Intel Corporation Sunrise Point-H PCI Express Root Port #5

pci bus 0x0002 cardnum 0x00 function 0x00: vendor 0x12d8 device 0xe111
 Pericom Semiconductor PI7C9X111SL PCIe-to-PCI Reversible Bridge

pci bus 0x0003 cardnum 0x04 function 0x00: vendor 0x1106 device 0x3038
 VIA Technologies, Inc. VT82xx/62xx UHCI USB 1.1 Controller

pci bus 0x0003 cardnum 0x04 function 0x01: vendor 0x1106 device 0x3038
 VIA Technologies, Inc. VT82xx/62xx UHCI USB 1.1 Controller

pci bus 0x0003 cardnum 0x04 function 0x02: vendor 0x1106 device 0x3104
 VIA Technologies, Inc. USB 2.0

pci bus 0x0000 cardnum 0x1d function 0x00: vendor 0x8086 device 0xa118
 Intel Corporation Sunrise Point-H PCI Express Root Port #9

pci bus 0x0000 cardnum 0x1d function 0x03: vendor 0x8086 device 0xa11b
 Intel Corporation Sunrise Point-H PCI Express Root Port #12

pci bus 0x0005 cardnum 0x00 function 0x00: vendor 0x10ec device 0x8168
 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller

pci bus 0x0000 cardnum 0x1f function 0x00: vendor 0x8086 device 0xa148
 Intel Corporation Sunrise Point-H LPC Controller

pci bus 0x0000 cardnum 0x1f function 0x02: vendor 0x8086 device 0xa121
 Intel Corporation Sunrise Point-H PMC

pci bus 0x0000 cardnum 0x1f function 0x03: vendor 0x8086 device 0xa170
 Intel Corporation Sunrise Point-H HD Audio

pci bus 0x0000 cardnum 0x1f function 0x04: vendor 0x8086 device 0xa123
 Intel Corporation Sunrise Point-H SMBus

pci bus 0x0000 cardnum 0x1f function 0x06: vendor 0x8086 device 0x15b8
 Intel Corporation Ethernet Connection (2) I219-V

Not sure what else I can paste here...

Regards.


Files

DDU.png (79 KB) DDU.png device driver utility screen snapshot Predrag Zečević, 2016-04-18 09:37 AM

Related issues

Has duplicate OpenIndiana Distribution - Bug #10602: Virtualbox crashes the system (Resolved, 2019-03-28)

History

#1

Updated by Nikola M. almost 4 years ago

VirtualBox people say it looks like a 'Supervisor Mode Access Prevention' (SMAP) problem (https://en.wikipedia.org/wiki/Supervisor_Mode_Access_Prevention). VirtualBox disables SMAP only for Solaris builds (SMAP support works on Linux, OSX), because Oracle Solaris disables SMAP.

So either illumos after illumos-a526879 needs an option to disable SMAP for VirtualBox compatibility,
or VirtualBox needs to handle SMAP in its Solaris build when that build runs on illumos; VirtualBox doesn't have an illumos-specific build.
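
The register dump in the description looks consistent with that reading. As a rough check (a sketch, not something posted by the VirtualBox or illumos developers), the snippet below decodes the cr4 and err values from that panic. The bit positions are the ones documented in the Intel SDM (CR4.SMEP is bit 20, CR4.SMAP is bit 21; page-fault error code bit 0 = present, bit 1 = write, bit 2 = user mode). The helper name cr4_bit is made up for illustration.

```shell
#!/bin/sh
# Decode the two values from the panic output that point at SMAP.
cr4=0x3406f8   # from "cr4: 3406f8" in the panic output
err=0x1        # from "err: 1" in the register dump

cr4_bit() {    # usage: cr4_bit <cr4-value> <bit-number>; prints 0 or 1
    echo $(( ($1 >> $2) & 1 ))
}

echo "SMEP enabled: $(cr4_bit "$cr4" 20)"
echo "SMAP enabled: $(cr4_bit "$cr4" 21)"

# Page-fault error code: bit 0 = page present, bit 1 = write, bit 2 = user mode.
echo "present=$(( err & 1 )) write=$(( (err >> 1) & 1 )) user=$(( (err >> 2) & 1 ))"
```

With the values from this dump, both SMEP and SMAP come out as enabled, and the error code decodes to present=1 write=0 user=0: a kernel-mode read of a page that is mapped. That is what an SMAP violation would be expected to look like, as opposed to a fault on an unmapped page.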

Proposed workaround from tsoome:
disable_smap/W1 in mdb -k , disable_smap is int, so /W should be ok (to write 4byte value)

Please try it and report whether it worked, so that we know if it is truly an SMAP issue.

#2

Updated by Predrag Zečević almost 4 years ago

It was working (and still does) in older kernel...

What has been changed?

I will give it a try during day and report here (will also use new /hipster repository).

Searching with Google, I found that this might be the proper command:

echo disable_smap/W1 | mdb -kw

Thanks for investigation and update.
Regards.

#3

Updated by Predrag Zečević almost 4 years ago

So, tried - but no luck!

Booted into:

[2016-04-19 15:56:35] solarix genunix: [ID 540533 kern.notice] #015SunOS Release 5.11 Version illumos-380fd67 64-bit

Disabled SMAP:
$ echo disable_smap/W1 | pfexec mdb -kw
disable_smap:   0               =       0x1

Started a VB guest:
[2016-04-19 15:56:39] solarix savecore: [ID 188441 auth.error] Decompress the crash dump with #012'savecore -vf /export/tmp/crash/vmdump.5'
[2016-04-19 15:56:39] solarix fmd: [ID 377184 daemon.error] SUNW-MSG-ID: SUNOS-8000-KL, TYPE: Defect, VER: 1, SEVERITY: Major#012EVENT-TIME: Tue Apr 19 15:56:39 CEST 2016#012PLATFORM: To-Be-Filled-By-O.E.M., CSN: -, HOSTNAME: solarix#012SOURCE: software-diagnosis, REV: 0.1#012EVENT-ID: 8fc710d4-175b-c682-accb-b865aaaf19dd#012DESC: The system has rebooted after a kernel panic.  Refer to http://illumos.org/msg/SUNOS-8000-KL for more information.#012AUTO-RESPONSE: The failed system image was dumped to the dump device.  If savecore is enabled (see dumpadm(1M)) a copy of the dump will be written to the savecore directory /export/tmp/crash.#012IMPACT: There may be some performance impact while the panic is copied to the savecore directory.  Disk space usage by panics can be substantial.#012REC-ACTION: If savecore is not enabled then please take steps to preserve the crash image.#012Use 'fmdump -Vp -u 8fc710d4-175b-c682-accb-b865aaaf19dd' to view more panic detail.  Please refer to the knowledge article for additional information.

So, identical problem...

Regards.

#4

Updated by Robert Mustacchi almost 4 years ago

Using mdb that way to toggle SMAP is far too late. You'll want to put it in /etc/system and reboot.

However, to make sure that I understand what's going on, are you trying to run virtual box on an illumos system with illumos as a host or are you running illumos inside of virtualbox?

#5

Updated by Robert Mustacchi almost 4 years ago

Also, in those addresses in the panic, what does whatis say they are? Where's the source code to this vbox driver in question?
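
For reference, a minimal sketch of how that could be asked of the saved dump. It assumes the addresses meant are the four unresolved frames in the ::stack output of the description; the helper name whatis_cmds is hypothetical.

```shell
#!/bin/sh
# The four unresolved frames from the ::stack output in the description.
frames="fffffffff8d0a225 fffffffff8d08bf8 fffffffff8d933d7 fffffffff8d091ea"

whatis_cmds() {            # print one "<addr>::whatis" dcmd per argument
    for addr in "$@"; do
        printf '%s::whatis\n' "$addr"
    done
}

# Word splitting of $frames is intentional: one argument per address.
whatis_cmds $frames
```

The printed lines are mdb dcmds. Piping them into the same invocation used in the description (mdb -k unix.1 vmcore.1, run from the crash directory) should report what each address belongs to.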

#6

Updated by Predrag Zečević almost 4 years ago

Hi Robert,

illumos (OI /hipster) as of 29.03.2016 works; with any later OS update, any type of VB guest (I use Win and Linux) crashes the host...

BTW. What would be /etc/system syntax for disable_smap then?

Thanks and regards.

#7

Updated by Nikola M. almost 4 years ago

Please try putting in /etc/system:

set disable_smap=1

and see how it does with the newer illumos.
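
Since the setting is only read at boot, it can help to confirm that the line really is in the file before rebooting. A small sketch of such a check follows; the function name has_disable_smap is hypothetical, and the pattern assumes the usual /etc/system layout in which lines starting with "*" are comments.

```shell
#!/bin/sh
# Report whether an /etc/system-style file carries the tunable.
# Comment lines (starting with "*") do not match because the pattern is anchored.
has_disable_smap() {       # usage: has_disable_smap <file>
    grep -Eq '^[[:space:]]*set[[:space:]]+disable_smap[[:space:]]*=[[:space:]]*1' "$1" 2>/dev/null
}

if has_disable_smap /etc/system; then
    echo "disable_smap=1 is set; it takes effect from the next boot"
else
    echo "disable_smap=1 not found in /etc/system"
fi
```

After the reboot, reading the variable from the live kernel (for example with echo disable_smap/D | pfexec mdb -k) should then show 1. That read-only check is an assumption based on ordinary mdb usage, not something tried in this ticket.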

#8

Updated by Predrag Zečević almost 4 years ago

Bingo!

That has definitely helped!

Let me try to use system for a while, before closing ticket.
Many thanks :-)

Regards.

#9

Updated by Robert Mustacchi almost 4 years ago

It'd still be useful to get the ::whatis output on those addresses so we can confirm that the issue is that the vbox kernel module isn't using DDI based methods to try and access user data.
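
Until that output is available, a rough guess at the module can be made from the vboxdrv load addresses printed in ::msgbuf in the description. The sketch below picks the module whose load address is the nearest one below the faulting pc. It is only an approximation: module sizes are not in the dump output quoted here, so this is a hint rather than a confirmation, and the helper name nearest_module is made up for illustration.

```shell
#!/bin/sh
# Low 32 bits only: every address here shares the upper half ffffffff,
# and dropping it keeps the arithmetic clear of signed 64-bit overflow.
pc=0xf8d0a225              # rip from ::panicinfo in the description
modules="VMMR0.r0=0xf8ce6020 VBoxDDR0.r0=0xf848a020 VBoxDD2R0.r0=0xf8a86020"

nearest_module() {         # usage: nearest_module <pc> <name=base>...
    target=$1; shift
    best_name=""; best_off=-1
    for m in "$@"; do
        name=${m%%=*}; base=${m#*=}
        off=$(( target - base ))
        # keep the module with the smallest non-negative offset
        if [ "$off" -ge 0 ] && { [ "$best_off" -lt 0 ] || [ "$off" -lt "$best_off" ]; }; then
            best_name=$name; best_off=$off
        fi
    done
    printf '%s+0x%x\n' "$best_name" "$best_off"
}

# Word splitting of $modules is intentional: one argument per module.
nearest_module "$pc" $modules
```

For the first dump this points at VMMR0.r0, with the pc 0x24205 bytes past its load address, which would place the faulting access in VirtualBox's ring-0 code rather than in vboxdrv itself.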

#10

Updated by Predrag Zečević almost 4 years ago

I am no mdb expert, but here is some data:

$ mdb -k unix.5 vmcore.5
Loading modules: [ unix genunix specfs dtrace mac cpu.generic uppc apix scsi_vhci zfs sata sd ip hook neti sockfs arp usba uhci fctl stmf stmf_sbd mm idm cpc crypto md fcip fcp random lofs ufs logindmux nsmb ptm smbsrv nfs sppp ipc ]
> ::cpuinfo -v
 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  0 fffffffffbc32620  1f    0    0  -1   no    no t-0    ffffff000f205c40 (idle)
                       |    
            RUNNING <--+    
              READY         
           QUIESCED         
             EXISTS         
             ENABLE         

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  1 fffffffffbc3cf20  1b    0    0  59   no    no t-1    ffffff02df0f0440 VirtualBox
                       |    
            RUNNING <--+    
              READY         
             EXISTS         
             ENABLE         

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  2 ffffff02d791f080  1f    0    0  -1   no    no t-0    ffffff000f64bc40 (idle)
                       |    
            RUNNING <--+    
              READY         
           QUIESCED         
             EXISTS         
             ENABLE         

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  3 ffffff02d7a72540  1f    1    0  -1   no    no t-0    ffffff000f8d7c40 (idle)
                       |    |
            RUNNING <--+    +-->  PRI THREAD           PROC
              READY                60 ffffff000f313c40 sched
           QUIESCED         
             EXISTS         
             ENABLE         

> fffffffffbc3cf20::whatis
fffffffffbc3cf20 is panic_cpu, in unix's data segment
> ffffff000f64bc40::whatis
ffffff000f64bc40 is allocated as a thread structure
> ::panicinfo
             cpu                1
          thread ffffff02df0f0440
         message BAD TRAP: type=e (#pf Page fault) rp=ffffff00127ff940 addr=fffffd7ffcbaf660 occurred in module "<unknown>" due to an illegal access to a user address
             rdi fffffd7ffcbaf320
             rsi                1
             rdx               34
             rcx ffffff00127ffa4f
              r8              340
              r9               34
             rax               1a
             rbx                0
             rbp ffffff00127ffa60
             r10                0
             r11               1a
             r12 ffffff030534b000
             r13 ffffff030534b000
             r14                0
             r15                0
          fsbase fffffd7fff106a40
          gsbase ffffff02d7920580
              ds               4b
              es               4b
              fs                0
              gs                0
          trapno                e
             err                1
             rip fffffffff8cdd225
              cs               30
          rflags            10202
             rsp ffffff00127ffa30
              ss               38
          gdt_hi                0
          gdt_lo         700001ef
          idt_hi                0
          idt_lo         d0000fff
             ldt                0
            task               70
             cr0         80050033
             cr2 fffffd7ffcbaf660
             cr3        188e4e000
             cr4           3406f8
> ffffff02df0f0440::whatis
ffffff02df0f0440 is allocated as a thread structure
> fffffd7ffcbaf660::whatis
fffffd7ffcbaf660 is unknown
> 

If there is more I can do (or paste), let me know (together with instructions) what exactly is important here.

Thanks and regards.

#11

Updated by Predrag Zečević almost 4 years ago

Hi all,

Disabling SMAP did the job (see above for the required setting in the /etc/system file).

Tested for a few days, with one or more VB guests: no problems so far.

Many thanks to everybody involved in solving this problem.
With best regards.

#12

Updated by Nikola M. almost 4 years ago

Added on-wiki VirtualBox instructions:
http://wiki.openindiana.org/oi/7.2+VirtualBox
('set disable_smap=1' in /etc/system)

#13

Updated by Aurélien Larcher over 2 years ago

  • Status changed from New to Resolved

#14

Updated by Gergő Mihály Doma 10 months ago

  • Has duplicate Bug #10602: Virtualbox crashes the system added
