Bug #6919
Closed
VirtualBox guest start crashes OpenIndiana host (/hipster release)
Description
Hi all,
Current Illumos version:
$ uname -rosv
SunOS 5.11 illumos-258e862 Solaris
runs VB guests without problems. As soon as I upgrade to anything released later (e.g. SunOS Release 5.11 Version illumos-a526879 64-bit) and start a VB guest, the whole system freezes and reboots (it runs fine as long as no VB guest is running).
Syslog shows:
[2016-04-15 15:58:39] solarix savecore: [ID 570001 auth.error] reboot after panic: BAD TRAP: type=e (#pf Page fault) rp=ffffff0012579940 addr=fffffd7ffb2ff660 occurred in module "<unknown>" due to an illegal access to a user address
[2016-04-15 15:58:39] solarix savecore: [ID 105003 auth.error] Saving compressed system crash dump in /export/tmp/crash/vmdump.1
[2016-04-15 15:58:39] solarix genunix: [ID 864859 kern.notice] NOTICE: fmd[987]: missing privilege "sys_mount" (euid = 0, syscall = 255) needed at secpolicy_fs_owner+0x2e#012
[2016-04-15 15:58:39] solarix genunix: [ID 864859 kern.notice] NOTICE: hald-addon-stora[1171]: missing privilege "sys_mount" (euid = 0, syscall = 255) needed at secpolicy_fs_owner+0x2e#012
[2016-04-15 15:58:39] solarix mac: [ID 435574 kern.info] NOTICE: rge0 link up, 1000 Mbps, full duplex
[2016-04-15 15:58:40] solarix /sbin/dhcpagent[124]: [ID 778557 daemon.warning] configure_v4_lease: no IP broadcast specified for rge0, making best guess
[2016-04-15 15:58:40] solarix in.routed[885]: [ID 847162 daemon.error] 4 bytes of routing message left over
[2016-04-15 15:58:40] solarix in.routed[885]: [ID 606928 daemon.warning] sendto(rge0, 224.0.0.2): Network is unreachable
[2016-04-15 15:58:41] solarix ipf: [ID 774698 kern.info] IP Filter: v4.1.9, running.
[2016-04-15 15:58:41] solarix /sbin/dhcpagent[124]: [ID 732872 daemon.error] dhcp_bound_complete: cannot add default router 192.168.222.1 on rge0: File exists
[2016-04-15 15:58:43] solarix savecore: [ID 606186 auth.error] Decompress the crash dump with #012'savecore -vf /export/tmp/crash/vmdump.1'
[2016-04-15 15:58:43] solarix fmd: [ID 377184 daemon.error] SUNW-MSG-ID: SUNOS-8000-KL, TYPE: Defect, VER: 1, SEVERITY: Major#012EVENT-TIME: Fri Apr 15 15:58:43 CEST 2016#012PLATFORM: To-Be-Filled-By-O.E.M., CSN: -, HOSTNAME: solarix#012SOURCE: software-diagnosis, REV: 0.1#012EVENT-ID: e701b777-689f-6be0-952a-9e382a0c23e5#012DESC: The system has rebooted after a kernel panic. Refer to http://illumos.org/msg/SUNOS-8000-KL for more information.#012AUTO-RESPONSE: The failed system image was dumped to the dump device. If savecore is enabled (see dumpadm(1M)) a copy of the dump will be written to the savecore directory /export/tmp/crash.#012IMPACT: There may be some performance impact while the panic is copied to the savecore directory. Disk space usage by panics can be substantial.#012REC-ACTION: If savecore is not enabled then please take steps to preserve the crash image.#012Use 'fmdump -Vp -u e701b777-689f-6be0-952a-9e382a0c23e5' to view more panic detail. Please refer to the knowledge article for additional information.
Moreover:
- http://illumos.org/msg/SUNOS-8000-KL is an empty page
- fmdump -Vp -u e701b777-689f-6be0-952a-9e382a0c23e5 returned nothing
So I ran:
$ savecore -vf /export/tmp/crash/vmdump.1
savecore: System dump time: Fri Apr 15 15:57:11 2016
savecore: saving system crash dump in /export/tmp/crash/{unix,vmcore}.1
Constructing namelist /export/tmp/crash/unix.1
Constructing corefile /export/tmp/crash/vmcore.1
0:02 100% done: 249563 of 249563 pages saved
5814 (2%) zero pages were not written
0:02 dump decompress is done
and tried to analyze the dump with the debugger:
$ mdb -k unix.1 vmcore.1
mdb: warning: dump is from SunOS 5.11 illumos-a526879; dcmds and macros may not match kernel implementation
Loading modules: [ unix genunix specfs dtrace mac cpu.generic uppc apix scsi_vhci zfs sata sd ip hook neti sockfs arp usba uhci fctl stmf stmf_sbd idm cpc crypto md fcip fcp random lofs ufs logindmux nsmb ptm smbsrv nfs sppp ipc ]
> ::status
debugging crash dump vmcore.1 (64-bit) from solarix
operating system: 5.11 illumos-a526879 (i86pc)
image uuid: e701b777-689f-6be0-952a-9e382a0c23e5
panic message: BAD TRAP: type=e (#pf Page fault) rp=ffffff0012579940 addr=fffffd7ffb2ff660 occurred in module "<unknown>" due to an illegal access to a user address
dump content: kernel pages only
> ::stack
0xfffffffff8d0a225()
0xfffffffff8d08bf8()
0xfffffffff8d933d7()
0xfffffffff8d091ea()
supdrvIOCtl+0x17fd()
VBoxDrvSolarisIOCtl+0x36c()
cdev_ioctl+0x39(11f00000002, c0185687, fffffd7ffcdfed10, 202003, ffffff02f36321c8, ffffff0012579ea8)
spec_ioctl+0x60(ffffff02f6798940, c0185687, fffffd7ffcdfed10, 202003, ffffff02f36321c8, ffffff0012579ea8)
fop_ioctl+0x55(ffffff02f6798940, c0185687, fffffd7ffcdfed10, 202003, ffffff02f36321c8, ffffff0012579ea8)
ioctl+0x9b(11, ffffffffc0185687, fffffd7ffcdfed10)
sys_syscall+0x17a()
> ::msgbuf
MESSAGE
NOTICE: dbus-daemon[2319]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c
NOTICE: dbus-daemon[2299]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c
NOTICE: dbus-daemon[2319]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c
NOTICE: dbus-daemon[2299]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c
NOTICE: dbus-daemon[2319]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c
NOTICE: dbus-daemon[2299]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c
NOTICE: dbus-daemon[2299]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c
NOTICE: metacity[2314]: missing privilege "proc_owner" (euid = 50, syscall = 5) needed at secpolicy_proc_access+0x20
NOTICE: metacity[2314]: missing privilege "proc_owner" (euid = 50, syscall = 5) needed at secpolicy_proc_access+0x20
NOTICE: dbus-daemon[2299]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c
NOTICE: dbus-daemon[2299]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c
NOTICE: metacity[2314]: missing privilege "proc_owner" (euid = 50, syscall = 5) needed at secpolicy_proc_access+0x20
NOTICE: metacity[2314]: missing privilege "proc_owner" (euid = 50, syscall = 5) needed at secpolicy_proc_access+0x20
NOTICE: dbus-daemon[2299]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c
NOTICE: e1000g0 unregistered
pseudo-device: devinfo0
devinfo0 is /pseudo/devinfo@0
NOTICE: dbus-daemon[2299]: missing privilege "proc_audit" (euid = 50, syscall = 186) needed at secpolicy_audit_getattr+0x4c
NOTICE: Xorg[2073]: missing privilege "ALL" (euid = 2903, syscall = 6) needed at zfs_zaccess+0x169
NOTICE: Xorg[2073]: missing privilege "ALL" (euid = 2903, syscall = 6) needed at zfs_zaccess+0x169
NOTICE: Xorg[2073]: missing privilege "file_owner" (euid = 2903, syscall = 6) for "modify file times" needed at secpolicy_vnode_utime_modify+0x24
NOTICE: Xorg[2073]: missing privilege "file_dac_search" (euid = 2903, syscall = 88) needed at zfs_zaccess+0x169
NOTICE: e1000g0 registered
NOTICE: softmac1000 unregistered
cpu_t::cpu_runrun @ 0xd8 (216)
cpu_t::cpu_kprunrun @ 0xd9 (217)
kthread_t::t_preempt @ 0x2a (42)
kthread_t::t_did @ 0x110 (272)
kthread_t::t_intr @ 0x100 (256)
kthread_t::t_lockp @ 0x1f8 (504)
kthread_t::t_procp @ 0x190 (400)
tsc::mode Invariant @ tentative 3191770797 Hz
pseudo-device: vboxdrv0
vboxdrv0 is /pseudo/vboxdrv@0
vboxnet0: vboxnet: type "ether" mac address 08:00:27:7e:95:f2
pseudo-device: vboxnet0
vboxnet0 is /pseudo/vboxnet@0
pseudo-device: vboxnet0
vboxnet0 is /pseudo/vboxnet@0
NOTICE: softmac1007 registered
NOTICE: softmac1007 link up, 1000 Mbps, full duplex
pseudo-device: vboxflt0
vboxflt0 is /pseudo/vboxflt@0
pseudo-device: vboxusbmon0
vboxusbmon0 is /pseudo/vboxusbmon@0
ISA-device: ecpp0
ecpp0 is /pci@0,0/isa@1f/lp@1,378
ISA-device: asy0
asy0 is /pci@0,0/isa@1f/asy@1,3f8
pseudo-device: dcpc0
dcpc0 is /pseudo/dcpc@0
pseudo-device: fbt0
fbt0 is /pseudo/fbt@0
pseudo-device: fcp0
fcp0 is /pseudo/fcp@0
pseudo-device: fcsm0
fcsm0 is /pseudo/fcsm@0
pseudo-device: fct0
fct0 is /pseudo/fct@0
pseudo-device: llc10
llc10 is /pseudo/llc1@0
pseudo-device: lockstat0
lockstat0 is /pseudo/lockstat@0
pseudo-device: lofi0
lofi0 is /pseudo/lofi@0
pseudo-device: profile0
profile0 is /pseudo/profile@0
pseudo-device: ramdisk1024
ramdisk1024 is /pseudo/ramdisk@1024
pseudo-device: sdt0
sdt0 is /pseudo/sdt@0
pseudo-device: stmf0
stmf0 is /pseudo/stmf@0
pseudo-device: systrace0
systrace0 is /pseudo/systrace@0
pseudo-device: ucode0
ucode0 is /pseudo/ucode@0
pseudo-device: bpf0
bpf0 is /pseudo/bpf@0
pseudo-device: fssnap0
fssnap0 is /pseudo/fssnap@0
IP Filter: v4.1.9, running.
pseudo-device: nsmb0
nsmb0 is /pseudo/nsmb@0
pseudo-device: pm0
pm0 is /pseudo/pm@0
pseudo-device: winlock0
winlock0 is /pseudo/winlock@0
pseudo-device: signalfd0
signalfd0 is /pseudo/signalfd@0
pseudo-device: timerfd0
timerfd0 is /pseudo/timerfd@0
vboxdrv: fffffffff8ce6020 VMMR0.r0
vboxdrv: fffffffff848a020 VBoxDDR0.r0
vboxdrv: fffffffff8a86020 VBoxDD2R0.r0

panic[cpu0]/thread=ffffff02d5860040:
BAD TRAP: type=e (#pf Page fault) rp=ffffff0012579940 addr=fffffd7ffb2ff660 occurred in module "<unknown>" due to an illegal access to a user address

VirtualBox:
#pf Page fault
Bad kernel fault at addr=0xfffffd7ffb2ff660
pid=3391, pc=0xfffffffff8d0a225, sp=0xffffff0012579a30, eflags=0x10202
cr0: 80050033<pg,wp,ne,et,mp,pe> cr4: 3406f8<smap,smep,osxsav,xmme,fxsr,pge,mce,pae,pse,de>
cr2: fffffd7ffb2ff660 cr3: 18dc04000 cr8: 0

rdi: fffffd7ffb2ff320 rsi: 1 rdx: 34
rcx: ffffff0012579a4f r8: 340 r9: 34
rax: 1a rbx: 0 rbp: ffffff0012579a60
r10: 0 r11: 1a r12: ffffff02fc11f000
r13: ffffff02fc11f000 r14: 0 r15: 0
fsb: fffffd7fff106a40 gsb: fffffffffbc32620 ds: 4b
es: 4b fs: 0 gs: 0
trp: e err: 1 rip: fffffffff8d0a225
cs: 30 rfl: 10202 rsp: ffffff0012579a30
ss: 38

ffffff0012579820 unix:die+df ()
ffffff0012579930 unix:trap+1490 ()
ffffff0012579940 unix:_cmntrap+e6 ()
ffffff0012579a60 fffffffff8d0a225 ()
ffffff0012579ab0 fffffffff8d08bf8 ()
ffffff0012579ad0 fffffffff8d933d7 ()
ffffff0012579b10 fffffffff8d091ea ()
ffffff0012579c10 vboxdrv:supdrvIOCtl+17fd ()
ffffff0012579cc0 vboxdrv:VBoxDrvSolarisIOCtl+36c ()
ffffff0012579d00 genunix:cdev_ioctl+39 ()
ffffff0012579d50 specfs:spec_ioctl+60 ()
ffffff0012579de0 genunix:fop_ioctl+55 ()
ffffff0012579f00 genunix:ioctl+9b ()
ffffff0012579f10 unix:brand_sys_syscall+1f5 ()

syncing file systems... done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
NOTICE: ahci0: ahci_tran_reset_dport port 0 reset port
> ::cpuinfo -v
 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  0 fffffffffbc3cf20  1b    0    0  59   no    no t-1    ffffff02d5860040 VirtualBox
                       |
            RUNNING <--+
              READY
             EXISTS
             ENABLE

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  1 ffffff02d4b3b580  1f    0    0  -1   no    no t-0    ffffff000f4f2c40 (idle)
                       |
            RUNNING <--+
              READY
           QUIESCED
             EXISTS
             ENABLE

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  2 ffffff02d4b35500  1f    0    0  -1   no    no t-0    ffffff000f61ec40 (idle)
                       |
            RUNNING <--+
              READY
           QUIESCED
             EXISTS
             ENABLE

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  3 ffffff02d4b34000  1f    1    0  -1   no    no t-0    ffffff000f6dcc40 (idle)
                       |    |
            RUNNING <--+    +-->  PRI THREAD           PROC
              READY                60 ffffff00104e7c40 sched
           QUIESCED
             EXISTS
             ENABLE

> ::panicinfo
cpu 0
thread ffffff02d5860040
message BAD TRAP: type=e (#pf Page fault) rp=ffffff0012579940 addr=fffffd7ffb2ff660 occurred in module "<unknown>" due to an illegal access to a user address
rdi fffffd7ffb2ff320
rsi 1
rdx 34
rcx ffffff0012579a4f
r8 340
r9 34
rax 1a
rbx 0
rbp ffffff0012579a60
r10 0
r11 1a
r12 ffffff02fc11f000
r13 ffffff02fc11f000
r14 0
r15 0
fsbase fffffd7fff106a40
gsbase fffffffffbc32620
ds 4b
es 4b
fs 0
gs 0
trapno e
err 1
rip fffffffff8d0a225
cs 30
rflags 10202
rsp ffffff0012579a30
ss 38
gdt_hi 0
gdt_lo e00001ef
idt_hi 0
idt_lo d0000fff
ldt 0
task 70
cr0 80050033
cr2 fffffd7ffb2ff660
cr3 18dc04000
cr4 3406f8
The system is brand new (I had to add extra graphics, network and USB 2.0 PCIe cards to get OI working). Here are extracts from the 'dmidecode' utility:
BIOS Information
  Vendor: American Megatrends Inc.
  Version: P1.20
  Release Date: 09/03/2015
  Characteristics:
    PCI is supported
    BIOS is upgradeable
    BIOS shadowing is allowed
    Boot from CD is supported
    Selectable boot is supported
    BIOS ROM is socketed
    EDD is supported
    5.25"/1.2 MB floppy services are supported (int 13h)
    3.5"/720 kB floppy services are supported (int 13h)
    3.5"/2.88 MB floppy services are supported (int 13h)
    Print screen service is supported (int 5h)
    8042 keyboard services are supported (int 9h)
    Serial services are supported (int 14h)
    Printer services are supported (int 17h)
    ACPI is supported
    USB legacy is supported
    BIOS boot specification is supported
    Targeted content distribution is supported
    UEFI is supported
Base Board Information
  Manufacturer: ASRock
  Product Name: B150M Pro4S
Processor Information
  Socket Designation: CPUSocket
  Type: Central Processor
  Family: Core i5
  Manufacturer: Intel(R) Corporation
  ID: E3 06 05 00 FF FB EB BF
  Signature: Type 0, Family 6, Model 94, Stepping 3
  Flags:
    FPU (Floating-point unit on-chip)
    VME (Virtual mode extension)
    DE (Debugging extension)
    PSE (Page size extension)
    TSC (Time stamp counter)
    MSR (Model specific registers)
    PAE (Physical address extension)
    MCE (Machine check exception)
    CX8 (CMPXCHG8 instruction supported)
    APIC (On-chip APIC hardware supported)
    SEP (Fast system call)
    MTRR (Memory type range registers)
    PGE (Page global enable)
    MCA (Machine check architecture)
    CMOV (Conditional move instruction supported)
    PAT (Page attribute table)
    PSE-36 (36-bit page size extension)
    CLFSH (CLFLUSH instruction supported)
    DS (Debug store)
    ACPI (ACPI supported)
    MMX (MMX technology supported)
    FXSR (FXSAVE and FXSTOR instructions supported)
    SSE (Streaming SIMD extensions)
    SSE2 (Streaming SIMD extensions 2)
    SS (Self-snoop)
    HTT (Multi-threading)
    TM (Thermal monitor supported)
    PBE (Pending break enabled)
  Version: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
Memory Device
  Array Handle: 0x0010
  Error Information Handle: Not Provided
  Total Width: 64 bits
  Data Width: 64 bits
  Size: 8192 MB
  Form Factor: DIMM
  Set: None
  Locator: ChannelA-DIMM1
  Bank Locator: BANK 1
  Type: DDR4
  Type Detail: Synchronous
  Speed: 2133 MHz
  Manufacturer: 1315
OI /hipster is installed to SSD, whilst HD is used for user HOME directories:
AVAILABLE DISK SELECTIONS:
  0. c2t0d0 <Samsung-SSD 850 EVO 250GB-EMT01B6Q cyl 28078 alt 2 hd 224 sec 56>
     /pci@0,0/pci1849,a102@17/disk@0,0
  1. c2t1d0 <ATA-ST1000DX001-1NS1-CC41 cyl 60797 alt 2 hd 255 sec 126>
     /pci@0,0/pci1849,a102@17/disk@1,0
I have attached a screenshot from the "device driver utility".
I know that this HW is not covered by any HCL, but I want to stress that it worked like a charm (very fast, running several VMs at the same time) UNTIL some changes were introduced in recent kernel versions or in the OI structure.
To me it is obvious that some change causes the crash when the VB guest tries to load a module (most suspicious is the appearance of e1000g, which does not show up in the current kernel version). This is from the currently running (and well-behaved) version:
$ scanpci
pci bus 0x0000 cardnum 0x00 function 0x00: vendor 0x8086 device 0x191f
 Intel Corporation Sky Lake Host Bridge/DRAM Registers
pci bus 0x0000 cardnum 0x01 function 0x00: vendor 0x8086 device 0x1901
 Intel Corporation Sky Lake PCIe Controller (x16)
pci bus 0x0001 cardnum 0x00 function 0x00: vendor 0x10de device 0x0605
 NVIDIA Corporation G92 [GeForce 9800 GT]
pci bus 0x0000 cardnum 0x14 function 0x00: vendor 0x8086 device 0xa12f
 Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller
pci bus 0x0000 cardnum 0x14 function 0x02: vendor 0x8086 device 0xa131
 Intel Corporation Sunrise Point-H Thermal subsystem
pci bus 0x0000 cardnum 0x16 function 0x00: vendor 0x8086 device 0xa13a
 Intel Corporation Sunrise Point-H CSME HECI #1
pci bus 0x0000 cardnum 0x17 function 0x00: vendor 0x8086 device 0xa102
 Intel Corporation Device unknown
pci bus 0x0000 cardnum 0x1c function 0x00: vendor 0x8086 device 0xa114
 Intel Corporation Sunrise Point-H PCI Express Root Port #5
pci bus 0x0002 cardnum 0x00 function 0x00: vendor 0x12d8 device 0xe111
 Pericom Semiconductor PI7C9X111SL PCIe-to-PCI Reversible Bridge
pci bus 0x0003 cardnum 0x04 function 0x00: vendor 0x1106 device 0x3038
 VIA Technologies, Inc. VT82xx/62xx UHCI USB 1.1 Controller
pci bus 0x0003 cardnum 0x04 function 0x01: vendor 0x1106 device 0x3038
 VIA Technologies, Inc. VT82xx/62xx UHCI USB 1.1 Controller
pci bus 0x0003 cardnum 0x04 function 0x02: vendor 0x1106 device 0x3104
 VIA Technologies, Inc. USB 2.0
pci bus 0x0000 cardnum 0x1d function 0x00: vendor 0x8086 device 0xa118
 Intel Corporation Sunrise Point-H PCI Express Root Port #9
pci bus 0x0000 cardnum 0x1d function 0x03: vendor 0x8086 device 0xa11b
 Intel Corporation Sunrise Point-H PCI Express Root Port #12
pci bus 0x0005 cardnum 0x00 function 0x00: vendor 0x10ec device 0x8168
 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
pci bus 0x0000 cardnum 0x1f function 0x00: vendor 0x8086 device 0xa148
 Intel Corporation Sunrise Point-H LPC Controller
pci bus 0x0000 cardnum 0x1f function 0x02: vendor 0x8086 device 0xa121
 Intel Corporation Sunrise Point-H PMC
pci bus 0x0000 cardnum 0x1f function 0x03: vendor 0x8086 device 0xa170
 Intel Corporation Sunrise Point-H HD Audio
pci bus 0x0000 cardnum 0x1f function 0x04: vendor 0x8086 device 0xa123
 Intel Corporation Sunrise Point-H SMBus
pci bus 0x0000 cardnum 0x1f function 0x06: vendor 0x8086 device 0x15b8
 Intel Corporation Ethernet Connection (2) I219-V
Not sure what else I can paste here...
Regards.
Files
Related issues
Updated by Nikola M. about 6 years ago
VirtualBox people say it looks like a 'Supervisor Mode Access Prevention' (SMAP) problem (https://en.wikipedia.org/wiki/Supervisor_Mode_Access_Prevention). VirtualBox's SMAP support is disabled only in its Solaris builds (it works on Linux and OS X), because Oracle Solaris disables SMAP.
So either illumos after illumos-a526879 needs an option to disable SMAP for VirtualBox compatibility,
or VirtualBox needs to handle SMAP in its Solaris build when that build runs on illumos; VirtualBox doesn't have an illumos-specific build.
Proposed workaround from tsoome:
disable_smap/W1 in mdb -k; disable_smap is an int, so /W should be OK (it writes a 4-byte value)
Please try it and report whether it worked, so that it is known whether this is truly a SMAP issue.
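The panic output earlier in this report already hints at SMAP. As a rough sketch (not part of the original discussion), the cr4 and err values from ::panicinfo can be decoded with shell arithmetic. It assumes the x86 conventions that CR4 bit 21 is SMAP and bit 20 is SMEP, and that page-fault error code bit 0 means "page present", bit 1 "write" and bit 2 "user mode". The helper name bit is made up for illustration.

```shell
#!/bin/sh
# Decode the two panic values that bear on SMAP.
# Values are copied from the ::panicinfo output in this report.
cr4=0x3406f8   # "cr4" register
err=0x1        # "err" page-fault error code

# bit VALUE N: print bit N of VALUE (0 or 1). Hypothetical helper.
bit() {
    echo $(( ($1 >> $2) & 1 ))
}

echo "SMAP enabled (CR4 bit 21): $(bit $cr4 21)"
echo "SMEP enabled (CR4 bit 20): $(bit $cr4 20)"
echo "page present (err bit 0):  $(bit $err 0)"
echo "write access (err bit 1):  $(bit $err 1)"
echo "user mode    (err bit 2):  $(bit $err 2)"
```

With these inputs SMAP and SMEP are on, and the fault is a read of a present page from kernel mode, which fits kernel code touching a user address (cr2 is in the user range) while SMAP is active.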
Updated by Predrag Zečević about 6 years ago
It was working (and still does) with the older kernel.
What has changed?
I will give it a try during the day and report here (I will also use the new /hipster repository).
Searching on Google, I found that this might be the proper command:
echo disable_smap/W1 | mdb -kw
Thanks for the investigation and update.
Regards.
Updated by Predrag Zečević about 6 years ago
So, I tried it, but no luck!
Booted into:
[2016-04-19 15:56:35] solarix genunix: [ID 540533 kern.notice] #015SunOS Release 5.11 Version illumos-380fd67 64-bit
Disabled SMAP:
$ echo disable_smap/W1 | pfexec mdb -kw
disable_smap: 0 = 0x1
Started a VB guest:
[2016-04-19 15:56:39] solarix savecore: [ID 188441 auth.error] Decompress the crash dump with #012'savecore -vf /export/tmp/crash/vmdump.5'
[2016-04-19 15:56:39] solarix fmd: [ID 377184 daemon.error] SUNW-MSG-ID: SUNOS-8000-KL, TYPE: Defect, VER: 1, SEVERITY: Major#012EVENT-TIME: Tue Apr 19 15:56:39 CEST 2016#012PLATFORM: To-Be-Filled-By-O.E.M., CSN: -, HOSTNAME: solarix#012SOURCE: software-diagnosis, REV: 0.1#012EVENT-ID: 8fc710d4-175b-c682-accb-b865aaaf19dd#012DESC: The system has rebooted after a kernel panic. Refer to http://illumos.org/msg/SUNOS-8000-KL for more information.#012AUTO-RESPONSE: The failed system image was dumped to the dump device. If savecore is enabled (see dumpadm(1M)) a copy of the dump will be written to the savecore directory /export/tmp/crash.#012IMPACT: There may be some performance impact while the panic is copied to the savecore directory. Disk space usage by panics can be substantial.#012REC-ACTION: If savecore is not enabled then please take steps to preserve the crash image.#012Use 'fmdump -Vp -u 8fc710d4-175b-c682-accb-b865aaaf19dd' to view more panic detail. Please refer to the knowledge article for additional information.
So, identical problem...
Regards.
Updated by Robert Mustacchi about 6 years ago
Using mdb that way to toggle SMAP is far too late. You'll want to put it in /etc/system and reboot.
However, to make sure that I understand what's going on, are you trying to run virtual box on an illumos system with illumos as a host or are you running illumos inside of virtualbox?
Updated by Robert Mustacchi about 6 years ago
Also, in those addresses in the panic, what does whatis say they are? Where's the source code to this vbox driver in question?
Updated by Predrag Zečević about 6 years ago
Hi Robert,
illumos (OI /hipster) in the version from 29.03.2016 works; after later OS updates, any type of VB guest (I use Windows and Linux) crashes the host.
BTW, what would the /etc/system syntax for disable_smap be?
Thanks and regards.
Updated by Nikola M. about 6 years ago
Please try putting in /etc/system:
set disable_smap=1
and see how it does with newer illumos.
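A minimal sketch of applying that setting without adding it twice. The function name ensure_smap_disabled and the .bak backup are illustrative choices, not something from this thread; it assumes a grep that supports -E and -q, and on a real host it would need root privileges followed by a reboot.

```shell
#!/bin/sh
# ensure_smap_disabled FILE  (hypothetical helper)
# Append "set disable_smap=1" to FILE unless an uncommented line in it
# already sets disable_smap. FILE would be /etc/system on a real host.
ensure_smap_disabled() {
    file=$1
    if grep -Eq '^[[:space:]]*set[[:space:]]+disable_smap[[:space:]]*=' "$file"; then
        echo "disable_smap is already set in $file"
    else
        # Keep a copy first: a broken /etc/system can prevent booting.
        cp "$file" "$file.bak" &&
        printf '%s\n' 'set disable_smap=1' >> "$file" &&
        echo "added 'set disable_smap=1' to $file"
    fi
}

# Example (as root, then reboot):
#   ensure_smap_disabled /etc/system
```

After the reboot, `echo disable_smap/D | mdb -k` should print the value 1 if the setting was picked up.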
Updated by Predrag Zečević about 6 years ago
Bingo!
That has definitely helped!
Let me use the system for a while before closing the ticket.
Many thanks :-)
Regards.
Updated by Robert Mustacchi about 6 years ago
It'd still be useful to get the ::whatis output on those addresses so we can confirm that the issue is that the vbox kernel module isn't using DDI based methods to try and access user data.
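As an illustration of that request, here is a sketch that builds the ::whatis dcmds for the first dump in this report. Which addresses were meant is an assumption: the list below takes the four unnamed frames from ::stack plus the faulting address from cr2. The helper name whatis_cmds is made up.

```shell
#!/bin/sh
# whatis_cmds ADDR...  (hypothetical helper)
# Print one "<addr>::whatis" mdb dcmd per address given.
whatis_cmds() {
    for addr in "$@"; do
        printf '%s::whatis\n' "$addr"
    done
}

# Unnamed stack frames and the faulting address (cr2) from the first panic.
whatis_cmds fffffffff8d0a225 fffffffff8d08bf8 fffffffff8d933d7 \
            fffffffff8d091ea fffffd7ffb2ff660

# To run them against the saved dump, pipe the output into mdb from the
# crash directory, e.g.:
#   whatis_cmds fffffffff8d0a225 fffffd7ffb2ff660 | mdb -k unix.1 vmcore.1
```

mdb reads dcmds from standard input, so the printed lines can be piped in as shown in the comment or pasted at the > prompt.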
Updated by Predrag Zečević about 6 years ago
I am not an mdb expert, but here is some data:
$ mdb -k unix.5 vmcore.5
Loading modules: [ unix genunix specfs dtrace mac cpu.generic uppc apix scsi_vhci zfs sata sd ip hook neti sockfs arp usba uhci fctl stmf stmf_sbd mm idm cpc crypto md fcip fcp random lofs ufs logindmux nsmb ptm smbsrv nfs sppp ipc ]
> ::cpuinfo -v
 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  0 fffffffffbc32620  1f    0    0  -1   no    no t-0    ffffff000f205c40 (idle)
                       |
            RUNNING <--+
              READY
           QUIESCED
             EXISTS
             ENABLE

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  1 fffffffffbc3cf20  1b    0    0  59   no    no t-1    ffffff02df0f0440 VirtualBox
                       |
            RUNNING <--+
              READY
             EXISTS
             ENABLE

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  2 ffffff02d791f080  1f    0    0  -1   no    no t-0    ffffff000f64bc40 (idle)
                       |
            RUNNING <--+
              READY
           QUIESCED
             EXISTS
             ENABLE

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  3 ffffff02d7a72540  1f    1    0  -1   no    no t-0    ffffff000f8d7c40 (idle)
                       |    |
            RUNNING <--+    +-->  PRI THREAD           PROC
              READY                60 ffffff000f313c40 sched
           QUIESCED
             EXISTS
             ENABLE

> fffffffffbc3cf20::whatis
fffffffffbc3cf20 is panic_cpu, in unix's data segment
> ffffff000f64bc40::whatis
ffffff000f64bc40 is allocated as a thread structure
> ::panicinfo
cpu 1
thread ffffff02df0f0440
message BAD TRAP: type=e (#pf Page fault) rp=ffffff00127ff940 addr=fffffd7ffcbaf660 occurred in module "<unknown>" due to an illegal access to a user address
rdi fffffd7ffcbaf320
rsi 1
rdx 34
rcx ffffff00127ffa4f
r8 340
r9 34
rax 1a
rbx 0
rbp ffffff00127ffa60
r10 0
r11 1a
r12 ffffff030534b000
r13 ffffff030534b000
r14 0
r15 0
fsbase fffffd7fff106a40
gsbase ffffff02d7920580
ds 4b
es 4b
fs 0
gs 0
trapno e
err 1
rip fffffffff8cdd225
cs 30
rflags 10202
rsp ffffff00127ffa30
ss 38
gdt_hi 0
gdt_lo 700001ef
idt_hi 0
idt_lo d0000fff
ldt 0
task 70
cr0 80050033
cr2 fffffd7ffcbaf660
cr3 188e4e000
cr4 3406f8
> ffffff02df0f0440::whatis
ffffff02df0f0440 is allocated as a thread structure
> fffffd7ffcbaf660::whatis
fffffd7ffcbaf660 is unknown
>
If there is more I can do (or paste), let me know (together with instructions) what exactly is important here.
Thanks and regards.
Updated by Predrag Zečević about 6 years ago
Hi all,
Disabling SMAP did the job (see above for the required setting in the /etc/system file).
Tested for a few days, with one or more VB guests; no problems so far.
Many thanks to everybody involved in solving this problem.
With best regards.
Updated by Nikola M. about 6 years ago
Added on-wiki VirtualBox instructions:
http://wiki.openindiana.org/oi/7.2+VirtualBox
('set disable_smap=1' in /etc/system)
Updated by Aurélien Larcher over 4 years ago
- Status changed from New to Resolved
Updated by Gergő Mihály Doma over 3 years ago
- Has duplicate Bug #10602: Virtualbox crashes the system added