Bug #14397

open

page faults in guests under bhyve

Added by Jorge Schrauwen 8 days ago. Updated about 19 hours ago.

Status: New
Priority: Normal
Assignee: -
Category: bhyve
Start date:
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

Since roughly early December (I'm not sure exactly when it started), my Linux guests have been randomly dying due to page faults.

Other people on SmartOS are also experiencing this (https://smartos.topicbox.com/groups/smartos-discuss/Tb946ba904d4e6e44/bhyv-vms-stuck-in-shuttingdown-state), although in my case I am using OmniOS.

Linux (from the mailing list)

[11491418.760960] BUG: kernel NULL pointer dereference, address: 0000000000000008
[11491418.763086] #PF: supervisor write access in kernel mode
[11491418.764347] #PF: error_code(0x0002) - not-present page
[11491418.765512] PGD 24b336067 P4D 24b336067 PUD 46bc94067 PMD 0
[11491418.766937] Oops: 0002 [#4] SMP NOPTI
[11491418.767897] CPU: 15 PID: 877465 Comm: go Tainted: P      D W  O      5.4.0-80-generic #90-Ubuntu
[11491418.769746] Hardware name: Joyent SmartDC HVM, BIOS 13.0 11/10/2020
[11491418.771116] RIP: 0010:avl_rotation.isra.0+0x211/0x240 [zavl]
[11491418.772380] Code: 83 c8 01 49 89 43 10 48 85 db 75 27 48 8b 45 d0 4c 89 18 48 83 c4 10 b8 01 00 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 48 63 ff <4c> 89 14 fb e9 e5 fe ff ff 48 63 ff b8 01 00 00 00 4c 89 1c fb 48
[11491418.776053] RSP: 0018:ffffbb79ef0afb70 EFLAGS: 00010202
[11491418.777241] RAX: 000000000000000a RBX: 0000000000000008 RCX: 0000000000000000
[11491418.778795] RDX: 0000000000000001 RSI: ffff98d5b0ea63d0 RDI: 0000000000000000
[11491418.780329] RBP: ffffbb79ef0afba8 R08: 0000000000000000 R09: 0000000000000008
[11491418.781883] R10: ffff98d691a65c08 R11: 0000000000000000 R12: ffff98d691a65c10
[11491418.783443] R13: 0000000000000000 R14: 00000000ffffffff R15: ffff98d5b0ea63d0
[11491418.784982] FS:  00007fbfe8ff9700(0000) GS:ffff98d89fbc0000(0000) knlGS:0000000000000000
[11491418.786693] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11491418.788025] CR2: 0000000000000008 CR3: 0000000147718000 CR4: 00000000000406e0
[11491418.789580] Call Trace:
[11491418.790310]  avl_insert+0xa9/0xb0 [zavl]
[11491418.791377]  zfs_rangelock_add_reader+0x14c/0x1e0 [zfs]
[11491418.792617]  zfs_rangelock_enter_reader+0xf1/0x1c0 [zfs]
[11491418.793884]  zfs_rangelock_enter+0xed/0xf0 [zfs]
[11491418.795038]  zfs_get_data+0x158/0x350 [zfs]
[11491418.796099]  zil_lwb_commit+0x1c6/0x360 [zfs]
[11491418.797182]  zil_process_commit_list+0xf1/0x210 [zfs]
[11491418.798400]  zil_commit_writer.isra.0+0xa3/0xb0 [zfs]
[11491418.799590]  zil_commit_impl+0x59/0xa0 [zfs]
[11491418.800682]  zil_commit+0x40/0x60 [zfs]
[11491418.801669]  zfs_fsync+0x7a/0xe0 [zfs]
[11491418.802767]  zpl_fsync+0x5c/0x90 [zfs]
[11491418.803734]  vfs_fsync_range+0x49/0x80
[11491418.804673]  ? __fget_light+0x57/0x70
[11491418.805584]  do_fsync+0x3d/0x70
[11491418.806420]  __x64_sys_fsync+0x14/0x20
[11491418.807371]  do_syscall_64+0x57/0x190
[11491418.808311]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[11491418.809465] RIP: 0033:0x4bc57b

Linux (one of mine, also posted to the mailing list)

[ 7623.474478] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 7623.478578] #PF: supervisor instruction fetch in kernel mode
[ 7623.482018] #PF: error_code(0x0010) - not-present page
[ 7623.485138] PGD 8000000109949067 P4D 8000000109949067 PUD 109948067 PMD 0
[ 7623.489300] Oops: 0010 [#11] SMP PTI
[ 7623.490705] CPU: 3 PID: 41336 Comm: kworker/3:4 Tainted: G      D W          5.13.0-22-generic #22-Ubuntu
[ 7623.493774] Hardware name: OmniOS OmniOS HVM/BHYVE, BIOS 13.0 11/10/2020
[ 7623.496202] RIP: 0010:0x0
[ 7623.497136] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[ 7623.499486] RSP: 0018:ffffa92f84bc3e48 EFLAGS: 00010092
[ 7623.501385] RAX: ffff90da43a5b080 RBX: 905ff275dd553c32 RCX: 0000000000000000
[ 7623.503741] RDX: 0000000000000000 RSI: 0000000055555554 RDI: 0000000000000000
[ 7623.506159] RBP: 00000014b916f589 R08: 0000000000000001 R09: 0000000000000000
[ 7623.508660] R10: 0000000000000000 R11: 0000000000000003 R12: c43e981979a2f47c
[ 7623.511094] R13: 6adb42ae1e9be9ba R14: 5d46cf7c6a592ba1 R15: aa8eed83abe70eac
[ 7623.513466] FS:  0000000000000000(0000) GS:ffff90da7bd80000(0000) knlGS:0000000000000000
[ 7623.516130] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7623.518040] CR2: ffffffffffffffd6 CR3: 000000010a302004 CR4: 00000000001706e0
[ 7623.520362] Call Trace:
[ 7623.521277] Modules linked in: md4 cmac nls_utf8 cifs libarc4 libdes 9p fscache netfs binfmt_misc intel_rapl_msr nls_iso8859_1 intel_rapl_common rapl 9pnet_virtio 9pnet input_leds mac_hid serio_raw efi_pstore sch_fq_codel msr drm sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel nvme crypto_simd virtio_net net_failover cryptd psmouse failover nvme_core
[ 7623.537628] CR2: 0000000000000000
[ 7623.538811] ---[ end trace 3a7ee1faeee192b8 ]---
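
The two `#PF: error_code(...)` values in the traces above can be decoded from the architectural x86 page-fault error-code bits. The following is a minimal Python sketch; the helper name `decode_pf_error_code` is made up for illustration and only the low five bits are handled.

```python
# Decode the low bits of an x86 page-fault error code, as printed by
# Linux in "#PF: error_code(0x....)". The helper name is illustrative.
def decode_pf_error_code(code):
    return {
        "present": bool(code & 0x01),            # 0 = not-present page
        "write": bool(code & 0x02),              # 1 = write access
        "user": bool(code & 0x04),               # 0 = supervisor mode
        "reserved_bit": bool(code & 0x08),       # reserved bit set in a paging entry
        "instruction_fetch": bool(code & 0x10),  # 1 = fault on instruction fetch
    }

for code in (0x0002, 0x0010):
    flags = [name for name, is_set in decode_pf_error_code(code).items() if is_set]
    print(f"{code:#06x}: {flags}")
```

Both codes have the "present" and "user" bits clear, which matches the kernel's own "supervisor ... not-present page" wording: the first is a write, the second an instruction fetch.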

I've seen it a few other times but didn't copy the output. Although the crashes look related, the ZFS module does not seem to be the cause, as I am not using or loading it on my VMs.
Until now I had only seen it on my Linux VMs (both Ubuntu: one LTS, one stable). However, today, while testing a newer firmware for Andy on the latest bhyve-sync bits, I hit something very similar on my FreeBSD VMs.

FreeBSD:

BdsDxe: loading Boot0001 "UEFI bhyve-NVMe NVME-4-0" from PciRoot(0x0)/Pci(0x4,0x0)/NVMe(0x1,01-00-84-87-20-FC-9C-58)
BdsDxe: starting Boot0001 "UEFI bhyve-NVMe NVME-4-0" from PciRoot(0x0)/Pci(0x4,0x0)/NVMe(0x1,01-00-84-87-20-FC-9C-58)
Consoles: EFI console
    Reading loader env vars from /efi/freebsd/loader.env
Setting currdev to disk0p1:
FreeBSD/amd64 EFI loader, Revision 1.1

   Command line arguments: loader.efi
   Image base: 0x7e8d7000
   EFI version: 2.70
   EFI Firmware: BHYVE (rev 1.00)
   Console: efi (0x1000)
   Load Path: \EFI\BOOT\BOOTX64.EFI
   Load Device: PciRoot(0x0)/Pci(0x4,0x0)/NVMe(0x1,01-00-84-87-20-FC-9C-58)/HD(1,GPT,92F393EA-5F1C-11EC-AC0F-002590F15508,0x28,0x82000)
   BootCurrent: 0001
   BootOrder: 0000 0001[*] 0002
   BootInfo Path: PciRoot(0x0)/Pci(0x4,0x0)/NVMe(0x1,01-00-84-87-20-FC-9C-58)
Ignoring Boot0001: Only one DP found
Trying ESP: PciRoot(0x0)/Pci(0x4,0x0)/NVMe(0x1,01-00-84-87-20-FC-9C-58)/HD(1,GPT,92F393EA-5F1C-11EC-AC0F-002590F15508,0x28,0x82000)
Setting currdev to disk0p1:
Trying: PciRoot(0x0)/Pci(0x4,0x0)/NVMe(0x1,01-00-84-87-20-FC-9C-58)/HD(2,GPT,930B40B3-5F1C-11EC-AC0F-002590F15508,0x82800,0x800000)
Setting currdev to disk0p2:
Trying: PciRoot(0x0)/Pci(0x4,0x0)/NVMe(0x1,01-00-84-87-20-FC-9C-58)/HD(3,GPT,93204892-5F1C-11EC-AC0F-002590F15508,0x882800,0x297D000)
Setting currdev to zfs:rpool/ROOT/default:
\

Loading /boot/defaults/loader.conf
Loading /boot/defaults/loader.conf
Loading /boot/device.hints
Loading /boot/loader.conf
Loading /boot/loader.conf.local
Loading /boot/loader.conf.d/carp.conf
Loading /boot/loader.conf.d/autoboot.conf
\
?c|
-  ______               ____   _____ _____
  |  ____|             |  _ \ / ____|  __ \
  | |___ _ __ ___  ___ | |_) | (___ | |  | |
  |  ___| '__/ _ \/ _ \|  _ < \___ \| |  | |
  | |   | | |  __/  __/| |_) |____) | |__| |
  | |   | | |    |    ||     |      |      |
  |_|   |_|  \___|\___||____/|_____/|_____/      ```                        `
                                                s` `.....---.......--.```   -/
 +-----------Welcome to FreeBSD------------+    +o   .--`         /y:`      +.
 |                                         |     yo`:.            :o      `+-
 |  1. Boot Multi user [Enter]             |      y/               -/`   -o/
 |  2. Boot Single user                    |     .-                  ::/sy+:.
 |  3. Escape to loader prompt             |     /                     `--  /
 |  4. Reboot                              |    `:                          :`
 |  5. Cons: Serial                        |    `:                          :`
 |                                         |     /                          /
 |  Options:                               |     .-                        -.
 |  6. Kernel: default/kernel (1 of 2)     |      --                      -.
 |  7. Boot Options                        |       `:`                  `:`
 |  8. Boot Environments                   |         .--             `--.
 |                                         |            .---.....----.
 +-----------------------------------------+
   Autoboot in -2 seconds, hit [Enter] to boot or any other key to stop

Loading kernel...
/boot/kernel/kernel text=0x17b9e0 text=0xdd6d60 text=0x65b9dc data=0x140 data=0x1b9348+0x445cb8 syms=[0x8+0x178ea8+0x8+0x19906f]
Loading configured modules...
/boot/kernel/carp.ko size 0x10c28 at 0x2112000
/etc/hostid size=0x25
/boot/kernel/zfs.ko size 0x67feb0 at 0x2123000
/boot/kernel/cryptodev.ko size 0xae38 at 0x27a3000
/boot/entropy size=0x1000
Start @ 0xffffffff8037c000 ...
EFI framebuffer information:
addr, size     0x0, 0x0
dimensions     0 x 0
stride         0
masks          0x00000000, 0x00000000, 0x00000000, 0x00000000
---<<BOOT>>---
Copyright (c) 1992-2021 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 13.0-RELEASE-p4 #0: Tue Aug 24 07:33:27 UTC 2021
    root@amd64-builder.daemonology.net:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
FreeBSD clang version 11.0.1 (git@github.com:llvm/llvm-project.git llvmorg-11.0.1-0-g43ff75f2c3fe)
VT(vga): text 80x25
CPU: Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz (2101.71-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x306e4  Family=0x6  Model=0x3e  Stepping=4
  Features=0x9f83fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2,SS,HTT,PBE>
  Features2=0xfe9e6217<SSE3,PCLMULQDQ,DTES64,DS_CPL,SSSE3,CX16,xTPR,PCID,DCA,SSE4.1,SSE4.2,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND,HV>
  AMD Features=0x24100800<SYSCALL,NX,Page1GB,LM>
  AMD Features2=0x1<LAHF>
  Structured Extended Features=0x281<FSGSBASE,SMEP,ERMS>
  Structured Extended Features3=0x400<MD_CLEAR>
  XSAVE Features=0x1<XSAVEOPT>
  TSC: P-state invariant
Hypervisor: Origin = "bhyve bhyve " 
real memory  = 2147483648 (2048 MB)
avail memory = 2040242176 (1945 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <BHYVE  BVMADT  >
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s) x 2 hardware threads
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG" 
random: unblocking device.
ioapic0: MADT APIC ID 4 != hw id 0
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 <Version 1.1> irqs 0-31
Launching APs: 1 2 3
Timecounter "TSC" frequency 2101710372 Hz quality 1000
KTLS: Initialized 4 threads
random: entropy device external interface
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address    = 0x0
fault code        = supervisor read data, page not present
instruction pointer    = 0x20:0xffffffff80be8671
stack pointer            = 0x28:0xfffffe005133eb90
frame pointer            = 0x28:0xfffffe005133eba0
code segment        = base 0x0, limit 0xfffff, type 0x1b
            = DPL 0, pres 1, long 1, def32 0, gran 1
rrocessor eflags    = resumke,e rInOPeLl =  0t
 cauprr e1nt2  prwoicetshs         i= n1t3 e(rg_reuvepntt)s
 1draip snaumbblere        d=
  2

panic: page fault
cpuid = 2
time = 1
KDB: stack backtrace:
#0 0xffffffff80c574c5 at kdb_backtrace+0x65
#1 0xffffffff80c09ea1 at vpanic+0x181
#2 0xffffffff80c09d13 at panic+0x43
#3 0xffffffff8108b1b7 at trap_fatal+0x387
#4 0xffffffff8108b20f at trap_pfault+0x4f
#5 0xffffffff8108a86d at trap+0x27d
#6 0xffffffff81061958 at calltrap+0x8
#7 0xffffffff80b4f1e0 at g_event_procbody+0x10
#8 0xffffffff80bc7dde at fork_exit+0x7e
#9 0xffffffff810629de at fork_trampoline+0xe
Uptime: 1s
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
cpu_reset: Restarting BSP
cpu_reset_proxy: Stopped CPU 2

After looping 3 times it finally booted and seems to be working fine (for now). I've seen it a few times this week while testing the new bhyve-sync from Andy, but I do not think the sync is the cause, as the Linux stacks from earlier predate it. I'm probably seeing this more frequently because I've been pushing my VMs harder and rebooting them a lot.

I think they might be related and caused by the same underlying issue in bhyve.

Actions #1

Updated by Jorge Schrauwen 8 days ago

Here is the one from FreeBSD

BdsDxe: loading Boot0001 "UEFI bhyve-NVMe NVME-4-0" from PciRoot(0x0)/Pci(0x4,0x0)/NVMe(0x1,01-00-84-87-20-FC-9C-58)
BdsDxe: starting Boot0001 "UEFI bhyve-NVMe NVME-4-0" from PciRoot(0x0)/Pci(0x4,0x0)/NVMe(0x1,01-00-84-87-20-FC-9C-58)
Consoles: EFI console
    Reading loader env vars from /efi/freebsd/loader.env
Setting currdev to disk0p1:
FreeBSD/amd64 EFI loader, Revision 1.1

   Command line arguments: loader.efi
   Image base: 0x7e8d7000
   EFI version: 2.70
   EFI Firmware: BHYVE (rev 1.00)
   Console: efi (0x1000)
   Load Path: \EFI\BOOT\BOOTX64.EFI
   Load Device: PciRoot(0x0)/Pci(0x4,0x0)/NVMe(0x1,01-00-84-87-20-FC-9C-58)/HD(1,GPT,92F393EA-5F1C-11EC-AC0F-002590F15508,0x28,0x82000)
   BootCurrent: 0001
   BootOrder: 0000 0001[*] 0002
   BootInfo Path: PciRoot(0x0)/Pci(0x4,0x0)/NVMe(0x1,01-00-84-87-20-FC-9C-58)
Ignoring Boot0001: Only one DP found
Trying ESP: PciRoot(0x0)/Pci(0x4,0x0)/NVMe(0x1,01-00-84-87-20-FC-9C-58)/HD(1,GPT,92F393EA-5F1C-11EC-AC0F-002590F15508,0x28,0x82000)
Setting currdev to disk0p1:
Trying: PciRoot(0x0)/Pci(0x4,0x0)/NVMe(0x1,01-00-84-87-20-FC-9C-58)/HD(2,GPT,930B40B3-5F1C-11EC-AC0F-002590F15508,0x82800,0x800000)
Setting currdev to disk0p2:
Trying: PciRoot(0x0)/Pci(0x4,0x0)/NVMe(0x1,01-00-84-87-20-FC-9C-58)/HD(3,GPT,93204892-5F1C-11EC-AC0F-002590F15508,0x882800,0x297D000)
Setting currdev to zfs:rpool/ROOT/default:
\

Loading /boot/defaults/loader.conf
Loading /boot/defaults/loader.conf
Loading /boot/device.hints
Loading /boot/loader.conf
Loading /boot/loader.conf.local
Loading /boot/loader.conf.d/carp.conf
Loading /boot/loader.conf.d/autoboot.conf
\
?c|
-  ______               ____   _____ _____
  |  ____|             |  _ \ / ____|  __ \
  | |___ _ __ ___  ___ | |_) | (___ | |  | |
  |  ___| '__/ _ \/ _ \|  _ < \___ \| |  | |
  | |   | | |  __/  __/| |_) |____) | |__| |
  | |   | | |    |    ||     |      |      |
  |_|   |_|  \___|\___||____/|_____/|_____/      ```                        `
                                                s` `.....---.......--.```   -/
 +-----------Welcome to FreeBSD------------+    +o   .--`         /y:`      +.
 |                                         |     yo`:.            :o      `+-
 |  1. Boot Multi user [Enter]             |      y/               -/`   -o/
 |  2. Boot Single user                    |     .-                  ::/sy+:.
 |  3. Escape to loader prompt             |     /                     `--  /
 |  4. Reboot                              |    `:                          :`
 |  5. Cons: Serial                        |    `:                          :`
 |                                         |     /                          /
 |  Options:                               |     .-                        -.
 |  6. Kernel: default/kernel (1 of 2)     |      --                      -.
 |  7. Boot Options                        |       `:`                  `:`
 |  8. Boot Environments                   |         .--             `--.
 |                                         |            .---.....----.
 +-----------------------------------------+
   Autoboot in 0 seconds, hit [Enter] to boot or any other key to stop

Loading kernel...
/boot/kernel/kernel text=0x17b9e0 text=0xdd6d60 text=0x65b9dc data=0x140 data=0x1b9348+0x445cb8 syms=[0x8+0x178ea8+0x8+0x19906f]
Loading configured modules...
/boot/kernel/carp.ko size 0x10c28 at 0x2112000
/boot/entropy size=0x1000
/boot/kernel/cryptodev.ko size 0xae38 at 0x2124000
/etc/hostid size=0x25
/boot/kernel/zfs.ko size 0x67feb0 at 0x212f000
Start @ 0xffffffff8037c000 ...
EFI framebuffer information:
addr, size     0x0, 0x0
dimensions     0 x 0
stride         0
masks          0x00000000, 0x00000000, 0x00000000, 0x00000000
---<<BOOT>>---
Copyright (c) 1992-2021 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 13.0-RELEASE-p4 #0: Tue Aug 24 07:33:27 UTC 2021
    root@amd64-builder.daemonology.net:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
FreeBSD clang version 11.0.1 (git@github.com:llvm/llvm-project.git llvmorg-11.0.1-0-g43ff75f2c3fe)
VT(vga): text 80x25
CPU: Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz (2101.68-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x306e4  Family=0x6  Model=0x3e  Stepping=4
  Features=0x9f83fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2,SS,HTT,PBE>
  Features2=0xfe9e6217<SSE3,PCLMULQDQ,DTES64,DS_CPL,SSSE3,CX16,xTPR,PCID,DCA,SSE4.1,SSE4.2,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND,HV>
  AMD Features=0x24100800<SYSCALL,NX,Page1GB,LM>
  AMD Features2=0x1<LAHF>
  Structured Extended Features=0x281<FSGSBASE,SMEP,ERMS>
  Structured Extended Features3=0x400<MD_CLEAR>
  XSAVE Features=0x1<XSAVEOPT>
  TSC: P-state invariant
Hypervisor: Origin = "bhyve bhyve " 
real memory  = 2147483648 (2048 MB)
avail memory = 2040246272 (1945 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <BHYVE  BVMADT  >
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s) x 2 hardware threads
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG" 
random: unblocking device.
ioapic0: MADT APIC ID 4 != hw id 0
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 <Version 1.1> irqs 0-31
Launching APs: 3 2 1
Timecounter "TSC" frequency 2101679808 Hz quality 1000
KTLS: Initialized 4 threads
random: entropy device external interface
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address    = 0x0
fault code        = supervisor read data, page not present
instruction pointer    = 0x20:0xffffffff80be8671
stack pointer            = 0x28:0xfffffe005133eb90
frame pointer            = 0x28:k0exfrfnffefle0 0t51r3a3epba 01
tyo dwe isetghme nit    n    t= ebrasre u0px0t,s l imdiit s0xafbfflfef,d
  pe
10x
Fb
     a        t=a DlP L t0r, appre s1 12,: l onpga 1g,e d eff3a2u 0l, tg rawn h1i
Ipero ceisns ork eefrlangse    l=  rmesoumdee,
 OPcL p=u 0i
)c ur=re n3t ;pr oacepsis    c    =  i13d ( g=_e v0en3t

tfraapu nlutmb erv    i    =r 1t2u
apanic: page fault
cpuid = 2
time = 1
KDB: stack backtrace:
#0 0xffffffff80c574c5 at kdb_backtrace+0x65
#1 0xffffffff80c09ea1 at vpanic+0x181
#2 0xffffffff80c09d13 at panic+0x43
#3 0xffffffff8108b1b7 at trap_fatal+0x387
#4 0xffffffff8108b20f at trap_pfault+0x4f
#5 0xffffffff8108a86d at trap+0x27d
#6 0xffffffff81061958 at calltrap+0x8
#7 0xffffffff80b4f1e0 at g_event_procbody+0x10
#8 0xffffffff80bc7dde at fork_exit+0x7e
#9 0xffffffff810629de at fork_trampoline+0xe
Uptime: 1s
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
cpu_reset: Restarting BSP
cpu_reset_proxy: Stopped CPU 2

Not sure if it's relevant, but both the FreeBSD guest and my Linux guest appear to be trying to access address 0x0. Just something I noticed.
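
To compare fault addresses across several captured consoles without reading each trace by hand, something like the following Python sketch could be used. The function name `fault_addresses` and the sample text are illustrative; the two patterns only cover the Linux and FreeBSD message formats quoted in this issue.

```python
import re

# Patterns for the two console formats quoted in this issue. \s* lets the
# Linux pattern match even when the address was wrapped onto the next line.
LINUX_RE = re.compile(
    r"BUG: kernel NULL pointer dereference, address:\s*([0-9a-fA-F]+)")
FREEBSD_RE = re.compile(r"fault virtual address\s*=\s*(0x[0-9a-fA-F]+)")

def fault_addresses(console_text):
    """Return every faulting address found in the text as an integer.

    Linux matches are listed first, then FreeBSD matches.
    """
    addrs = [int(m, 16) for m in LINUX_RE.findall(console_text)]
    addrs += [int(m, 16) for m in FREEBSD_RE.findall(console_text)]
    return addrs

# Shortened sample built from lines that appear in the logs above.
sample = """\
[ 7623.474478] BUG: kernel NULL pointer dereference, address:
0000000000000000
fault virtual address    = 0x0
[11491418.760960] BUG: kernel NULL pointer dereference, address:
0000000000000008
"""
print([hex(a) for a in fault_addresses(sample)])
```

Note that the trace from the mailing list faults at 0x8 rather than 0x0, so not every report lands on exactly the same address.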

Actions #3

Updated by Jorge Schrauwen 5 days ago

As requested on IRC, some info on the physical machine:

root@saturn:~# prtdiag
System Configuration: Supermicro X9DRi-LN4+/X9DR3-LN4+
BIOS Configuration: American Megatrends Inc. 3.2 03/04/2015
BMC Configuration: IPMI 2.0 (KCS: Keyboard Controller Style)

==== Processor Sockets ====================================

Version                          Location Tag
-------------------------------- --------------------------
Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz CPU 1
Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz CPU 2

==== Memory Device Sockets ================================

Type        Status Set Device Locator      Bank Locator
----------- ------ --- ------------------- ----------------
DDR3        in use 0   P1-DIMMA1           P0_Node0_Channel0_Dimm0
DDR3        in use 0   P1-DIMMA2           P0_Node0_Channel0_Dimm1
unknown     empty  0   P1-DIMMA3           P0_Node0_Channel0_Dimm2
DDR3        in use 0   P1-DIMMB1           P0_Node0_Channel1_Dimm0
DDR3        in use 0   P1-DIMMB2           P0_Node0_Channel1_Dimm1
unknown     empty  0   P1-DIMMB3           P0_Node0_Channel1_Dimm2
unknown     empty  0   P1-DIMMC1           P0_Node0_Channel2_Dimm0
unknown     empty  0   P1-DIMMC2           P0_Node0_Channel2_Dimm1
unknown     empty  0   P1-DIMMC3           P0_Node0_Channel2_Dimm2
unknown     empty  0   P1-DIMMD1           P0_Node0_Channel3_Dimm0
unknown     empty  0   P1-DIMMD2           P0_Node0_Channel3_Dimm1
unknown     empty  0   P1-DIMMD3           P0_Node0_Channel3_Dimm2
DDR3        in use 0   P2-DIMME1           P1_Node1_Channel0_Dimm0
DDR3        in use 0   P2-DIMME2           P1_Node1_Channel0_Dimm1
unknown     empty  0   P2-DIMME3           P1_Node1_Channel0_Dimm2
DDR3        in use 0   P2-DIMMF1           P1_Node1_Channel1_Dimm0
DDR3        in use 0   P2-DIMMF2           P1_Node1_Channel1_Dimm1
unknown     empty  0   P2-DIMMF3           P1_Node1_Channel1_Dimm2
unknown     empty  0   P2-DIMMG1           P1_Node1_Channel2_Dimm0
unknown     empty  0   P2-DIMMG2           P1_Node1_Channel2_Dimm1
unknown     empty  0   P2-DIMMG3           P1_Node1_Channel2_Dimm2
unknown     empty  0   P2-DIMMH1           P1_Node1_Channel3_Dimm0
unknown     empty  0   P2-DIMMH2           P1_Node1_Channel3_Dimm1
unknown     empty  0   P2-DIMMH3           P1_Node1_Channel3_Dimm2

==== On-Board Devices =====================================
 Onboard Matrox VGA
 Onboard Intel Ethernet 1
 Onboard Intel Ethernet 2
 Onboard Intel Ethernet 3
 Onboard Intel Ethernet 4

==== Upgradeable Slots ====================================

ID  Status    Type             Description
--- --------- ---------------- ----------------------------
1   available PCI Exp. Gen 3 x16 CPU1 SLOT1 PCI-E 3.0 X16
2   in use    PCI Exp. Gen 3 x4 CPU1 SLOT2 PCI-E 3.0 X4 (IN X8 SLOT), pci12d8,8608 (pcieb)
3   available PCI Exp. Gen 3 x16 CPU1 SLOT3 PCI-E 3.0 X16
4   in use    PCI Exp. Gen 3 x16 CPU2 SLOT4 PCI-E 3.0 X16, Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (mpt_sas)
5   in use    PCI Exp. Gen 3 x16 CPU2 SLOT5 PCI-E 3.0 X16, Broadcom / LSI SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor] (mpt_sas)
6   in use    PCI Exp. Gen 3 x8 CPU2 SLOT6 PCI-E 3.0 X8, Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (ixgbe)
root@saturn:~# prtconf -m
131044
root@saturn:~# pptadm list
DEV        VENDOR DEVICE PATH
/dev/ppt0  1912   15     /pci@0,0/pci8086,e03@1,1/pci12d8,8608@0/pci12d8,8608@1/pci1912,15@0
/dev/ppt1  1912   15     /pci@0,0/pci8086,e03@1,1/pci12d8,8608@0/pci12d8,8608@5/pci1912,15@0
/dev/ppt2  1912   15     /pci@0,0/pci8086,e03@1,1/pci12d8,8608@0/pci12d8,8608@7/pci1912,15@0
/dev/ppt3  1912   15     /pci@0,0/pci8086,e03@1,1/pci12d8,8608@0/pci12d8,8608@9/pci1912,15@0
/dev/ppt4  8086   1521   /pci@0,0/pci8086,1d10@1c/pci15d9,1521@0
/dev/ppt5  8086   1521   /pci@0,0/pci8086,1d10@1c/pci15d9,1521@0,1
/dev/ppt6  8086   1521   /pci@0,0/pci8086,1d10@1c/pci15d9,1521@0,2
/dev/ppt7  8086   1521   /pci@0,0/pci8086,1d10@1c/pci15d9,1521@0,3
root@saturn:~#

So it's an Ivy Bridge EP with 128G of memory, nothing too fancy. I have a PCIe card with four xHCI controllers that I use for passthrough, as well as the 4x onboard igb.

As for the VMs, it's mostly ares that hits it (unrar/par2 workloads or ffmpeg transcoding), then artemis (ffmpeg transcoding), and very rarely proteus or rcon2 during boot.

ares: Ubuntu 21.10 (GNU/Linux 5.13.0-25-generic x86_64)
artemis: Ubuntu 20.04.3 LTS (GNU/Linux 5.4.0-94-generic x86_64)
proteus: FreeBSD 13.0-RELEASE-p6
rcon2: FreeBSD 13.0-RELEASE-p6
root@saturn:~# zadm show ares
{
   "acpi" : "on",
   "autoboot" : "true",
   "bootargs" : "",
   "bootdisk" : {
      "blocksize" : "4K",
      "path" : "rpool/vms/ares/disk0",
      "size" : "25G",
      "sparse" : "false" 
   },
   "bootorder" : "cd",
   "bootrom" : "BHYVE_RELEASE-202111",
   "brand" : "bhyve",
   "cloud-init" : "off",
   "cpu-shares" : "30",
   "dataset" : [
      {
         "name" : "rpool/vmdata/ares" 
      }
   ],
   "diskif" : "nvme",
   "fs-allowed" : "",
   "hostbridge" : "i440fx",
   "hostid" : "",
   "ip-type" : "exclusive",
   "limitpriv" : "default",
   "net" : [
      {
         "global-nic" : "ixgbe0",
         "mac-addr" : "00:22:06:b5:f5:2d",
         "physical" : "ares0",
         "vlan-id" : "100",
         "vqsize" : "32768" 
      }
   ],
   "netif" : "virtio",
   "pool" : "",
   "ram" : "4G",
   "rng" : "off",
   "scheduling-class" : "",
   "type" : "generic",
   "vcpus" : "4,sockets=1,cores=2,threads=2",
   "vga" : "off",
   "virtfs" : [
      {
         "name" : "wwwdata",
         "path" : "/vol/www",
         "ro" : "false" 
      },
      {
         "name" : "httpd",
         "path" : "/vol/httpd",
         "ro" : "false" 
      },
      {
         "name" : "radarr",
         "path" : "/vol/radarr",
         "ro" : "false" 
      },
      {
         "name" : "lidarr",
         "path" : "/vol/lidarr",
         "ro" : "false" 
      },
      {
         "name" : "sonarr",
         "path" : "/vol/sonarr",
         "ro" : "false" 
      },
      {
         "name" : "transmission",
         "path" : "/vol/transmission",
         "ro" : "false" 
      }
   ],
   "vnc" : "off",
   "xhci" : "off",
   "zonename" : "ares",
   "zonepath" : "/zones/ares" 
}
root@saturn:~# zadm show artemis
{
   "acpi" : "on",
   "autoboot" : "true",
   "bootargs" : "",
   "bootdisk" : {
      "blocksize" : "4K",
      "path" : "rpool/vms/artemis/disk0",
      "size" : "35G",
      "sparse" : "false" 
   },
   "bootorder" : "cd",
   "bootrom" : "BHYVE_RELEASE-202111",
   "brand" : "bhyve",
   "cloud-init" : "off",
   "cpu-shares" : "25",
   "dataset" : [
      {
         "name" : "rpool/vmdata/artemis" 
      }
   ],
   "diskif" : "nvme",
   "fs-allowed" : "",
   "hostbridge" : "i440fx",
   "hostid" : "",
   "ip-type" : "exclusive",
   "limitpriv" : "default",
   "net" : [
      {
         "global-nic" : "ixgbe0",
         "mac-addr" : "00:22:06:63:25:df",
         "physical" : "artemis0",
         "vlan-id" : "100",
         "vqsize" : "16384" 
      }
   ],
   "netif" : "virtio",
   "pool" : "",
   "ram" : "8G",
   "rng" : "off",
   "scheduling-class" : "",
   "type" : "generic",
   "vcpus" : "12,sockets=1,cores=6,threads=2",
   "vga" : "off",
   "virtfs" : [
      {
         "name" : "plexdata",
         "path" : "/vol/plexdata",
         "ro" : "false" 
      }
   ],
   "vnc" : "off",
   "xhci" : "off",
   "zonename" : "artemis",
   "zonepath" : "/zones/artemis" 
}
root@saturn:~# zadm show rcon2
{
   "acpi" : "on",
   "autoboot" : "true",
   "bootargs" : "",
   "bootdisk" : {
      "blocksize" : "4K",
      "path" : "rpool/vms/rcon2/disk0",
      "size" : "10G",
      "sparse" : "false" 
   },
   "bootorder" : "cd",
   "bootrom" : "BHYVE_RELEASE-202111",
   "brand" : "bhyve",
   "cloud-init" : "off",
   "cpu-shares" : "25",
   "diskif" : "nvme",
   "fs-allowed" : "",
   "hostbridge" : "i440fx",
   "hostid" : "",
   "ip-type" : "exclusive",
   "limitpriv" : "default",
   "netif" : "virtio",
   "pool" : "",
   "ppt" : [
      {
         "device" : "ppt0",
         "state" : "on" 
      },
      {
         "device" : "ppt1",
         "state" : "on" 
      },
      {
         "device" : "ppt5",
         "state" : "on" 
      }
   ],
   "ram" : "1G",
   "rng" : "off",
   "scheduling-class" : "",
   "type" : "generic",
   "vcpus" : "1,sockets=1,cores=1,threads=1",
   "vga" : "off",
   "vnc" : "off",
   "xhci" : "off",
   "zonename" : "rcon2",
   "zonepath" : "/zones/rcon2" 
}
root@saturn:~# zadm show proteus
{
   "acpi" : "on",
   "autoboot" : "true",
   "bootargs" : "",
   "bootdisk" : {
      "blocksize" : "8K",
      "path" : "rpool/vms/proteus/disk0",
      "size" : "25G",
      "sparse" : "true" 
   },
   "bootorder" : "cd",
   "bootrom" : "BHYVE_RELEASE-202111",
   "brand" : "bhyve",
   "cloud-init" : "off",
   "cpu-shares" : "25",
   "diskif" : "nvme",
   "fs-allowed" : "",
   "hostbridge" : "i440fx",
   "hostid" : "",
   "ip-type" : "exclusive",
   "limitpriv" : "default",
   "netif" : "virtio",
   "pool" : "",
   "ppt" : [
      {
         "device" : "ppt4",
         "state" : "on" 
      }
   ],
   "ram" : "2G",
   "rng" : "off",
   "scheduling-class" : "",
   "type" : "generic",
   "vcpus" : "4,sockets=1,cores=2,threads=2",
   "vga" : "off",
   "vnc" : "off",
   "xhci" : "off",
   "zonename" : "proteus",
   "zonepath" : "/zones/proteus" 
}

The following is able to trigger it for me, but not consistently:

yt-dlp -c --merge-output-format mp4 --format bestvideo+bestaudio  --compat-options all,-no-live-chat --add-metadata --embed-subs --all-subs --convert-subs ass https://www.youtube.com/watch?v=IdRagVTlrmo

I had it die once in the rar command about 10% in, but on the second attempt I managed to do everything, so this is also not really reliable.

rar -v256000k a vmdump.rar vmdump.3
par2 create -m1024 -r10 vmdump.partX.rar vmdump.*.rar
# note: damage some rar files
dd if=/dev/urandom count=8 of=vmdump.part13.rar conv=notrunc
par2verify vmdump.partX.rar.par2
par2repair vmdump.partX.rar.par2

Actions #4

Updated by Jorge Schrauwen 5 days ago

Another one. This one got stuck as the Linux kernel failed to reboot. The gist includes console output, vmm stack, pstack, ...
https://gist.github.com/sjorge/34165dd781a4646ab7b71d2422444443

Actions #5

Updated by Jorge Schrauwen 4 days ago

After running on a build with 13896 reverted, the VM just died instead and bhyve logged an EPT violation:

[23:34:52] <sjorge> rdmsr to register 0x64d on vcpu 0
[23:34:52] <sjorge> vm exit[3]
[23:34:52] <sjorge> reason          VMX
[23:34:52] <sjorge> rip             0x00000000006f0c20
[23:34:52] <sjorge> inst_length     5
[23:34:52] <sjorge> status          0
[23:34:52] <sjorge> exit_reason     48 (EPT violation)
[23:34:52] <sjorge> qualification   0x0000000000000184
[23:34:52] <sjorge> inst_type               0
[23:34:52] <sjorge> inst_error              0
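
The `qualification` value in the log above can be unpacked bit by bit. The sketch below follows the Intel SDM's layout of the exit qualification for EPT violations (low bits only); the function and key names are illustrative and are not taken from bhyve.

```python
# Bit layout per the Intel SDM's description of the exit qualification
# for EPT violations. Names are illustrative, not bhyve identifiers.
def decode_ept_qualification(qual):
    return {
        "data_read": bool(qual & (1 << 0)),
        "data_write": bool(qual & (1 << 1)),
        "instruction_fetch": bool(qual & (1 << 2)),
        "ept_readable": bool(qual & (1 << 3)),
        "ept_writable": bool(qual & (1 << 4)),
        "ept_executable": bool(qual & (1 << 5)),
        "guest_linear_addr_valid": bool(qual & (1 << 7)),
        # Only meaningful when guest_linear_addr_valid is set: True means the
        # access was to the final translation, not a paging-structure entry.
        "final_translation_access": bool(qual & (1 << 8)),
    }

for name, value in decode_ept_qualification(0x184).items():
    print(f"{name:26} {value}")
```

Read this way, 0x184 describes an instruction fetch from a guest-physical page whose EPT entry grants no read, write, or execute permission. That reading is an assumption based on the bit layout alone and has not been confirmed in this issue.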

Actions #6

Updated by Jorge Schrauwen about 19 hours ago

All this was on a patchset with all the memory-related commits from the past few months reverted. I will update to tonight's bloody plus the patchset from Andy to see if it also stays OK on the newer bits.

[   14.200961] fbcon: Taking over console
 09:24:06 up 1 day, 58 min,  0 users,  load average: 0.07, 0.03, 0.00

We made it past 24h without any issues! But...

This is with Hyper-Threading disabled in the BIOS; it was enabled before. I noticed I was seeing more cores than expected while collecting some info. I had HT disabled in the past, given this is an older-generation CPU and, well, Spectre/Meltdown and friends happened.

This has also brought some other improvements: these messages are now gone on all my VMs.

[11:50:41] <sjorge> [  463.930275] hrtimer: interrupt took 147684029 ns
[11:50:42] <sjorge> [  570.011699] clocksource: timekeeping watchdog on CPU2: hpet read-back delay of 1910269ns, attempt 4, marking unstable
[11:50:42] <sjorge> [  570.788413] tsc: Marking TSC unstable due to clocksource watchdog
[11:50:42] <sjorge> [  570.788977] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
[11:50:42] <sjorge> [  570.788981] sched_clock: Marking unstable (570576256076, 210875811)<-(570795670587, -6693491)
[11:50:42] <sjorge> [  570.787375] clocksource: Checking clocksource tsc synchronization from CPU 2.
[11:50:42] <sjorge> [  570.787516] clocksource: Switched to clocksource hpet

Eventually it would also kick out HPET and switch to PIT for timing in the VM. So it seems Hyper-Threading is at least causing some kind of unstable clock for the VM, and it's not happy. I'm not sure, but there might be a relation with the page faults?
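
To put the nanosecond figures from those messages into perspective, a small Python sketch like the one below converts them to milliseconds. The function name `durations_ms` is illustrative, and the regular expression only targets the `<number> ns` / `<number>ns` forms seen in the two sample lines.

```python
import re

# Pull every "<number> ns" / "<number>ns" duration out of guest kernel
# messages and convert it to milliseconds.
NS_RE = re.compile(r"(\d+)\s*ns\b")

def durations_ms(log_text):
    return [int(ns) / 1_000_000 for ns in NS_RE.findall(log_text)]

# Two of the messages quoted above, without the IRC prefix.
log = """\
[  463.930275] hrtimer: interrupt took 147684029 ns
[  570.011699] clocksource: timekeeping watchdog on CPU2: hpet read-back delay of 1910269ns, attempt 4, marking unstable
"""
for ms in durations_ms(log):
    print(f"{ms:.3f} ms")
```

The hrtimer interrupt took roughly 148 ms and the HPET read-back roughly 1.9 ms, both very long for operations the guest expects to complete almost immediately.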

Less interesting blurb:
The above-mentioned EPT violation was hit once after the system booted. After enabling the debug flag to not take down the zone, the VM did hang a few more times before I disabled HT (not after), but it would eventually reset the guest without the EPT violation showing up in the logs. The console had the usual stack traces (I had already removed panic=15 from the grub Linux params).

I don't think it ever got into the state pmooney was interested in (dead VM with EPT violation in the log).

I'll update once today's bloody gets dropped and Andy provides a patch, to see if it's also stable with HT disabled.
