Bug #12964

Continuous reboots after installing the latest update

Added by gh origin 24 days ago. Updated 3 days ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags:

Description

As the title says, the system reboots continuously. It happens so fast that I can't tell exactly what is going on. I only recall that I could get into the boot loader and boot the system, was greeted with a black screen, and then it rebooted immediately.


Related issues

Related to OpenIndiana Distribution - Bug #12965: starting Lightdm triggers reboot when graphics card is Nvidia GTX 1650 (New)

History

#1

Updated by Toomas Soome 22 days ago

could you try: boot -k

#2

Updated by gh origin 20 days ago

Toomas Soome wrote:

could you try: boot -k

I tried it and the system got stuck at the black screen. I had to cold reboot it.

I found the problem to be something in the graphics stack. After I rolled back to the previous BE (thank god there was beadm!) I could boot into the graphical environment again. But since the system is not updated, I can't install any packages that depend on packages that need to be updated. So I'm stuck surfing the web and can't do anything else.
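For anyone new to boot environments, the rollback mentioned above might look roughly like the sketch below. The BE name `openindiana-previous` is a placeholder and not taken from this report; `beadm list` shows the real names. The script only prints the commands unless `DRY_RUN=0` is set, and the real commands need root privileges.

```shell
#!/bin/sh
# Sketch: roll back to an earlier boot environment (BE) with beadm.
# OLD_BE is a placeholder name, not taken from this report;
# pick a real one from the output of `beadm list`.
OLD_BE="${OLD_BE:-openindiana-previous}"
DRY_RUN="${DRY_RUN:-1}"

# In dry-run mode print the command, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "would run: $*"
    else
        "$@"
    fi
}

rollback_be() {
    run beadm list                # show the available boot environments
    run beadm activate "$OLD_BE"  # make the old BE the default for the next boot
    run init 6                    # reboot into it
}

rollback_be
```

If the updated system reboots too quickly to log in, the older BE can usually be picked from the boot loader menu first, and activated from there once it is running.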

My system is using Intel integrated graphics. I'm a Linux user who only recently migrated to OI, so I don't know which commands to run to give you more information about my system. Please give me instructions and I will provide the information you need. Thank you very much.

Update: Judging by the download size of the update, I think not much has changed since the last time I updated my OI installation on VirtualBox (I have since wiped it and the Linux host to make room to install OI on real hardware). OI on VirtualBox did not seem to be affected by this update. After updating and rebooting, it automatically rebooted once (without going to a black screen, before X started and before the login prompt appeared, as I recall) and then everything was normal. So this points more and more to some change in the graphics stack as the cause.

#3

Updated by gh origin 20 days ago

  • Related to Bug #12965: starting Lightdm triggers reboot when graphics card is Nvidia GTX 1650 added

#4

Updated by gh origin 18 days ago

cat /var/adm/messages | grep savecore

Jul 19 17:11:01 openindiana savecore: [ID 570001 auth.error] reboot after panic: BAD TRAP: type=e (#pf Page fault) rp=fffffe001286e710 addr=0 occurred in module "unix" due to a NULL pointer dereference

Jul 19 17:11:01 openindiana savecore: [ID 184409 auth.error] Panic crashdump pending on dump device but dumpadm -n in effect; run savecore(1M) manually to extract. Image UUID 6da11e6b-517a-c91a-bc0b-dee581c9ec32.

Jul 19 17:11:02 openindiana AUTO-RESPONSE: The failed system image was dumped to the dump device. If savecore is enabled (see dumpadm(1M)) a copy of the dump will be written to the savecore directory .

Jul 19 17:11:02 openindiana IMPACT: There may be some performance impact while the panic is copied to the savecore directory. Disk space usage by panics can be substantial.

Jul 19 17:11:02 openindiana REC-ACTION: If savecore is not enabled then please take steps to preserve the crash image.

#5

Updated by Toomas Soome 18 days ago

Use the savecore command to save the kernel dump (to /var/crash), then savecore -vf to unpack it; then you can run mdb 0 and enter ::msgbuf and ::stack
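Spelled out, those steps could look something like the sketch below. It assumes the dump ends up as `vmdump.0` (the first dump saved) in `/var/crash`; the savecore directory configured on a given system may differ, and `dumpadm` reports it. Because the log above shows `dumpadm -n` in effect, savecore has to be run by hand. The script only prints the commands unless `DRY_RUN=0` is set, and the real commands need root privileges.

```shell
#!/bin/sh
# Sketch: extract a pending crash dump and print the panic message and stack.
# CRASH_DIR follows the /var/crash mentioned above; the directory configured
# on a given system may differ (run `dumpadm` to check).
CRASH_DIR="${CRASH_DIR:-/var/crash}"
DRY_RUN="${DRY_RUN:-1}"

# In dry-run mode print the command, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "would run: $*"
    else
        "$@"
    fi
}

extract_dump() {
    # Copy the dump from the dump device into CRASH_DIR (assumed to create vmdump.0).
    run savecore "$CRASH_DIR"
    # Unpack vmdump.0 into unix.0 and vmcore.0.
    run savecore -vf "$CRASH_DIR/vmdump.0" "$CRASH_DIR"
    # Open dump number 0 in mdb and run the two dcmds non-interactively.
    run sh -c "cd '$CRASH_DIR' && printf '::msgbuf\n::stack\n' | mdb 0"
}

extract_dump
```

Piping the dcmds into mdb makes the output easy to redirect to a file and attach to the issue instead of uploading the whole dump.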

#6

Updated by gh origin 17 days ago

Toomas Soome wrote:

use savecore command to save kernel dump (to /var/crash), then savecore -vf to unpack it; then you can use mdb 0, enter ::msgbuf and ::stack

Here is vmdump.0 : https://ufile.io/o1q61pkm

#7

Updated by Dan-Simon Myrland 16 days ago

I have the exact same issue, also using Intel integrated graphics.

#8

Updated by gh origin 16 days ago

Dan-Simon Myrland wrote:

I have the exact same issue, also using Intel integrated graphics.

I encountered a similar problem with drm-kmod on FreeBSD 11.4, too. But there I could revert to the old i915kms.ko module shipped with the system in /boot/kernel. Here is my thread on the FreeBSD forum:

https://forums.freebsd.org/threads/drm-kmod-dump-core.76216/

I know OI ported a lot of stuff from FreeBSD. Could this be somehow related?

The rolling nature of OI makes this hard for me. On FreeBSD, even when I used the old kms module, I could still update my system with freebsd-update and install new packages with FreeBSD's pkg. On OI, I am blocked from installing new packages if my system is not fully updated to the latest version, since these packages were built against the latest system. I'm stuck surfing the web and can't do anything else on my OI installation.

#9

Updated by Toomas Soome 16 days ago

gh origin wrote:

Here is vmdump.0 : https://ufile.io/o1q61pkm

panic[cpu1]/thread=fffffe0bb3938c20:
BAD TRAP: type=e (#pf Page fault) rp=fffffe001286e710 addr=0 occurred in module "unix" due to a NULL pointer dereference

Xorg:
#pf Page fault
Bad kernel fault at addr=0x0
pid=781, pc=0xfffffffffb88151b, sp=0xfffffe001286e808, eflags=0x10246
cr0: 80050033<pg,wp,ne,et,mp,pe> cr4: 1626f8<smep,osxsav,pcide,vmxe,xmme,fxsr,pge,mce,pae,pse,de>
cr2: 0
cr3: 1e772c000
cr8: 0

rdi:                0 rsi:                0 rdx: fffffe0bb3938c20
rcx: 1e000 r8: 1000 r9: 1
rax: 0 rbx: 0 rbp: fffffe001286e830
r10: fffffffffb875f78 r11: 2 r12: fffffe001286ea78
r13: 1000 r14: 1 r15: fffffe0bbfbb1798
fsb: fffffc7fef0833c0 gsb: fffffe0bacaeb000 ds: 0
es: 0 fs: 0 gs: 0
trp: e err: 2 rip: fffffffffb88151b
cs: 30 rfl: 10246 rsp: fffffe001286e808
ss: 38

fffffe001286e610 unix:die+c6 ()
fffffe001286e700 unix:trap+1169 ()
fffffe001286e710 unix:cmntrap+e9 ()
fffffe001286e830 unix:mutex_enter+b ()
fffffe001286e900 genunix:devmap_device+62 ()
fffffe001286e9d0 genunix:devmap_setup+229 ()
fffffe001286ea30 genunix:ddi_devmap_segmap+2b ()
fffffe001286eae0 i915:i915_gem_mmap_ioctl+de ()
fffffe001286ec20 drm:drm_ioctl+164 ()
fffffe001286ecc0 drm:drm_sun_ioctl+ef ()
fffffe001286ed00 genunix:cdev_ioctl+2b ()
fffffe001286ed50 specfs:spec_ioctl+45 ()
fffffe001286ede0 genunix:fop_ioctl+5b ()
fffffe001286ef00 genunix:ioctl+153 ()
fffffe001286ef10 unix:brand_sys_syscall+1fe ()

#10

Updated by Daniel Chan 15 days ago

I have the exact same issue. Installing from the latest ISO works fine. But after updating the system, the machine keeps rebooting when it reaches the login screen. Let me know if anything such as logs is needed.

#11

Updated by gh origin 10 days ago

How can I edit my issue to raise its priority from Normal to High?

#12

Updated by gh origin 9 days ago

I hope you can resolve this issue soon, ideally before the 20.10 release. I might try OI 20.10 when it is released with this issue resolved, but I'm not sure. I am now back on Linux. This issue has prevented me from doing anything useful on OI for too long. I have lost all my patience, and even though I don't want to, going back to Linux is the only solution for me. Bye.

#13

Updated by Christoph Binner 7 days ago

Can confirm the issue, also using Intel graphics (Ivy Bridge).

The problem seems to be in the Xorg intel module: with /usr/lib/xorg/modules/drivers/amd64/intel_drv.so removed, Xorg starts successfully using the VESA module.

As a temporary fix you can add an Xorg config file, e.g. /etc/X11/xorg.conf.d/20-intel.conf with this content:
Section "Device"
    Identifier "Intel Graphics"
    Driver "vesa"
EndSection
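One way to create that file from a shell is sketched below. `write_vesa_conf` is a hypothetical helper name, not an existing tool, and the script writes nothing unless `CONF_DIR` is set explicitly, so it can be tried against a scratch directory first.

```shell
#!/bin/sh
# Sketch: write the VESA fallback config quoted above into a directory.
# write_vesa_conf is a hypothetical helper name, not an existing tool.
write_vesa_conf() {
    conf_dir="$1"
    mkdir -p "$conf_dir" || return 1
    cat > "$conf_dir/20-intel.conf" <<'EOF'
Section "Device"
    Identifier "Intel Graphics"
    Driver "vesa"
EndSection
EOF
}

# Only write when a target directory is given explicitly, e.g. (as root):
#   CONF_DIR=/etc/X11/xorg.conf.d sh vesa-conf.sh
if [ -n "${CONF_DIR:-}" ]; then
    write_vesa_conf "$CONF_DIR"
fi
```

Deleting the file again restores the default driver selection once a fixed intel module is available.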

The Intel module seems to need debugging; suggestions on how to go about it would be welcome.

BTW, several Linux distributions have switched to "modesetting" instead of "intel" by default, but that might require a fully working KMS implementation (?).

#14

Updated by Daniel Chan 7 days ago

Confirmed. Using VESA allowed me to access the updated system.
It seems something Intel GPU related is preventing systems with an Intel GPU from reaching LightDM.

#15

Updated by gh origin 7 days ago

What about simply reverting to the previous version? If the previous version worked fine, I think it's reasonable for the developers to revert just the Xorg intel module to the previous version and lock the package at that version until the bug in the newer version is fixed. You could keep it in that state for a long time, long enough to fix the bug itself or even port the DRM/KMS code from FreeBSD.

This is how we do it on Linux: users are not left stuck for ages with a broken system. A rolling release doesn't need to be the latest and greatest. I'd rather have an old package that works than a new package that is broken.

VESA could be OK for people running OI as a server; indeed, they don't need a GUI at all. But for people using it as a desktop and development machine, like me, it's unacceptable. The Intel video driver on OI is not the latest, but I could watch 4K video just fine with it. Of course, I watched it using Pale Moon, not the default Firefox. Please consider updating your Firefox, too. It's even worse than Pale Moon. It can't play 4K video without problems, which leaves me stuck with Pale Moon, whose developers were unwelcoming to me and banned almost 3 accounts of mine on their forum just because I have different ideas from theirs!

#16

Updated by Daniel Chan 5 days ago

For Firefox, I have a workaround:
1. zfs create -V 2G rpool/swap2
2. swap -a /dev/zvol/dsk/rpool/swap2
3. add "/dev/zvol/dsk/rpool/swap2 - - swap - no -" to /etc/vfstab

Actually, the existing swap space should be enough, but Firefox works on my machine only when a second swap space is added.
But let's get back to the topic. Does anyone know whether this is a kernel issue that conflicts with the current Intel driver, or purely a driver issue?

#17

Updated by gh origin 4 days ago

Daniel Chan wrote:

But let's get back to the topic. Does anyone know whether this is a kernel issue that conflicts with the current Intel driver, or purely a driver issue?

Only god knows. The developers should be the ones to give more insight. But where are they now? We just got an incomplete workaround using VESA and then they disappeared, leaving us here talking to each other like fools.

But let me guess. If it's simply the Xorg intel driver, then it's simple: just revert it to the previous version, as I said. So it must be something in the kernel. But if that's not so, and a simple revert to the previous Xorg intel driver works fine, then it's god damn irresponsible of the developers. Such disrespect for the users!

#18

Updated by Aurélien Larcher 4 days ago

Some developers, like me, had Covid and are still suffering from the consequences of the virus after 3 months of illness; others have family issues to attend to.

I am still on sick leave, but if you give more info on the date the breakage happened I will look into it.
I have not updated the driver in a while, just added a gcc-10 patch from trunk, so it could be something else (or the patch from trunk is incorrect).

#19

Updated by gh origin 3 days ago

I didn't know about that. Is there something like a blog where you could post the project's status so everyone knows? Keeping silent just lets speculation start.

I'm on Linux now, so this is no longer a problem for me. But the issue still exists. I hope that when I try OI again (on a future release), the issue will have been fixed.

Hope you will get well soon. Take care.