Bug #8037

hostname: boot hang during OI install

Added by John Howard almost 3 years ago. Updated almost 3 years ago.

Status: New
Priority: Normal
Assignee:
Category: kernel
Start date: 2017-04-04
Due date:
% Done: 0%
Estimated time: 0.50 h
Difficulty: Medium
Tags: needs-triage

Description

The boot fails after the Oracle copyright line is displayed, during the /etc/hostname step.
(Personally, I would eliminate the pointless hostname checking during the OI installer if I could find out how.)

I have tested all OI releases since 2015330 including the recent 20170326 OI test from Toomas Soome.
All fail to install at the same point on my main development and build AMD64 PC. OI installs and works on my other AMD64 PC, which I use for testing.

Someone alerted me to possible uname mishandling of NULL uuid so I researched it. I don't know if this is directly related to the hostname problem. I am not a C coder.

The 2014 changes to NULL uuid handling for Hewlett Packard PCs, made by Mr. Losh from FreeBSD.org, are still present in illumos-gate but are not in the current FreeBSD sources I checked last month.
I don't have an HP, though. His change incorrectly applies to all PCs for the NULL uuid use case. Maybe this is the cause. I used FreeBSD on my main development PC a few years ago.
It seems to me the uuid code in the illumos kernel should at least be updated to match the current FreeBSD source files, to remove any doubt.
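
For readers unfamiliar with what a "NULL uuid" means here, the following is a minimal sketch, not code from illumos-gate or FreeBSD. It assumes the UUID in question is the 16-byte SMBIOS system UUID, for which the SMBIOS specification defines all-0x00 and all-0xFF as "not present". The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define	UUID_LEN	16	/* an SMBIOS system UUID is 16 bytes long */

/*
 * Hypothetical helper (not from illumos-gate or FreeBSD):
 * returns true when every byte of the UUID equals `fill`.
 */
bool
uuid_all_bytes_equal(const uint8_t uuid[UUID_LEN], uint8_t fill)
{
	assert(uuid != NULL);
	for (size_t i = 0; i < UUID_LEN; i++) {
		if (uuid[i] != fill)
			return (false);
	}
	return (true);
}

/*
 * Treat an all-0x00 or all-0xFF UUID as "not set", following the
 * SMBIOS definition of an absent system UUID.
 */
bool
uuid_is_unset(const uint8_t uuid[UUID_LEN])
{
	return (uuid_all_bytes_equal(uuid, 0x00) ||
	    uuid_all_bytes_equal(uuid, 0xFF));
}
```

A caller would check uuid_is_unset() before using the firmware UUID and fall back to some other identifier when it returns true. Whether the kernel code discussed above does this correctly is exactly what is in question.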

History

#1

Updated by Toomas Soome almost 3 years ago

Can you try booting as follows: press Esc to get to the boot loader ok prompt, then enter: boot -vk -B prom_debug=true,kbm_debug=true

First, this will give us more information about what is going on behind the otherwise quite silent console.

Second, once the system is quiet and appears stuck, press F1+a; hopefully you will get the kmdb prompt. From there, enter ::stack

This will also give some more information.

#2

Updated by John Howard almost 3 years ago

Testing official OI 20170313 USB image (OpenIndiana_minimal.usb):
When it got stuck I pressed F1+a, but nothing happened. No kmdb prompt. I can make
the F-Lock LED on my Logitech USB keyboard turn OFF/ON, but that is all. I
think the kernel is in a trap.

Testing Toomas Soome's OI 20170326 USB image (OpenIndiana_Text_X86.usb):
Same result.

I thought more displayed data might help, so I maximized my display resolution
to 1280x800x32 and chose the smallest font. It repeatedly displayed similar
information.

ok framebuffer set 0x11b
ok loadfont /boot/fonts/6x12.fnt
ok boot -vk -B prom_debug=true,kbm_debug=true

Display shows:
Doing BIOS call...br.ax is 0
br.bx is 0
br.dx is 83
done

Doing BIOS call...br.ax is 201
br.bx is 7000
br.dx is 83
done

Doing BIOS call...br.ax is 800
br.bx is 0
br.dx is 84
done

Doing BIOS call...br.ax is 800
br.bx is 0
br.dx is 85
done

Doing BIOS call...br.ax is 800
br.bx is 0
br.dx is 86
done

Doing BIOS call...br.ax is 800
br.bx is 0
br.dx is 87
done
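
As a reading aid for the output above, here is a minimal sketch that decodes the register values. It rests on two assumptions that the output itself does not confirm: that the printed values are hexadecimal, and that these are INT 13h (BIOS disk service) calls. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Split a 16-bit AX value into its high (AH) and low (AL) bytes. */
uint8_t
reg_high(uint16_t ax)
{
	return ((uint8_t)(ax >> 8));
}

uint8_t
reg_low(uint16_t ax)
{
	return ((uint8_t)(ax & 0xFF));
}

/* INT 13h convention: drive numbers 0x80 and above are hard disks. */
bool
is_hard_disk(uint8_t dl)
{
	return (dl >= 0x80);
}

/* Zero-based hard disk index, e.g. 0x83 is the fourth hard disk. */
uint8_t
drive_index(uint8_t dl)
{
	assert(is_hard_disk(dl));
	return ((uint8_t)(dl & 0x7F));
}

/* Names for the INT 13h functions (AH values) seen in the output. */
const char *
int13_function_name(uint8_t ah)
{
	switch (ah) {
	case 0x00:
		return ("reset disk system");
	case 0x02:
		return ("read sectors");
	case 0x08:
		return ("get drive parameters");
	default:
		return ("unknown");
	}
}
```

Under those assumptions, "br.ax is 800" with "br.dx is 84" would be a "get drive parameters" request for hard disk index 4, and "br.ax is 201" would be a one-sector read. The last lines shown would then be the BIOS being polled for drives 0x84 through 0x87.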

-- John

#3

Updated by John Howard almost 3 years ago

The /boot/ramdisk.safe text file in the OpenIndiana_Text_X86.iso image reveals the point at which /etc/hostname is executed, and /boot/ramdisk.list also lists /etc/hostname. I discovered this from the Master.master text file in the root directory, which lists the contents of the ISO.

My guess is that the /boot/ramdisk.safe reference to /etc/hostname is the easiest path to eliminate. How do I do it? As a last resort I could replace the /etc/hostname command with an empty dummy stub. But there is a more fundamental problem.

First I need to be able to do a minimal build from scratch, because I will need to change either the kernel installer source code or the hostname source code.

I am not yet familiar enough with either the kernel build or OI build and installer procedures to know how to remove those references to hostname to move the problem away from the critical installation bootup phase. Help on the easiest matter of daily building the kernel would allow me to eliminate this show stopper. For example, the GCC v4.4.4 compiler required to rebuild the kernel is not included in the minimal Live images. Enabling new users to rebuild the kernel should be a high priority. It builds confidence and multiplies users testing for problems.

Debugging the kernel is the last thing I want to try first as a new user of whatever new release of the illumos kernel or OI. Now we know kmdb cannot help with this problem yet.

Requiring a broken hostname syscall during installation is not an implementation flaw but a fundamental design flaw.

-- John
