Bug #5943

grub fails to boot when too many boot environments are present

Added by Rich Lowe about 2 years ago. Updated almost 2 years ago.

Status: New
Start date: 2015-05-20
Priority: High
Due date:
Assignee: -
% Done: 100%
Category: bootloader
Target version: -
Difficulty: Hard
Tags:

Description

When too many boot environments are present (it is not clear how many is "too many", or whether the value is constant), grub fails to load the system correctly. The symptoms unfortunately vary somewhat, but uninterpretable errors from krtld are common.

The problem is that grub has no real concept of memory management: it merely passes around a pointer and treats all memory above this pointer as both present and free. A large menu.lst causes memory use in the parsing stage to run into storage we define to be at constant addresses higher than the notional 'heap', holes in memory, or the like.
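The failure mode described above can be illustrated with a toy model. This is only a sketch: it is not GRUB's code, and the addresses, sizes, and names (HEAP_START, FIXED_BUFFER, BumpAllocator, parse_menu) are hypothetical values chosen for illustration. It shows how an allocator that only advances a pointer will silently walk into a region pinned at a constant address once the menu gets large enough.

```python
# Toy model only: the addresses below are made up for illustration and
# are not GRUB's real memory layout.
HEAP_START = 0x100000        # hypothetical start of the notional heap
FIXED_BUFFER = 0x120000      # hypothetical buffer pinned at a constant address


class BumpAllocator:
    """Hands out memory by advancing a pointer, with no upper bound check."""

    def __init__(self, start):
        self.next_free = start

    def alloc(self, size):
        addr = self.next_free
        self.next_free += size   # nothing verifies that this memory is free
        return addr


def parse_menu(allocator, entry_count, bytes_per_entry):
    """Allocate space for each menu entry.

    Returns the index of the first entry whose storage overlaps the fixed
    buffer, or None if every entry fits below it.
    """
    for index in range(entry_count):
        addr = allocator.alloc(bytes_per_entry)
        if addr + bytes_per_entry > FIXED_BUFFER:
            return index
    return None
```

In this model the allocator itself never fails; the overlap is only visible because parse_menu checks for it after the fact. The entry count at which the overlap happens depends entirely on the made-up sizes, which matches the observation that "too many" is hard to pin down.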

A sure workaround is to boot from alternate media, and remove entries from the menu.lst. A potential workaround (that can't be guaranteed to work) is to get to the grub command line, and type the entry needed to boot manually.
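As a rough sketch of the "remove entries" workaround, the function below trims menu.lst text down to its header plus the last few entries. It assumes each entry starts with a line whose first word is "title", as in GRUB legacy menu files; the function name keep_last_entries and the choice to keep the most recent entries are assumptions made for illustration, not part of any illumos tool.

```python
def keep_last_entries(menu_text, keep):
    """Return menu text with the header and only the last `keep` entries."""
    header, entries = [], []
    for line in menu_text.splitlines(keepends=True):
        fields = line.split(None, 1)
        if fields and fields[0] == "title":
            entries.append([line])       # a 'title' line opens a new entry
        elif entries:
            entries[-1].append(line)     # body line of the current entry
        else:
            header.append(line)          # global settings before any entry
    kept = entries[-keep:] if keep > 0 else []
    return "".join(header) + "".join("".join(entry) for entry in kept)
```

The function works on text and returns text, so the result can be reviewed before anything is written back. Note that a "default" line in the header refers to an entry by position, so it would need to be checked by hand after trimming.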

Fixing grub 1.x will be a major labour. A workaround such as an artificial limit on the number of BEs is attractive, though the number is hard to pin down; 25-30 seems safe.
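A minimal sketch of what such an artificial limit could look like as a check on menu.lst contents follows. The limit of 25 is an assumption taken from the low end of the "25-30" estimate above, and the names MAX_BOOT_ENTRIES, count_entries, and exceeds_limit are hypothetical; no existing tool is known to implement this check.

```python
MAX_BOOT_ENTRIES = 25   # assumption: lower end of the "25-30" range above


def count_entries(menu_text):
    """Count menu entries, each of which starts with a 'title' line."""
    count = 0
    for line in menu_text.splitlines():
        fields = line.split(None, 1)
        if fields and fields[0] == "title":
            count += 1
    return count


def exceeds_limit(menu_text, limit=MAX_BOOT_ENTRIES):
    """Return True if the menu has more entries than the chosen limit."""
    return count_entries(menu_text) > limit
```

A tool that adds boot environments could call exceeds_limit before writing a new entry and warn or refuse if it returns True. Commented-out lines such as "#title" are not counted, since the first word must be exactly "title".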


Subtasks

Bug #5944: Related to 5943 (please merge) Closed


Related issues

Related to illumos gate - Bug #6326: Eliminate unused filesystems from GRUB's stage 2 to bring back free memory Closed 2015-10-13

History

#1 Updated by Igor Kozhukhov about 2 years ago

It would be better to provide the limit in a config file, because it does not fail in my environment with more boot environments.

#2 Updated by Marcel Telka about 2 years ago

Copied from #5944:

This is related to bug 5943 and includes instructions on how to reproduce it. I'm not seeing a way to add these reproduction instructions to that bug, so I'm creating a new bug post on how to do so.

Instructions:
  1. cd /rpool/boot/grub/
  2. cp menu.lst{,.bak}
  3. cat menu.lst.bak >> menu.lst

Reboot until the issue is experienced. For me, it was when menu.lst was greater than 100K.

It's possible for menu.lst to become so big that you cannot even boot via the grub command line, forcing a live CD boot to clean up menu.lst manually. In that case the following error appears:

grub> bootfs rpool/ROOT/omnios-r151014
Error 28: Selected item cannot fit into memory

#3 Updated by Nikola M. almost 2 years ago

This makes illumos at large not a production-ready OS.
If the number of boot environments in menu.lst grows after some undetermined number of updates,
all illumos machines using GRUB could fail to boot!
It was mentioned on IRC that this concerns a buffer reserved in stage 2 of GRUB. Maybe that buffer can be made larger,
the number of BEs it can handle in menu.lst determined, and an appropriate warning put in all tools that change menu.lst, to avoid boot failures on illumos. I experienced something like this on my desktop several months ago and was lost until I manually made menu.lst smaller.
This needs fixing in both GRUB and the tools that change menu.lst, before the boot loader project update is ready.

#4 Updated by Dan McDonald almost 2 years ago

  • Related to Bug #6326: Eliminate unused filesystems from GRUB's stage 2 to bring back free memory added
