Bug #1439
Panic when booting on IBM SystemX
Status: Closed
% Done: 100%
Description
Right after the kernel loads, the system panics with "vmem_hash_delete: bad free", called from fm_smb_fmacompat (see attached screenshot). The code around uts/intel/os/fmsmb.c:562 seems to do more allocs than frees anyway; can someone please clean this up?
Updated by Jens Rosenboom over 12 years ago
Some more findings:
The problem only occurs with recent UEFI versions, 1.11 and 1.12 for my x3650M3. After a downgrade to 1.10, the system runs without a problem.
After inserting a "goto bad" right at the start of fm_smb_fmacompat, the machine also works fine with the 1.12 version.
Looking at the smbios data, I can even guess where the crash comes from: The final type 11 record doesn't have a string attached, leaving cnt=0 when the "for i" loop finishes and thus giving bad parameters for the kmem_free call.
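To make the suspected failure mode concrete, here is a minimal userland sketch in C of the pattern described above: each type 11 record gets a per-record string table that is allocated, scanned, and freed, and a record with no strings is skipped so that no zero-sized allocation is ever freed. This is not the fmsmb.c code or the attached patch. The type oem_record_t and the function oem_records_contain are hypothetical, and calloc/free stand in for the kernel allocator.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one SMBIOS type 11 (OEM strings) record. */
typedef struct oem_record {
	int		cnt;		/* number of attached strings; may be 0 */
	const char	**strings;	/* cnt entries, or NULL when cnt == 0 */
} oem_record_t;

/*
 * Return 1 if any record carries the string `want`, 0 if none does,
 * and -1 if memory could not be allocated.
 */
int
oem_records_contain(const oem_record_t *recs, int nrecs, const char *want)
{
	int found = 0;

	for (int r = 0; r < nrecs && !found; r++) {
		int cnt = recs[r].cnt;

		/* Guard: an empty record has nothing to allocate, scan, or free. */
		if (cnt <= 0)
			continue;

		/* Per-record table; its size is tied to this record's cnt. */
		const char **tab = calloc((size_t)cnt, sizeof (char *));
		if (tab == NULL)
			return (-1);
		memcpy(tab, recs[r].strings, (size_t)cnt * sizeof (char *));

		for (int i = 0; i < cnt; i++) {
			if (tab[i] != NULL && strcmp(tab[i], want) == 0) {
				found = 1;
				break;
			}
		}

		/* Free inside the same iteration, while cnt still matches. */
		free(tab);
	}
	return (found);
}
```

Calling oem_records_contain on a list whose final record has cnt == 0 never reaches the allocate/free pair for that record, which is the situation the newer UEFI versions appear to produce.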
Updated by Jens Rosenboom over 12 years ago
I made a patch for this; can someone please check it? It works for me, but no guarantees otherwise. It would be good to test it on a board that has the SUNW-PRMS-1 string, too.
Updated by Jens Rosenboom over 12 years ago
New version of the patch; thanks to richlowe for spotting the mistake.
Updated by Jon Strabala almost 12 years ago
- File issue_00_similar_panic_messages_and_stack.jpg added
- File issue_UEFI__1_13_also_fails.jpg added
Hit the same issue on a brand-new IBM x3550 M3 with a newer BIOS, UEFI 1.13 (build date 9/23/2011), when doing a fresh install.
Since I cannot install oi_151a, I cannot apply the patch. It looks like I need to make my own ISO (sort of a pain) until there is a new ISO release (hopefully with the patch in it).
Updated by Jon Strabala almost 12 years ago
- File issue_work_around.jpg added
Looking at the file ./usr/src/uts/intel/os/fmsmb.c, it occurred to me that x86gentopo_legacy can be set in /etc/system. This led me to ask further questions in the #illumos IRC channel, where I determined, with help from Daemar, that I could boot from GRUB with -kd added to drop into a kmdb prompt early on and then type the following:
x86gentopo_legacy /W 1
to disable the path that causes the panic, followed immediately by
:c
to leave kmdb and continue execution. This got me to an installer! I have attached an image of the system up and running.
I can use the kmdb trick until I have /etc/system set up with the x86gentopo_legacy setting, and then eventually build and boot into a new BE with the patch described above (where neither the kmdb trick nor a modified /etc/system is needed).
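The effect of the workaround can be pictured as a global flag that is checked before any SMBIOS parsing happens, so the crashing path is never entered. The following is a minimal sketch in C under that assumption, not the fmsmb.c source: legacy_topo, parse_calls, parse_smbios and smb_compat_check are hypothetical names, and the return convention is assumed for illustration only.

```c
/* Hypothetical tunable modelled on x86gentopo_legacy; nonzero skips SMBIOS. */
int legacy_topo = 0;

/* Counts how often the (hypothetical) risky parser actually ran. */
int parse_calls = 0;

/* Hypothetical stand-in for the SMBIOS walk that panicked on this firmware. */
static int
parse_smbios(void)
{
	parse_calls++;
	return (0);	/* 0 = parsed successfully */
}

/*
 * Assumed convention: return 1 if the SMBIOS-based path may be used,
 * 0 if the caller should fall back to the legacy path.
 */
int
smb_compat_check(void)
{
	if (legacy_topo != 0)
		return (0);	/* bail out before touching SMBIOS data */
	return (parse_smbios() == 0);
}
```

With legacy_topo set to 1 before the first call, parse_smbios never runs, which mirrors why writing 1 to the tunable from kmdb is enough to get past the panic.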
Updated by Jon Strabala almost 12 years ago
As discussed, the oi_151a live CD installs and runs on my IBM x3550 M3 with UEFI (BIOS) rev 1.13 in the default oi_151a BE (openindiana), as long as I have the following setting in my /etc/system:
set x86gentopo_legacy=1
Once I built and booted into a new BE (nightly-2012-01-04-uefi) with the above patch applied (https://www.illumos.org/issues/1439#note-3), I could remove the above setting from my /etc/system file and the system still worked just fine.
root@systemx:~# beadm list
BE                      Active Mountpoint Space Policy Created
nightly-2012-01-04-uefi NR     /          9.00G static 2012-01-04 17:03
openindiana             -      -          58.8M static 2012-01-03 19:06
Thus, for what it's worth, the patch works fine on another system (albeit another IBM System X).
Updated by Jon Strabala almost 12 years ago
I can apply the patch https://www.illumos.org/issues/1439#note-3 to both an IBM SystemX (above) and a non-IBM system (a Supermicro X9SCA-F, nightly BE nightly-2012-01-17-uefi); in both cases the resulting BE seems to work just fine.
I do not have other machines available to test. Is there anything else I can do to help get this patch integrated?
Updated by Rich Lowe almost 12 years ago
- Category set to kernel
- Status changed from New to Resolved
- % Done changed from 0 to 100
- Tags deleted (needs-triage)
Resolved in r13613 commit:8abd7b12d92f