Bug #11701

ldi_handle dcmd segfaults occasionally

Added by Rob Johnston 7 months ago. Updated 7 months ago.

Status: New
Priority: Normal
Assignee:
Category: -
Start date:
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags:

Description

This is to track upstreaming the following issue from illumos-joyent:

OS-ldi_handle dcmd segfaults occasionally

For details, rationale and testing notes, please refer to the original SmartOS issue:

https://smartos.org/bugview/OS-7691


Related issues

Is duplicate of illumos gate - Bug #4785: mdb crashes in ::ldi_handle (In Progress, 2014-04-20)

History

#1

Updated by Rob Johnston 7 months ago

Running ::ldi_handle from mdb -k, I would see the following error, typically on the second or third run:

> ::ldi_handle
*** mdb: received signal SEGV at:
    [1] mdb`mdb_getopts+0x1bb()
    [2] genunix.so`ldi_handle+0x5a()
    [3] mdb`dcmd_invoke+0x7c()
    [4] mdb`mdb_call_idcmd+0x112()
    [5] mdb`call_idcmd+0xc1()
    [6] mdb`walk_dcmd+0x43()
    [7] genunix.so`ldi_handle_walk_step+0xcb()
    [8] mdb`walk_step+0x7f()
    [9] mdb`walk_common+0x77()
    [10] mdb`mdb_pwalk_dcmd+0xfc()
    [11] mdb`mdb_walk_dcmd+0x20()
    [12] genunix.so`ldi_handle+0xc4()
    [13] mdb`dcmd_invoke+0x7c()
    [14] mdb`mdb_call_idcmd+0x112()
    [15] mdb`mdb_call+0x449()
    [16] mdb`yyparse+0xe1e()
    [17] mdb`mdb_run+0x2cd()
    [18] mdb`main+0xfa1()
    [19] mdb`_start_crt+0x83()
    [20] mdb`_start+0x18()
mdb: (c)ore dump, (q)uit, (r)ecover, or (s)top for debugger [cqrs]?

This error was also consistently reproducible on the first run by passing the -i flag, i.e., ::ldi_handle -i.

Its stack is:

> ::stack
mdb_getopts+0x1bb(0, 0)
genunix.so`ldi_handle+0x5a(fffffe59d6532418, 7, 0, 0)
dcmd_invoke+0x7c(b817f0, fffffe59d6532418, 7, 0, 0, 0)
mdb_call_idcmd+0x112(b817f0, fffffe59d6532418, 1, 7, fffffc7fffdfe688, 0)
call_idcmd+0xc1(b817f0, fffffe59d6532418, 1, 7, fffffc7fffdfe848)
walk_dcmd+0x43(fffffe59d6532418, 0, fffffc7fffdfe840)
genunix.so`ldi_handle_walk_step+0xcb(58dcc0)
walk_step+0x7f(58dcc0)
walk_common+0x77(58dcc0)
mdb_pwalk_dcmd+0xfc(fffffc7fee10804e, fffffc7fee10804e, 0, 0, 0)
mdb_walk_dcmd+0x20(fffffc7fee10804e, fffffc7fee10804e, 0, 0)
genunix.so`ldi_handle+0xc4(fffffe59cd29f690, 0, 0, 0)
dcmd_invoke+0x7c(b817f0, fffffe59cd29f690, 0, 0, 0, 0)
mdb_call_idcmd+0x112(b817f0, fffffe59cd29f690, 1, 0, 58ddb8, 58ddd0)
mdb_call+0x449(fffffe59cd29f690, 1, 0)
yyparse+0xe1e()
mdb_run+0x2cd()
main+0xfa1(2, fffffc7fffdffb98, fffffc7fffdffbb0)
_start_crt+0x83()
_start+0x18()

After inspecting the code, I found that genunix.so`ldi_handle does not use mdb_getopts correctly. mdb_getopts takes variadic arguments, and the MDB guide states that the caller must terminate the argument list with NULL. The call in ldi_handle does not.

I tested using a new version of genunix.so with a change that passes the arguments correctly, and the segfaults went away.
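
For readers unfamiliar with sentinel-terminated variadic calls, the following is a minimal sketch of the convention at issue. It is not the mdb_getopts implementation or the actual ldi_handle patch: set_matching_flags and demo_correct_call are hypothetical names invented for illustration, and the (name, flag pointer) pair layout is an assumption chosen to keep the example small. The point is only that the callee has no way to know where the list ends other than the NULL the caller supplies.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a sentinel-terminated variadic parser.
 * This is NOT mdb_getopts; it only illustrates the calling convention.
 * After `arg`, the caller passes (const char *name, int *flag) pairs
 * and must end the list with a NULL name.
 */
static int
set_matching_flags(const char *arg, ...)
{
	va_list ap;
	const char *name;
	int matched = 0;

	va_start(ap, arg);
	/*
	 * The loop stops only when it reads the NULL sentinel. If the
	 * caller omits it, va_arg keeps reading whatever follows the
	 * real arguments, which is undefined behavior.
	 */
	while ((name = va_arg(ap, const char *)) != NULL) {
		int *flag = va_arg(ap, int *);

		assert(flag != NULL);
		if (strcmp(arg, name) == 0) {
			*flag = 1;
			matched++;
		}
	}
	va_end(ap);

	return (matched);
}

/* A correctly terminated call; returns 1 if only the -i flag was set. */
static int
demo_correct_call(void)
{
	int iflag = 0, vflag = 0;

	/* Cast the sentinel so a pointer-sized value is always passed. */
	(void) set_matching_flags("-i",
	    "-i", &iflag,
	    "-v", &vflag,
	    (const char *)NULL);

	return (iflag == 1 && vflag == 0);
}
```

Dropping the final (const char *)NULL from the call in demo_correct_call would leave the loop reading past the supplied arguments. Whether that crashes depends on what happens to be in the following registers or stack slots, which is consistent with a failure that shows up only on some runs.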

#2

Updated by Gergő Mihály Doma 7 months ago

  • Is duplicate of Bug #4785: mdb crashes in ::ldi_handle added
