Bug #4229

mdb hangs on exit when long umem cache names exist

Added by Josef Sipek over 7 years ago. Updated about 7 years ago.

mdb - modular debugger
Start date:
Due date:
% Done:


Estimated time:
Gerrit CR:


libumem allows caches to have an ASCII name - up to 31 chars. If a 31-char long name is used, mdb will hang on exit. (A shorter cache name works as expected.)

$ <download attached test file>
$ gcc -Wall -O1 -lumem -o test test.c
$ ./test 30
making cache of length 30: 123456789012345678901234567890
Segmentation Fault (core dumped)
$ echo '' | mdb core
$ ./test 31
making cache of length 31: 1234567890123456789012345678901
Segmentation Fault (core dumped)
$ echo '' | mdb core

The hang is mdb pegging a CPU looping inside the teardown code:

nv_hashstring+0x2c(81b2e9d, 8102c68, 8046828, 8074053)
mdb_nv_lookup+0x10(8102c68, 81b2e9d, 81038b0, 81b09dd)
mdb_module_unload_common+0x11e(81038b0, 8139f28, 80468c8, 809af3f, 0, 54)
mdb_module_unload+0xe(81038b0, 2, 80468d8, 2)
mdb_destroy+0x3a(8103e98, 8139f10, 80476b8, 8083b76, 0, 8139f10)
terminate+0x10(0, 8139f10, 80476b8, 8083ad9)
main+0x14d2(2, 80476e8, 80476f4, 80476dc)
_start+0x7d(2, 8047830, 8047834, 0, 804783e, 804784a)

Profiling stacks using DTrace shows that we're looping somewhere in:


The culprit seems to be libumem:

$ sudo dtrace -n 'pid$target::mdb_module_unload:entry{ trace(copyinstr(arg0)) }' -p `pgrep mdb`
dtrace: description 'pid$target::mdb_module_unload:entry' matched 1 probe
CPU     ID                    FUNCTION:NAME
  0  77990          mdb_module_unload:entry                          
  0  77990          mdb_module_unload:entry                        
  0  77990          mdb_module_unload:entry 


test.c (755 Bytes) test.c Josef Sipek, 2013-10-18 12:51 AM

Updated by Josef Sipek over 7 years ago

Quick update:

Changing the MDB_NV_NAMELEN #define from 31 to 32 makes mdb happy. In other words, the issue is with mdb_var_t's cap on internally stored variable names.


Updated by Josef Sipek about 7 years ago

  • Category set to mdb - modular debugger

Updated by Josef Sipek about 7 years ago

Ok, so summary so far...

libumem has cache names. The names can be up to 32 bytes long (including the trailing null). The libumem mdb dmod instantiates a walker for each cache and uses the cache's name for the walker's name. mdb keeps track of walker names using a mdb_nv_t & mdb_var_t. mdb_var_t can store the name either inline or externally (MDB_NV_EXTNAME). There is a limit on the maximum length of an inline name - MDB_NV_NAMELEN (31). The libumem dmod always tries to use inline names.

mdb_var_alloc() gets the name, and if inline names are supposed to be used, it strncpy's it in. If the name was longer than possible, it is silently truncated. Then on mdb shutdown, the code tries to unregister all the walkers but it ends up looping forever because the name of the walker that it expects does not match the name of the mdb_var_t that's keeping track of it.

As a sidenote, mdb_var_alloc always allocates sizeof(mdb_var_t) + 31 if inline names are to be used - regardless of the length of the name.


Updated by Josef Sipek about 7 years ago

Here are the possible solutions...

  1. update every user of mdb_nv_insert to check the length and strdup as necessary - this is error prone and causes unnecessary code duplication
  2. change mdb_var_alloc to strdup long names when MDB_NV_EXTNAME wasn't specified, and automatically enables MDB_NV_EXTNAME
  3. change mdb_var_alloc to allow any length name inline

Of these solutions, the third one seems to be the cleanest.

The second and third are really the same, but the second splits the allocation into two and then stashes the pointer in mdb_var_t. Additionally, the second approach will also conflict with MDB_NV_OVERLOAD since the external name pointer is shared by MDB_NV_OVERLOAD and MDB_NV_EXTNAME. (The two could be decoupled and that's why I'm including it as a valid approach.)

 * A variable may have multiple definitions (v_ndef chain), but this feature
 * is mutually exclusive with MDB_NV_EXTNAME in order to save space.

Updated by Josef Sipek about 7 years ago

  • Assignee set to Josef Sipek

Updated by Rich Lowe about 7 years ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100
  • Tags deleted (needs-triage)

Resolved in 065c692

Also available in: Atom PDF