Project

General

Profile

Bug #4229

mdb hangs on exit when long umem cache names exist

Added by Josef Sipek almost 6 years ago. Updated over 5 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
mdb - modular debugger
Start date:
2013-10-18
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:

Description

libumem allows caches to have an ASCII name - up to 31 chars. If a 31-char long name is used, mdb will hang on exit. (A shorter cache name works as expected.)

$ <download attached test file>
$ gcc -Wall -O1 -lumem -o test test.c
$ ./test 30
making cache of length 30: 123456789012345678901234567890
Segmentation Fault (core dumped)
$ echo '' | mdb core
$ ./test 31
making cache of length 31: 1234567890123456789012345678901
Segmentation Fault (core dumped)
$ echo '' | mdb core
<hang>

The hang is mdb pegging a CPU looping inside the teardown code:

nv_hashstring+0x2c(81b2e9d, 8102c68, 8046828, 8074053)
mdb_nv_lookup+0x10(8102c68, 81b2e9d, 81038b0, 81b09dd)
mdb_module_remove_walker+0x1b(8102c50)
mdb_module_unload_common+0x11e(81038b0, 8139f28, 80468c8, 809af3f, 0, 54)
mdb_module_unload+0xe(81038b0, 2, 80468d8, 2)
mdb_module_unload_all+0x20(2)
mdb_destroy+0x3a(8103e98, 8139f10, 80476b8, 8083b76, 0, 8139f10)
terminate+0x10(0, 8139f10, 80476b8, 8083ad9)
main+0x14d2(2, 80476e8, 80476f4, 80476dc)
_start+0x7d(2, 8047830, 8047834, 0, 804783e, 804784a)

Profiling stacks using DTrace shows that we're looping somewhere in:

              mdb`mdb_module_unload+0xe
              mdb`mdb_module_unload_all+0x20
              mdb`mdb_destroy+0x3a
              mdb`terminate+0x10
              mdb`main+0x14d2
              mdb`_start+0x7d

The culprit seems to be libumem:

$ sudo dtrace -n 'pid$target::mdb_module_unload:entry{ trace(copyinstr(arg0)) }' -p `pgrep mdb`
dtrace: description 'pid$target::mdb_module_unload:entry' matched 1 probe
CPU     ID                    FUNCTION:NAME
  0  77990          mdb_module_unload:entry   ld.so.1                          
  0  77990          mdb_module_unload:entry   libc.so.1                        
  0  77990          mdb_module_unload:entry   libumem.so.1 

Files

test.c (755 Bytes) test.c Josef Sipek, 2013-10-18 12:51 AM

History

#1

Updated by Josef Sipek almost 6 years ago

Quick update:

Changing the MDB_NV_NAMELEN #define from 31 to 32 makes mdb happy. In other words, the issue is with mdb_var_t's cap on internally stored variable names.

#2

Updated by Josef Sipek almost 6 years ago

  • Category set to mdb - modular debugger
#3

Updated by Josef Sipek almost 6 years ago

Ok, so summary so far...

libumem has cache names. The names can be up to 32 bytes long (including the trailing null). The libumem mdb dmod instantiates a walker for each cache and uses the cache's name for the walker's name. mdb keeps track of walker names using a mdb_nv_t & mdb_var_t. mdb_var_t can store the name either inline or externally (MDB_NV_EXTNAME). There is a limit on the maximum length of an inline name - MDB_NV_NAMELEN (31). The libumem dmod always tries to use inline names.

mdb_var_alloc() gets the name, and if inline names are supposed to be used, it strncpy's it in. If the name was longer than possible, it is silently truncated. Then on mdb shutdown, the code tries to unregister all the walkers but it ends up looping forever because the name of the walker that it expects does not match the name of the mdb_var_t that's keeping track of it.

As a sidenote, mdb_var_alloc always allocates sizeof(mdb_var_t) + 31 if inline names are to be used - regardless of the length of the name.

#4

Updated by Josef Sipek almost 6 years ago

Here are the possible solutions...

  1. update every user of mdb_nv_insert to check the length and strdup as necessary - this is error prone and causes unnecessary code duplication
  2. change mdb_var_alloc to strdup long names when MDB_NV_EXTNAME wasn't specified, and automatically enables MDB_NV_EXTNAME
  3. change mdb_var_alloc to allow any length name inline

Of these solutions, the third one seems to be the cleanest.

The second and third are really the same, but the second splits the allocation into two and then stashes the pointer in mdb_var_t. Additionally, the second approach will also conflict with MDB_NV_OVERLOAD since the external name pointer is shared by MDB_NV_OVERLOAD and MDB_NV_EXTNAME. (The two could be decoupled and that's why I'm including it as a valid approach.)

 * A variable may have multiple definitions (v_ndef chain), but this feature
 * is mutually exclusive with MDB_NV_EXTNAME in order to save space.
#5

Updated by Josef Sipek almost 6 years ago

  • Assignee set to Josef Sipek
#6

Updated by Rich Lowe over 5 years ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100
  • Tags deleted (needs-triage)

Resolved in 065c692

Also available in: Atom PDF