Bug #3418

DTrace may leave spurious breakpoints in user processes

Added by Rich Lowe almost 8 years ago.

Status: New
Priority: Normal
Assignee: -
Category: DTrace
Start date: 2012-12-13
Due date:
% Done: 0%
Estimated time:
Difficulty: Hard
Tags:
Gerrit CR:

Description

A SmartOS user debugging a separate problem on his system discovered that his devfsadmd had dumped core at some point in the past, and was suspicious that the two might be related.

Unfortunately, it turns out that the crash is actually because of his investigation!

His core shows that we crashed in a very strange way, and in a very strange place:

debugging core file of devfsadm (32-bit) from 00-30-48-f3-3c-12
...
status: process terminated by SIGTRAP (Trace/Breakpoint Trap), addr=fefd2ed5

fe99e498 ld.so.1`rtld_db_dlactivity(feffc880, 3, 1, fefd2011)
fe99e4c8 ld.so.1`lm_move+0x2d(feffc880, 20, 10, feffdcb8, feffdca8, fe99e50c)
fe99e528 ld.so.1`relocate_lmc+0x10c(feffc880, 20, fef804f0, fe2008d0, fe99e5dc, 1)
fe99e598 ld.so.1`dlmopen_core+0x2c4(fe99e6cf, c01, fef804f0, 0, 0, fe99e5dc)
fe99e5f8 ld.so.1`dlmopen_intn+0x127(feffc880, fe99e6cf, 1, fef804f0, 0, 0)
fe99e628 ld.so.1`dlmopen_check+0x10a(1, fef804f0, 6374652f, 7665642f, 3d7, fe99e6f8)
fe99e668 ld.so.1`dlopen+0x46(fe99e6cf, 1, fee9236d, fef71000, fec41a40, 807aa18)

Notice how we crashed at the 0 offset in rtld_db_dlactivity. This should immediately raise your suspicions that a debugger of some nature left a turd, and indeed:

> ld.so.1`rtld_db_dlactivity/i
ld.so.1`rtld_db_dlactivity:
ld.so.1`rtld_db_dlactivity:     int    $0x3
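
For readers unfamiliar with the mechanism: on x86, int $0x3 is the one-byte opcode 0xCC, and a debugger plants a breakpoint by overwriting the first byte of an instruction with it, saving the original byte so that it can be put back on release. The C sketch below models that on a plain byte buffer. It is only an illustration, not DTrace's or libproc's code, and the names bkpt_t, bkpt_plant, bkpt_remove and bkpt_present are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BPT_INSN 0xCC   /* x86 one-byte encoding of int $0x3 */

/* Hypothetical bookkeeping for one planted breakpoint. */
typedef struct bkpt {
    size_t  off;     /* offset of the patched byte in the text image */
    uint8_t saved;   /* original byte that the breakpoint replaced */
    int     active;  /* nonzero while the breakpoint is planted */
} bkpt_t;

/* Plant a breakpoint: remember the original byte, then overwrite it. */
static void
bkpt_plant(uint8_t *text, size_t off, bkpt_t *bp)
{
    assert(!bp->active);
    bp->off = off;
    bp->saved = text[off];
    bp->active = 1;
    text[off] = BPT_INSN;
}

/* Remove a breakpoint: put the original byte back (no-op if not planted). */
static void
bkpt_remove(uint8_t *text, bkpt_t *bp)
{
    if (!bp->active)
        return;
    text[bp->off] = bp->saved;
    bp->active = 0;
}

/* Return nonzero if the byte at off is a breakpoint instruction. */
static int
bkpt_present(const uint8_t *text, size_t off)
{
    return (text[off] == BPT_INSN);
}
```

If a tracer goes away without the bkpt_remove step, the 0xCC stays in the target's text, and the next call into that function raises SIGTRAP with nobody attached to field it, which is consistent with the core above.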

According to the user, he had DTrace'd devfsadmd (not with the pid provider, though perhaps with -p) and truss'd it (though not with -u).

It also turns out that there has been a prior report from fishworks of precisely this problem, which is why I'm pointing the finger at DTrace much more strongly than at truss.

Unfortunately, I have so far been unable to reproduce this in isolation, so I have no idea yet how we are escaping with the breakpoints intact.

I have, however, confirmed that we disabled none of the breakpoints we use to synchronize with the linker:

> ld.so.1`rtld_db_dlactivity/i
ld.so.1`rtld_db_dlactivity:
ld.so.1`rtld_db_dlactivity:     int    $0x3
> ld.so.1`rtld_db_preinit/i
ld.so.1`rtld_db_preinit:
ld.so.1`rtld_db_preinit:        int    $0x3
> ld.so.1`rtld_db_postinit/i
ld.so.1`rtld_db_postinit:
ld.so.1`rtld_db_postinit:       int    $0x3

Bryan, Adam, does this ring bells at all?
