Bug #9058

postmortem DTrace frequently broken under vmware

Added by Robert Mustacchi almost 3 years ago. Updated about 2 years ago.

I don't know how often this happens outside of VMware, but I hit it fairly often on my SmartOS VM under VMware. I am hitting this a lot right now, so I just stopped my VM (to try to get back to a base state), started it again, then ran a test which still hits the problem. I have a 2-CPU VM and DTrace probes enabled when I panic. I am running a DEBUG build. I should have filed this long ago, but since I am hitting it a lot right now, I'm filing it for when someone (maybe me) has a chance to investigate.

> ::dtrace_state
            ADDR MINOR             PROC NAME                         FILE
ffffff035a577180     2 ffffff036bc24028 dtrace           ffffff03653941e0
> ffffff035a577180::dtrace
bash 103786 1 <- lx_io_cleanup
aio 103786 1 -> lx_io_destroy(1))
aio 103786 1 -> lx_io_destroy_common(1))
Assertion failed: timestamp >= dtp->dt_last_timestamp, file ../common/dt_consume.c, line 3014

*** mdb: received signal ABRT at:
    [7] mdb`dcmd_invoke+0x7c()
    [8] mdb`mdb_call_idcmd+0x112()
    [9] mdb`mdb_call+0x3e1()
    [10] mdb`yyparse+0xdb4()
    [11] mdb`mdb_run+0x2cd()
    [12] mdb`main+0xc9d()
    [13] mdb`_start+0x6c()

mdb: (c)ore dump, (q)uit, (r)ecover, or (s)top for debugger [cqrs]? 

mdb is basically dead at this point and I need to exit and start again.


If I recall correctly from when I hit something like this quite a long time ago, this is a result of our having temporal sorting of DTrace probe firings by default. That is, each entry in the buffer has a timestamp (from hrtime? the TSC?) along with the collected data. We assert that those timestamps only go up, but on VMware either they don't always increase monotonically, or the skew between the VCPUs can be large enough that the delta exceeds the buffer switch rate.


Opening the VM's core in mdb for postmortem dtracing results in mdb crashing at the same assert in dt_consume.c:

> ::dtrace_state
            ADDR MINOR             PROC NAME                         FILE
ffffff1d5aded0c0     2 ffffff1d5af28018 dtrace           ffffff1d5d526c90
> ffffff1d5aded0c0::dtrace
vmx_run:return 19 495109155683
vmx_run:return 19 495109157080
Assertion failed: timestamp >= dtp->dt_last_timestamp, file ../common/dt_consume.c, line 3016

*** mdb: received signal ABRT at:
    [7] mdb`dcmd_invoke+0x7c()
    [8] mdb`mdb_call_idcmd+0x112()
    [9] mdb`mdb_call+0x3e1()
    [10] mdb`yyparse+0xdb4()
    [11] mdb`mdb_run+0x2cd()
    [12] mdb`main+0xc9d()
    [13] mdb`_start_crt+0x83()
    [14] mdb`_start+0x18()

I chose to take a core of this mdb session and then went home for the day.

Coming back to the problem later, I opened the original core again in mdb and the behavior had changed: the bug was no longer reproducible. Try as I might, the core refused to kill mdb. At some point my machine panicked, so I had a freshly restarted VM. Upon reboot I opened the core in mdb again and opened it for dtracing. mdb crashed. I tried again: crash. After some time the core no longer killed mdb and worked fine. I restarted my machine to test my theory, and it checked out: the core could be traced just fine once the VM had been up for about 10 minutes.

Opening dt_consume.c in one tab and my mdb core in another I went to work:

Here is our assert and some context.

dtrace_consume(dtrace_hdl_t *dtp, FILE *fp,
    dtrace_consume_probe_f *pf, dtrace_consume_rec_f *rf, void *arg)
...
                        dtrace_bufdesc_t *buf = dt_pq_pop(dtp->dt_bufq);
                        uint64_t timestamp;
...
                        timestamp = dt_buf_oldest(buf, dtp);
                        assert(timestamp >= dtp->dt_last_timestamp);
                        dtp->dt_last_timestamp = timestamp;

[root@00-0c-29-58-5d-8d /zones/home]# mdb /zones/global/cores/core.mdb.3844
> ::stack
`_lwp_kill+0xa()
`raise+0x20(6)
`abort+0x60()
0xfffffd7fef20107a()
`dtrace_consume+0x3d7(c5e590, fffffd7fef305f40, fffffd7fee0181f0, fffffd7fee018190, fffffd7fffdfe8d0)
`dtrace+0x2c4(ffffff1d5aded0c0, 1, 0, 0)
dcmd_invoke+0x7c(c49ba0, ffffff1d5aded0c0, 1, 0, 0, 0)
mdb_call_idcmd+0x112(c49ba0, ffffff1d5aded0c0, 1, 1, 518d48, 518d60)
mdb_call+0x3e1(ffffff1d5aded0c0, 1, 1)
main+0xc9d(2, fffffd7fffdffb98, fffffd7fffdffbb0)

We're interested in how timestamp is set. dt_buf_oldest() takes a buf and returns a timestamp. That buf in turn is popped off a priority queue pointed to by dtp->dt_bufq. dtp is the first argument to dtrace_consume() and its address is 0xc5e590, as shown in our stack trace.

While we're in dtrace_hdl_t we can also grab an important piece of our assert, dt_last_timestamp.

> 0xc5e590::print dtrace_hdl_t
dt_bufq = 0x11610d0
dt_last_timestamp = 0x7346ce30d8

dtp->dt_bufq contains the DTrace probe information used in postmortem dtracing; it is a priority queue (dt_pq_t) of dtrace_bufdesc_t buffers:

> 0x11610d0::print dt_pq_t
    dtpq_hdl = 0xc5e590
    dtpq_items = 0xc5d3d0
    dtpq_size = 0x200
    dtpq_last = 0x17
    dtpq_value = `dt_buf_oldest
    dtpq_arg = 0

dtpq_items points at the array of dtrace_bufdesc_t structures we're after. We can print them all out at once:

> 0xc5d3d0,0x17/Kn | ::print dtrace_bufdesc_t
    dtbd_size = 0xb800
    dtbd_cpu = 0x12
    dtbd_errors = 0
    dtbd_drops = 0
    dtbd_data = 0x1259010
    dtbd_oldest = 0
    dtbd_timestamp = 0x128dd61f5f

Let's grab the largest dtbd_timestamp:

> 0xc5d3d0,0x17/Kn | ::print dtrace_bufdesc_t ! grep dtbd_timestamp | awk '{ print $3 }' | sort -n

If dt_buf_oldest() is out of records, it returns the time the buffer was retrieved, dtbd_timestamp. So if we hit the end of a buffer and it returns dtbd_timestamp = 0x128f7799e9, we will blow our assert and crash mdb: earlier we grabbed dt_last_timestamp = 0x7346ce30d8, an hrtime_t measuring the number of nanoseconds since boot, and converting, 0x7346ce30d8 ns is 8 minutes and 15.11 seconds. Since dtbd_timestamp is taken on the live system consuming the core rather than read from the dump, this is the uptime the machine needs to pass before mdb stops crashing, explaining the mysterious ~10 minute behavior Patrick saw.

In the end the problem with the logic was fairly simple: we assert that timestamp is greater than or equal to dt_last_timestamp before checking whether we've hit the end of the records in a buffer. Moving the assert after that check is all that was needed to solve the issue. The fix itself was small, but mdb was invaluable in root-causing it.


Updated by Electric Monk about 2 years ago

  • Status changed from New to Closed

git commit 793bd7e3617ae7d3d24e8c6b7d6befe35f07ec1f

commit  793bd7e3617ae7d3d24e8c6b7d6befe35f07ec1f
Author: Sam Gwydir <>
Date:   2019-01-11T17:09:26.000Z

    9058 postmortem DTrace frequently broken under vmware
    Reviewed by: Tim Kordas <>
    Reviewed by: Patrick Mooney <>
    Reviewed by: Yuri Pankov <>
    Reviewed by: Matthew Ahrens <>
    Approved by: Matthew Ahrens <>
