Bug #11584
Closed
::xcall would be useful
100%
Description
http://smartos.org/bugview/OS-7124
Many times recently, we've had a crash dump where all the CPUs get stuck. Often this is most easily visible as xcall state, where a master CPU is stuck in xc_serv() waiting for one or more slave CPUs to respond.
It's somewhat painful to observe xcall state, so to make state clearer, an ::xcall dcmd would be useful. This will collate all active xc_msg structs under the relevant master CPU, making it clear for each master what other CPUs it is waiting for and why.
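As a rough illustration of the collation described above, the sketch below groups pending messages by master CPU and records which slaves each master is still waiting on. All names here (`collate_msg_t`, `collate_by_master`, `COLLATE_NCPU`) are hypothetical stand-ins, not the real xc_msg layout or the dcmd's actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	COLLATE_NCPU	8	/* hypothetical CPU count for the sketch */

/* Hypothetical stand-in for an xc_msg: sender, responder, and state. */
typedef struct collate_msg {
	int cm_master;	/* CPU that initiated the cross call */
	int cm_slave;	/* CPU expected to respond */
	int cm_active;	/* non-zero while a response is outstanding */
} collate_msg_t;

/*
 * For every master CPU, build a bitmask of the slave CPUs it is still
 * waiting on. Inactive or out-of-range messages are skipped.
 */
static void
collate_by_master(const collate_msg_t *msgs, size_t nmsgs,
    uint32_t waiting[COLLATE_NCPU])
{
	size_t i;

	assert(msgs != NULL || nmsgs == 0);

	for (i = 0; i < COLLATE_NCPU; i++)
		waiting[i] = 0;

	for (i = 0; i < nmsgs; i++) {
		const collate_msg_t *m = &msgs[i];

		if (!m->cm_active)
			continue;
		if (m->cm_master < 0 || m->cm_master >= COLLATE_NCPU ||
		    m->cm_slave < 0 || m->cm_slave >= COLLATE_NCPU)
			continue;
		waiting[m->cm_master] |= (uint32_t)1 << m->cm_slave;
	}
}
```

A dcmd built along these lines could then print one line per master with a non-zero mask, listing the slaves it is blocked on.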
Currently, a CPU that's processing a message takes it off the ->xc_msgbox queue, and it's hence not (easily) visible in the dump state. To make this clearer, when we start to process a message, we'll place it in an ->xc_curmsg holding cell, so it's easy for ::xcall to find it.
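The holding-cell idea can be sketched as follows. This is a simplified, single-threaded model with hypothetical names (`hold_cpu_t`, `hold_begin`, `hold_end`, `hold_visible`); the real code manipulates per-CPU machcpu state with atomics, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message: just a queue link and the sending CPU. */
typedef struct hold_msg {
	struct hold_msg *hm_next;
	int hm_master;
} hold_msg_t;

/* Hypothetical per-CPU state: pending queue plus the holding cell. */
typedef struct hold_cpu {
	hold_msg_t *hc_msgbox;	/* messages not yet picked up */
	hold_msg_t *hc_curmsg;	/* message being processed, if any */
} hold_cpu_t;

/* Dequeue the next message and park it in the holding cell. */
static hold_msg_t *
hold_begin(hold_cpu_t *cpu)
{
	hold_msg_t *msg;

	assert(cpu != NULL);
	msg = cpu->hc_msgbox;
	if (msg != NULL) {
		cpu->hc_msgbox = msg->hm_next;
		msg->hm_next = NULL;
	}
	cpu->hc_curmsg = msg;
	return (msg);
}

/* Processing finished: empty the holding cell. */
static void
hold_end(hold_cpu_t *cpu)
{
	assert(cpu != NULL);
	cpu->hc_curmsg = NULL;
}

/* Count the messages a debugger could find for this CPU. */
static size_t
hold_visible(const hold_cpu_t *cpu)
{
	const hold_msg_t *m;
	size_t n = (cpu->hc_curmsg != NULL) ? 1 : 0;

	for (m = cpu->hc_msgbox; m != NULL; m = m->hm_next)
		n++;
	return (n);
}
```

In this model, a message stays countable between `hold_begin()` and `hold_end()`, which is the window in which it would otherwise be missing from the queue.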
We also need a couple of mdb_ctf.c fixes: first, re-introduce the logic that allowed a consumer to optionally ignore missing members in mdb_ctf_vread() - so we fall back if a dump is missing ->xc_curmsg. Also, don't use UM_GC in that same routine: in a loop, this can easily exhaust KMDB's available space.
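The "ignore missing members" behaviour can be modelled with a toy reader like the one below. It is not mdb_ctf_vread() itself: the member tables, the `SKETCH_IGNORE_ABSENT` flag, and `sketch_vread()` are all hypothetical, and real CTF lookups are far more involved. The sketch also allocates nothing per call, so calling it in a loop does not accumulate memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	SKETCH_IGNORE_ABSENT	0x1	/* hypothetical flag */

/* Hypothetical member description: name, byte offset, byte size. */
typedef struct sketch_member {
	const char *sm_name;
	size_t sm_off;
	size_t sm_size;
} sketch_member_t;

/*
 * Copy each wanted member out of the raw target bytes in src. A member
 * the target lacks is an error unless SKETCH_IGNORE_ABSENT is set, in
 * which case it is zero-filled so the caller can fall back.
 */
static int
sketch_vread(void *dst, const sketch_member_t *want, size_t nwant,
    const uint8_t *src, const sketch_member_t *have, size_t nhave, int flags)
{
	size_t i, j;

	assert(dst != NULL && src != NULL);

	for (i = 0; i < nwant; i++) {
		const sketch_member_t *found = NULL;
		uint8_t *out = (uint8_t *)dst + want[i].sm_off;

		for (j = 0; j < nhave; j++) {
			if (strcmp(want[i].sm_name, have[j].sm_name) == 0 &&
			    want[i].sm_size == have[j].sm_size) {
				found = &have[j];
				break;
			}
		}
		if (found == NULL) {
			if (!(flags & SKETCH_IGNORE_ABSENT))
				return (-1);
			(void) memset(out, 0, want[i].sm_size);
			continue;
		}
		(void) memcpy(out, src + found->sm_off, want[i].sm_size);
	}
	return (0);
}
```

With the flag set, a consumer reading a dump from an older kernel would see a zeroed holding-cell member and could fall back to walking the queue only.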
Updated by Electric Monk about 4 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit a8ea0c9dd566453d9b69eab5f863930da9d0c4ae
commit a8ea0c9dd566453d9b69eab5f863930da9d0c4ae
Author: John Levon <john.levon@joyent.com>
Date: 2019-09-04T09:22:58.000Z

    11584 ::xcall would be useful
    Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Reviewed by: Toomas Soome <tsoome@me.com>
    Approved by: Richard Lowe <richlowe@richlowe.net>