Bug #7780

mdb could extract NT_PRPSINFO information from core files

Added by David Pacheco almost 3 years ago. Updated 7 months ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
-
Start date:
2017-01-18
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
needs-triage

Description

There's a bunch of useful information in the NT_PRPSINFO ELF note that we include in core files. This includes the uid, gid, pid, ppid, and start time of the process.

There are also per-thread notes that may have useful information.

History

#1

Updated by David Pacheco 8 months ago

It was suggested I be a bit more specific. Here's what I'd suggest:

> ::psinfo
PID:    75594  (process id)              UID:     1000  (real user id)
PPID:   73149  (parent process id)       EUID:    1000  (effective user id)
PGID:   73034  (process group id)        GID:        1  (real group id)
SID:    72056  (session id)              EGID:       1  (effective group id)

START 2019-02-11T22:13:58.967544779Z (wall timestamp when the process started)
TIME           0.316830101 seconds   (CPU time used by this process)
CTIME          0.000000000 seconds   (CPU time used by child processes)
FNAME  "node"                        (name of the program executed)
PSARGS "node tst.postmortem_jsstack.js -S" 

I left 6 digits for pids, uids, and gids; and 9 digits' worth of seconds in the various time columns (allowing for processes running just over 30 years without wrapping).
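
A quick check of that column-width arithmetic, as a sketch: eight integer digits of seconds cover a little over 3 years, while nine digits cover a little under 32 years, so the 30-year figure corresponds to a nine-digit seconds field. The helper name `max_years_for_digits` is hypothetical, and a year is assumed to be 365.25 days.

```c
/*
 * Hypothetical helper: the largest whole-second count that fits in
 * `digits` decimal digits, expressed in years.
 * Assumption: one year is 365.25 days.
 */
double
max_years_for_digits(int digits)
{
	double max_seconds = 1.0;

	/* Compute 10^digits, then subtract one to get 99...9. */
	for (int i = 0; i < digits; i++)
		max_seconds *= 10.0;
	max_seconds -= 1.0;

	return (max_seconds / (365.25 * 24.0 * 60.0 * 60.0));
}
```

Under those assumptions, `max_years_for_digits(8)` is roughly 3.17 and `max_years_for_digits(9)` is roughly 31.7.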

That corresponds to this data (printed from a real core file using elfdump):

Note Section:  .note(phdr)

  entry [0] 
    namesz: 0x5 
    descsz: 0x104
    type:   [ NT_PRPSINFO ]
    name:
        CORE\0
    desc: (prpsinfo_t)
        pr_state:   4                    pr_sname:    T   
        pr_zomb:    0                    pr_nice:     20  
        pr_flag:    0   
        pr_uid:     1000                 pr_gid:      1   
        pr_pid:     75594                pr_ppid:     73149
        pr_pgrp:    73034                pr_sid:      72056
        pr_addr:    0x26bbf010           pr_size:     0x73e4
        pr_rssize:  0x4e6c               pr_wchan:    0   
        pr_start:
            tv_sec: 1549923238           tv_nsec:     967544779
        pr_time:
            tv_sec: 0                    tv_nsec:     316830101
        pr_pri:     43                   pr_oldpri:   56  
        pr_cpu:     0   
        pr_ottydev: 1537                 pr_lttydev:  1572865
        pr_clname:  FSS 
        pr_fname:   node
        pr_psargs:  node tst.postmortem_jsstack.js -S
        pr_syscall: [ getloadavg ]
        pr_ctime:
            tv_sec: 0                    tv_nsec:     0   
        pr_bysize:  0x73e4000            pr_byrssize: 0x4e6c000
        pr_argc:    3                    pr_argv:     0x08047974
        pr_envp:    0x08047984           pr_wstat:    0   
        pr_pctcpu:  0.3%                 pr_pctmem:   0.1%
        pr_euid:    1000                 pr_egid:     1   
        pr_aslwpid: 0
        pr_dmodel:  [ PR_MODEL_ILP32 ]
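
The START and TIME values in the proposed output follow directly from the tv_sec/tv_nsec pairs shown by elfdump. Here is a minimal sketch of that conversion using only standard C library calls. The names `format_start` and `format_cputime` are hypothetical and are not mdb functions; a real dcmd would use mdb's own printing routines.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

#define	NANOSEC	1000000000L

/*
 * Hypothetical helper: format a seconds/nanoseconds pair measured from
 * the epoch as an ISO 8601 UTC timestamp with nanosecond precision.
 * Returns the number of characters written, or -1 on failure.
 */
int
format_start(long long sec, long nsec, char *buf, size_t len)
{
	time_t t = (time_t)sec;
	struct tm *tm;
	char date[32];

	assert(nsec >= 0 && nsec < NANOSEC);
	if ((tm = gmtime(&t)) == NULL)
		return (-1);
	if (strftime(date, sizeof (date), "%Y-%m-%dT%H:%M:%S", tm) == 0)
		return (-1);
	return (snprintf(buf, len, "%s.%09ldZ", date, nsec));
}

/*
 * Hypothetical helper: format a CPU time as fractional seconds.
 * The nanoseconds are zero-padded to nine digits so that small
 * values keep their magnitude.
 */
int
format_cputime(long long sec, long nsec, char *buf, size_t len)
{
	assert(nsec >= 0 && nsec < NANOSEC);
	return (snprintf(buf, len, "%lld.%09ld", sec, nsec));
}
```

With the pr_start values above (1549923238 and 967544779), `format_start` yields `2019-02-11T22:13:58.967544779Z`. The `%09ld` padding matters for the CPU time: a tv_nsec of 9875073 has to print as `0.009875073`, not as `0.9875073`.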

It might also be nice to have a "raw" or "verbose" view that shows the whole structure similar to the way elfdump does.

I'm open to other ideas. It would be nice if it were more parseable, but I think that would require one item per line, which would waste a lot of space for the interactive case.

Other ideas

The fields I've most often wanted are the pid, parent pid, and start time. However, I can also see these fields being pretty useful (comments from sys/procfs.h):

    pr_pid;         /* unique process id */
    pr_ppid;        /* process id of parent */
    pr_pgid;        /* pid of process group leader */
    pr_sid;         /* session id */
    pr_uid;         /* real user id */
    pr_euid;        /* effective user id */
    pr_gid;         /* real group id */
    pr_egid;        /* effective group id */
    pr_start;   /* process start time, from the epoch */
    pr_time;    /* usr+sys cpu time for this process */
    pr_ctime;   /* usr+sys cpu time for reaped children */
    pr_fname[PRFNSZ];       /* name of execed file */
    pr_psargs[PRARGSZ];     /* initial characters of arg list */
    pr_dmodel;      /* data model of the process */
    pr_taskid;     /* task id */
    pr_projid;     /* project id */
    pr_poolid;     /* pool id */
    pr_zoneid;     /* zone id */
    pr_contract;    /* process contract */

This is another format I considered:

::psinfo
   PID   PPID   PGID    SID    UID   EUID    GID   EGID 
 75594  73149  73034  72056   1000   1000      1      1   

START TIME                        CPU TIME (process)  CPU TIME (children)
2019-02-11T22:13:58.967544779Z          0.316830101s         0.000000000s

filename:     "node" 
initial args: "node tst.postmortem_jsstack.js -S" 

But it's a lot less explicit without being much more concise.

#2

Updated by Carlos Neira 8 months ago

Hi David,

This is what I have at the moment. I have added a summary section that contains siginfo information when it's present.
I'll work on adding the zone-related information as well. Let me know how this looks.

/build/illumos-omnios/proto/root_i386-nd/usr/bin/amd64/mdb /home/cneira/core      
Loading modules: [ libc.so.1 ld.so.1 ]

> ::psinfo raw
[ NT_PSINFO ]
        pr_state:   6                   pr_sname:   O
        pr_zomb:    0                   pr_nice:    20
        pr_uid:     100                 pr_gid:     1
        pr_pid:     383885              pr_ppid:    340440
        pr_pgid:    1                   pr_sid:     340440
        pr_addr:    0x00000000          pr_size:    0x6d0
        pr_rssize:  0x38c               pr_wchan:   0
        pr_start:
            tv_sec: 1549649231          tv_nsec:    683629550
        pr_time:
            tv_sec: 0                   tv_nsec:    1054142
        pr_pri:     59                  pr_oldpri:  40
        pr_cpu:     0
        pr_clname:  TS
        pr_fname:   test
        pr_psargs:  ./test
        pr_syscall: [ SYS#0 ]
        pr_ctime:
            tv_sec: 0                   tv_nsec:    0
        pr_argc:    1                   pr_argv:    0x08039654
        pr_envp:    0x0803965c          pr_wstat:   139
        pr_pctcpu:  0.0%                pr_pctmem:  0.0%
        pr_euid:    100                 pr_egid:    1
        pr_dmodel:  [PR_MODEL_ILP32]
> ::psinfo sum
PID:    383885  (process id)            UID:      100  (real user id)
PPID:   340440  (parent process id)     EUID:     100  (effective user id)
PGID:   383885  (process group id)      GID:        1  (real group id)
SID:    340440  (session id)            EGID:       1  (effective group id)

START: 2019 Feb  8 15:07:11 (wall timestamp when the process started)
TIME:           0 seconds   (CPU time used by this process)
CTIME:          0 seconds   (CPU time used by child processes)
FNAME:  "test"      (name of the program executed)
PSARGS: "./test" 

Summary:
        Received signal SEGV_MAPERR at address=0x8051000
        Last syscall: [ SYS#0 ]

cneira@Trixie:...illumos-omnios/usr/src/cmd/mdb$   /build/illumos-omnios/proto/root_i386-nd/usr/bin/amd64/mdb /home/cneira/core.26853
Loading modules: [ libc.so.1 ld.so.1 ]
> ::psinfo raw
[ NT_PSINFO ]
        pr_state:   4                   pr_sname:   T
        pr_zomb:    0                   pr_nice:    20
        pr_uid:     100                 pr_gid:     1
        pr_pid:     26853               pr_ppid:    891
        pr_pgid:    1                   pr_sid:     891
        pr_addr:    0xfffffe12cc0b0070  pr_size:    0x1b88
        pr_rssize:  0xf64               pr_wchan:   0
        pr_start:
            tv_sec: 1548876226          tv_nsec:    683326865
        pr_time:
            tv_sec: 0                   tv_nsec:    9875073
        pr_pri:     59                  pr_oldpri:  40
        pr_cpu:     0
        pr_clname:  TS
        pr_fname:   vim
        pr_psargs:  vim t.txt
        pr_syscall: [ pollsys ]
        pr_ctime:
            tv_sec: 0                   tv_nsec:    0
        pr_argc:    2                   pr_argv:    0xfffffc7fffdf5c48
        pr_envp:    0xfffffc7fffdf5c60  pr_wstat:   0
        pr_pctcpu:  0.0%                pr_pctmem:  0.0%
        pr_euid:    100                 pr_egid:    1
        pr_dmodel:  [PR_MODEL_LP64]
> ::psinfo sum
PID:     26853  (process id)            UID:      100  (real user id)
PPID:      891  (parent process id)     EUID:     100  (effective user id)
PGID:    26853  (process group id)      GID:        1  (real group id)
SID:       891  (session id)            EGID:       1  (effective group id)

START: 2019 Jan 30 16:23:46 (wall timestamp when the process started)
TIME:           0 seconds   (CPU time used by this process)
CTIME:          0 seconds   (CPU time used by child processes)
FNAME:  "vim"      (name of the program executed)
PSARGS: "vim t.txt" 

Summary:
        Last syscall: [ pollsys ]
>
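
The pr_wstat value of 139 in the first core is consistent with the SEGV summary shown for it. Assuming the traditional wait-status encoding (low seven bits hold the terminating signal, bit 0x80 is the core-dump flag), 139 decodes to signal 11 with the core flag set. Here is a minimal sketch of that decoding. `decode_wstat` and `struct wstat_info` are hypothetical names, and the bit layout is an assumption; portable code would use the macros from `<sys/wait.h>` instead.

```c
#include <stdbool.h>

/* Hypothetical result type for a decoded wait status. */
struct wstat_info {
	bool	signaled;	/* process was terminated by a signal */
	bool	core;		/* a core file was produced */
	int	signo;		/* terminating signal, if signaled */
	int	exit_code;	/* exit code, if the process exited normally */
};

/*
 * Hypothetical helper. Assumption: traditional encoding, where the low
 * seven bits are the signal, 0x80 is the core flag, and the next byte
 * is the exit code. A low byte of 0x7f (stopped) is not handled here.
 */
struct wstat_info
decode_wstat(int wstat)
{
	struct wstat_info info = { false, false, 0, 0 };
	int sig = wstat & 0x7f;

	if (sig == 0) {
		/* Normal exit: the exit code is in the second byte. */
		info.exit_code = (wstat >> 8) & 0xff;
	} else if (sig != 0x7f) {
		info.signaled = true;
		info.signo = sig;
		info.core = (wstat & 0x80) != 0;
	}
	return (info);
}
```

For example, `decode_wstat(139)` reports `signaled` and `core` as true with `signo` 11, while `decode_wstat(0)` reports a normal exit with code 0.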

#3

Updated by Carlos Neira 8 months ago

Latest changes

> ::psinfo sum
PID:     26853  (process id)            UID:      100  (real user id)
PPID:      891  (parent process id)     EUID:     100  (effective user id)
PGID:    26853  (process group id)      GID:        1  (real group id)
SID:       891  (session id)            EGID:       1  (effective group id)
ZONEID:      0                          CONTRACT:  78
PROJECT:     3                          TASK:      69

START: 2019 Jan 30 16:23:46 (wall timestamp when the process started)
TIME:          0.000000009 seconds   (CPU time used by this process)
CTIME:         0.000000000 seconds   (CPU time used by child processes)
FNAME:  "vim"                        (name of the program executed)
PSARGS: "vim t.txt" 
> ::psinfo raw
[ NT_PSINFO ]
        pr_state:   4                   pr_sname:   T
        pr_zomb:    0                   pr_nice:    20
        pr_uid:     100                 pr_gid:     1
        pr_pid:     26853               pr_ppid:    891
        pr_pgid:    1                   pr_sid:     891
        pr_addr:    0xfffffe12cc0b0070  pr_size:    0x1b88
        pr_rssize:  0xf64               pr_wchan:   0
        pr_start:
            tv_sec: 1548876226          tv_nsec:    683326865
        pr_time:
            tv_sec: 0                   tv_nsec:    9875073
        pr_pri:     59                  pr_oldpri:  40
        pr_cpu:     0
        pr_clname:  TS
        pr_fname:   vim
        pr_psargs:  vim t.txt
        pr_syscall: [ pollsys ]
        pr_ctime:
            tv_sec: 0                   tv_nsec:    0
        pr_argc:    2                   pr_argv:    0xfffffc7fffdf5c48
        pr_envp:    0xfffffc7fffdf5c60  pr_wstat:   0
        pr_pctcpu:  0.0%                pr_pctmem:  0.0%
        pr_euid:    100                 pr_egid:    1
        pr_dmodel:  [PR_MODEL_LP64]
>

cneira@Trixie:...illumos-omnios/usr/src/cmd/mdb$   /build/illumos-omnios/proto/root_i386-nd/usr/bin/amd64/mdb /home/cneira/core
Loading modules: [ libc.so.1 ld.so.1 ]
> ::psinfo sum
PID:    383885  (process id)            UID:      100  (real user id)
PPID:   340440  (parent process id)     EUID:     100  (effective user id)
PGID:   383885  (process group id)      GID:        1  (real group id)
SID:    340440  (session id)            EGID:       1  (effective group id)
ZONEID:      0                          CONTRACT:  94
PROJECT:     3                          TASK:      87

START: 2019 Feb  8 15:07:11 (wall timestamp when the process started)
TIME:          0.000000001 seconds   (CPU time used by this process)
CTIME:         0.000000000 seconds   (CPU time used by child processes)
FNAME:  "test"                       (name of the program executed)
PSARGS: "./test" 
        Received signal SEGV_MAPERR at address=0x8051000
> ::psinfo raw
[ NT_PSINFO ]
        pr_state:   6                   pr_sname:   O
        pr_zomb:    0                   pr_nice:    20
        pr_uid:     100                 pr_gid:     1
        pr_pid:     383885              pr_ppid:    340440
        pr_pgid:    1                   pr_sid:     340440
        pr_addr:    0x00000000          pr_size:    0x6d0
        pr_rssize:  0x38c               pr_wchan:   0
        pr_start:
            tv_sec: 1549649231          tv_nsec:    683629550
        pr_time:
            tv_sec: 0                   tv_nsec:    1054142
        pr_pri:     59                  pr_oldpri:  40
        pr_cpu:     0
        pr_clname:  TS
        pr_fname:   test
        pr_psargs:  ./test
        pr_syscall: [ SYS#0 ]
        pr_ctime:
            tv_sec: 0                   tv_nsec:    0
        pr_argc:    1                   pr_argv:    0x08039654
        pr_envp:    0x0803965c          pr_wstat:   139
        pr_pctcpu:  0.0%                pr_pctmem:  0.0%
        pr_euid:    100                 pr_egid:    1
        pr_dmodel:  [PR_MODEL_ILP32]
>

#4

Updated by Electric Monk 7 months ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit 542a7b7f5ccc44e3c95d6dce4ec0566f60bd9ff4

commit  542a7b7f5ccc44e3c95d6dce4ec0566f60bd9ff4
Author: Carlos Neira <cneirabustos@gmail.com>
Date:   2019-03-14T21:16:28.000Z

    7780 mdb could extract NT_PRPSINFO information from core files
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Reviewed by: Gergo Doma <domag02@gmail.com>
    Approved by: Richard Lowe <richlowe@richlowe.net>
