Bug #9055

panic in prgetattr

Added by Robert Mustacchi over 1 year ago. Updated about 1 year ago.

Status: Closed
Priority: Normal
Assignee:
Category: kernel
Start date: 2018-02-06
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Description

> ::status
debugging crash dump /var/tmp/SCI-315.d/vmcore.0 (64-bit) from HC5T14TD2
operating system: 5.11 joyent_20171005T144847Z (i86pc)

> ::stack
prgetattr+0x45e(ffffd07d5a6f6580, ffffd001f0ccdd40, 0, ffffd1ca570e4718, 0)
fop_getattr+0xa8(ffffd07d5a6f6580, ffffd001f0ccdd40, 0, ffffd1ca570e4718, 0)
lx_stat_common+0x61(ffffd07d5a6f6580, ffffd1ca570e4718, 7fffffeff020, 2, 0)
lx_lstat64+0x76(18c89, 7fffffeff020)
lx_syscall_enter+0x16f()
sys_syscall+0x142()

Disassembling prgetattr around this location, we have:

prgetattr+0x455:                movq   0x30(%r12),%rax
prgetattr+0x45a:                movq   0x28(%rax),%rax
prgetattr+0x45e:                movq   0x188(%rax),%rax
prgetattr+0x465:                cmpq   $0x0,0x3a0(%rax)
prgetattr+0x46d:                je     -0x2d9   <prgetattr+0x19a>

Looking at the registers, we see that %rax is 0, so the load at prgetattr+0x45e dereferences a NULL pointer. Here are the incoming vnode and prnode.

> ffffd07d5a6f6580::print vnode_t v_data
v_data = 0xffffd06688a050a8
> ffffd06688a050a8::print prnode_t
{
    pr_next = 0xffffd07dca497140
    pr_flags = 0
    pr_mutex = {
        _opaque = [ 0 ]
    }
    pr_type = 0t36 (PR_SPYMASTER)
    pr_mode = 0x100
    pr_ino = 0
    pr_hatid = 0
    pr_common = 0xffffd068cf8de1a8
    pr_pcommon = 0xffffd06a760a64f8
    pr_parent = 0xffffd0666d021200
    pr_files = 0
    pr_index = 0x6
    pr_pidfile = 0
    pr_realvp = 0
    pr_owner = 0
    pr_vnode = 0xffffd07d5a6f6580
    pr_contract = 0
    pr_cttype = 0
}

This is the code block for the spymaster handling:

        case PR_SPYMASTER:
                if (pnp->pr_common->prc_thread->t_lwp->lwp_spymaster != NULL) {
                        vap->va_size = PR_OBJSIZE(psinfo32_t, psinfo_t);
                } else {
                        vap->va_size = 0;
                }

Here is pr_common

> ffffd068cf8de1a8::print prcommon_t
{
    prc_mutex = {
        _opaque = [ 0 ]
    }
    prc_wait = {
        _opaque = 0
    }
    prc_flags = 0x3
    prc_writers = 0
    prc_selfopens = 0
    prc_pid = 0x11398
    prc_datamodel = 0x200000
    prc_proc = 0xffffd07dcd4d3010
    prc_thread = 0
    prc_slot = 0x77
    prc_tid = 0x9d
    prc_tslot = 0xffffffff
    prc_refcnt = 0x1
    prc_pollhead = {
        ph_list = 0
        ph_pad1 = 0
        ph_pad2 = 0
    }
}

Since prc_thread is NULL here, we can see why we panicked.

Additional data on the victim process

> ffffd07dcd4d3010::ps -t
S    PID   PPID   PGID    SID    UID      FLAGS             ADDR NAME
R  70552  69463  70552  70552    995 0x4a004000 ffffd07dcd4d3010 java

The java process has 467 threads (which I'm not showing here).

Although the window is small, there is a possible race between getting the prnode_t and locking the process: during that window the thread can exit, which matches the NULL prc_thread seen above. The other places in the code that dereference prc_thread validate that it is not NULL first. This appears to be a bug only in the handling of the spymaster node.
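
To illustrate the kind of check described above, here is a minimal user-space sketch. It is not the change that was committed for this bug. The struct definitions are simplified stand-ins for the kernel types, spymaster_size() is a hypothetical helper name, and the sketch does not model the process locking that the kernel also needs in order to close the race.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified stand-ins for the kernel types named in the dump. Only the
 * fields on the pr_common->prc_thread->t_lwp->lwp_spymaster chain are kept.
 */
typedef struct klwp {
	void *lwp_spymaster;
} klwp_t;

typedef struct kthread {
	klwp_t *t_lwp;
} kthread_t;

typedef struct prcommon {
	kthread_t *prc_thread;	/* NULL once the thread has exited */
} prcommon_t;

typedef struct prnode {
	prcommon_t *pr_common;
} prnode_t;

/*
 * Hypothetical helper: return objsize if the node's thread still exists and
 * has spymaster data, otherwise 0. Each pointer is checked before it is
 * followed, so an exited thread yields 0 instead of a NULL dereference.
 */
static size_t
spymaster_size(const prnode_t *pnp, size_t objsize)
{
	const kthread_t *t;

	assert(pnp != NULL && pnp->pr_common != NULL);

	t = pnp->pr_common->prc_thread;
	if (t == NULL || t->t_lwp == NULL)
		return (0);

	return (t->t_lwp->lwp_spymaster != NULL ? objsize : 0);
}
```

With prc_thread set to NULL, as in the prcommon_t printed above, this helper returns 0, which corresponds to the va_size = 0 branch of the original code block.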


Related issues

Related to illumos gate - Feature #3670: add visibility into agent LWP's spymaster (Resolved, 2013-04-02)

History

#1

Updated by Robert Mustacchi over 1 year ago

  • Related to Feature #3670: add visibility into agent LWP's spymaster added
#2

Updated by Electric Monk about 1 year ago

  • Status changed from New to Closed

git commit 614f1d633e921143ad22010eeec64ed7c6aa627c

commit  614f1d633e921143ad22010eeec64ed7c6aa627c
Author: Jerry Jelinek <jerry.jelinek@joyent.com>
Date:   2018-07-19T19:28:30.000Z

    9055 panic in prgetattr
    Reviewed by: Jason King <jason.king@joyent.com>
    Reviewed by: Rich Lowe <richlowe@richlowe.net>
    Approved by: Dan McDonald <danmcd@joyent.com>
