Project

General

Profile

Bug #7412

newproc() performs inadequate clean-up after failed lwp_create()

Added by Robert Mustacchi about 3 years ago. Updated about 3 years ago.

Status:
Closed
Priority:
Normal
Category:
kernel
Start date:
2016-09-27
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:

Description

We saw a series of dumps due to corruption of the proc lists.

Starting at the top:

Bad kernel fault at addr=0x90
pid=85879, pc=0xfffffffffba06716, sp=0xfffffe426ac79cb0, eflags=0x10246
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 3426f8<smap,smep,osxsav,vmxe,xmme,fxsr,pge,mce,pae,pse,de>
cr2: 90
cr3: 1dc00000
cr8: 0

        rdi: fffffea3bf5ce0a0 rsi:                1 rdx:                0
        rcx: fffffea35943e098  r8:                0  r9:                0
        rax:                0 rbx: fffffea3e6a97030 rbp: fffffe426ac79cd0
        r10: fffffea35917fb00 r11:                2 r12:               13
        r13: fffffea3bfa0d010 r14: fffffea6bfa57e00 r15: fffffea3e0343c40
        fsb: fffffd7fff051a40 gsb: fffffea363b20b00  ds:               4b
         es:               4b  fs:                0  gs:                0
        trp:                e err:                2 rip: fffffffffba06716
         cs:               30 rfl:            10246 rsp: fffffe426ac79cb0
         ss:               38

fffffe426ac79cd0 freeproc+0x126(fffffea3e6a97030)
fffffe426ac79d10 lx_exit_with_sig+0xde(fffffea3e6a97030, fffffea6c0fe6870)
fffffe426ac79da0 proc_exit+0xaae(2, 9)
fffffe426ac79dc0 exit+0x15(2, 9)
fffffe426ac79e40 psig+0x38c()
fffffe426ac79f00 post_syscall+0x82d(1, ffffffffffffffff)
fffffe426ac79f10 0xfffffffffb800ced()

The code for this portion of freeproc:

freeproc+0x10f:                 cmpq   %rbx,%rdx
freeproc+0x112:                 je     +0x8     <freeproc+0x11c>
freeproc+0x114:                 movq   %rdx,%rax
freeproc+0x117:                 testq  %rax,%rax
freeproc+0x11a:                 jne    -0x14    <freeproc+0x108>
freeproc+0x11c:                 movq   %rax,%rdx
freeproc+0x11f:                 movq   0x90(%rbx),%rax
freeproc+0x126:                 movq   %rax,0x90(%rdx)
freeproc+0x12d:                 movq   0x10(%rbx),%rdi
freeproc+0x131:                 call   -0x1a4f16        <mutex_enter>

It corresponds to this source:

                for (q = q->p_orphan; q; q = q->p_nextorph)
                        if (q->p_nextorph == p)
                                break;
                ASSERT(q && q->p_nextorph == p);
                q->p_nextorph = p->p_nextorph;
        }

        /*
         * The process table slot is being freed, so it is now safe to give up
         * task and project membership.
         */
        mutex_enter(&p->p_lock);

Specifically, it is the "q->p_nextorph = p->p_nextorph;" assignment where things went wrong.
If we manually walk the orphans of p->p_nextofkin, we find that p is indeed missing:

> fffffea3e6a97030::print proc_t p_nextofkin
p_nextofkin = 0xfffffea3bfa0d010
> fffffea3bfa0d010::print proc_t p_orphan
p_orphan = 0xfffffea3ec2f2010
> fffffea3bfa0d010::print proc_t p_orphan | ::list proc_t p_nextorph
fffffea3ec2f2010
fffffea3e4268040
fffffea3e85bc000
fffffea3fd90f010
fffffea3e537e040
fffffea3ebfae038
fffffea3fd914050
fffffea3e5d4b000
fffffea3ed601048
fffffea3ed3c2020
fffffea3efdb3050
fffffea394c18048
fffffea3e87bf008
fffffea3df7a6028
fffffea3e01db040
fffffea3e417a008
fffffea3e55ac008
fffffea3e357f048
fffffea3eeab3040
fffffea3fd940028
fffffea39438e028
fffffea3947a5000
> fffffea3bfa0d010::print proc_t p_orphan | ::list proc_t p_nextorph ! grep fffffea3e6a97030

On a DEBUG kernel, this would have blown the ASSERT prior to the fateful assignment.
It is not yet clear how this came to pass.

It appears that the culprit here is a failure on the part of newproc() to adequately clean up after itself when handling failure from lwp_create(). A common series of events to trigger this are as follows:
1. An LX zone is at its max_lwps limit
2. A user in the GZ attempts to zlogin to the zone
3. Since zlogin does not pay heed to the max_lwps limit for the initial login process (which is created in the GZ and then zone_enter()ed into the zone), it is able to run /bin/login
4. The login process will attempt to fork something else and fail. (Things are fine at this point)
5. The login process, detecting this failure, will call exit(2).
6. As part of proc_exit(), the lwp_exit hook for cgroupfs is called when the LWP is unbranded.
7. The process, having undergone no cgroup manipulation, will be the sole member of the '0' cgroup. This will cause the release agent to be run
8. The release agent is spawned by newproc(), to appear as if it were forked from init (sans resource inheritance). Since the zone is at its lwp limit, lwp_create will fail in newproc().

It is here that we find the problem. The process clean-up in newproc() is inadequate and will leave pointers dangling all over the place:

        if ((lwp = lwp_create(pc, arg, 0, p, TS_STOPPED, pri,
            &curthread->t_hold, cid, 1)) == NULL) {
            task_t *tk;
            fork_fail(p);
            mutex_enter(&pidlock);
            mutex_enter(&p->p_lock);
            tk = p->p_task;
            task_detach(p);
            ASSERT(p->p_pool->pool_ref > 0);
            atomic_add_32(&p->p_pool->pool_ref, -1);
            mutex_exit(&p->p_lock);
            pid_exit(p, tk);
            mutex_exit(&pidlock);
            task_rele(tk);

            return (EAGAIN);
        }

This block lacks the logic to undo the child/parent/orphan relationships setup by getproc. On exit from the cgroupfs callback, the proc hierarchy is corrupted and the system is likely to crash.

History

#1

Updated by Electric Monk about 3 years ago

  • Status changed from New to Closed

git commit 6aca388dc960a6bbce180ed28e5474f6ee850be8

commit  6aca388dc960a6bbce180ed28e5474f6ee850be8
Author: Patrick Mooney <pmooney@pfmooney.com>
Date:   2016-10-12T17:15:31.000Z

    7412 newproc() performs inadequate clean-up after failed lwp_create()
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
    Reviewed by: Dan McDonald <danmcd@omniti.com>
    Reviewed by: Gordon Ross <gwr@nexenta.com>

Also available in: Atom PDF