Bug #13654

closed

upanic needs to finish auditing the syscall before auditing cores

Added by Robert Mustacchi about 2 months ago. Updated about 2 months ago.

Status: Closed
Priority: Normal
Category: kernel
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

While working on unrelated bits, Dan saw a panic that had the following form:

> ::status
debugging crash dump vmcore.11 (64-bit) from larry
operating system: 5.11 joyent_20210317T051153Z (i86pc)
...
panic message: assertion failed: tad->tad_scid == 0, file: ../../common/c2/audit.c, line: 545
dump content: kernel pages only
> $C
fffffe007daaddb0 vpanic()
fffffe007daade00 0xfffffffffbdd2725()
fffffe007daade40 audit_core_start+0x200(6)
fffffe007daadea0 upanic+0x142(f7e89870, 190)
fffffe007daadf00 _sys_sysenter_post_swapgs+0x259()
> *panic_thread::print kthread_t t_audit_data[]
t_audit_data = {
    t_audit_data->tad_thread = 0xfffffe59f6793080
    t_audit_data->tad_scid = 0x7d
    t_audit_data->tad_event = 0
    t_audit_data->tad_evmod = 0
    t_audit_data->tad_ctrl = 0
    t_audit_data->tad_errjmp = 0
    t_audit_data->tad_flag = 0
    t_audit_data->tad_audit = 0x1
    t_audit_data->tad_aupath = 0
    t_audit_data->tad_atpath = 0
    t_audit_data->tad_ad = 0
    t_audit_data->tad_defer_head = 0
    t_audit_data->tad_defer_tail = 0
    t_audit_data->tad_sprivs = {
        pbits = [ 0, 0, 0, 0 ]
    }
    t_audit_data->tad_fprivs = {
        pbits = [ 0, 0, 0, 0 ]
    }
}

In this case, we tripped the assertion that the thread's audit system call ID (tad_scid) is zero when core auditing starts; here it was still 0x7d. upanic(2) goes straight into the core dump path, whereas for other system calls the core dump generally happens asynchronously as part of the thread's return, by which point auditing of the system call has already finished. In other cases where the thread does not return, such as exit(2), we either terminate the system call auditing before we start the exit event or have other special handling. Here, we should just have upanic() explicitly finish the system call auditing, because we know it will never actually return.

With this change in hand, on a system running debug bits with auditing enabled, I was able to verify that we no longer trip the assertion and that the core dump is audited successfully.
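
The ordering problem described above can be illustrated with a small user-space model. This is only a sketch, not the kernel code or the actual fix: the toy_* names and the toy_tad_t type are hypothetical stand-ins, only the tad_scid field and the value 0x7d come from the dump above, and the real audit paths do considerably more work.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-thread audit data; only tad_scid is modeled. */
typedef struct toy_tad {
	unsigned int tad_scid;	/* non-zero while a system call is being audited */
} toy_tad_t;

/* The tad_scid value seen in the crash dump above. */
#define	TOY_SCID	0x7d

/* Model of system call entry: auditing records the system call ID. */
void
toy_syscall_audit_start(toy_tad_t *tad, unsigned int scid)
{
	tad->tad_scid = scid;
}

/* Model of finishing the system call's audit record: the ID is cleared. */
void
toy_syscall_audit_finish(toy_tad_t *tad)
{
	tad->tad_scid = 0;
}

/*
 * Model of starting the core dump audit event. Returns false in the
 * situation where the kernel would trip "tad->tad_scid == 0".
 */
bool
toy_audit_core_start(const toy_tad_t *tad)
{
	return (tad->tad_scid == 0);
}

/*
 * Model of a system call that goes straight into the core dump path.
 * With finish_first == false this mirrors the failing order; with
 * finish_first == true the system call audit is finished first.
 */
bool
toy_upanic(toy_tad_t *tad, bool finish_first)
{
	toy_syscall_audit_start(tad, TOY_SCID);
	if (finish_first)
		toy_syscall_audit_finish(tad);
	return (toy_audit_core_start(tad));
}
```

Calling toy_upanic() with finish_first set to false leaves tad_scid at 0x7d and reports the condition that panicked the system; with finish_first set to true the system call audit is closed out before the core audit begins, which is the ordering this bug asks for.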

Actions #1

Updated by Electric Monk about 2 months ago

  • Gerrit CR set to 1357

Actions #2

Updated by Electric Monk about 2 months ago

  • Status changed from New to Closed
  • % Done changed from 80 to 100

git commit 88a8a2ff3233e21e4ab8bc203109d12dc5d5a189

commit  88a8a2ff3233e21e4ab8bc203109d12dc5d5a189
Author: Robert Mustacchi <rm@fingolfin.org>
Date:   2021-03-19T20:44:28.000Z

    13654 upanic needs to finish auditing the syscall before auditing cores
    Reviewed by: Jason King <jason.king@joyent.com>
    Reviewed by: Alex Wilson <alex@cooperi.net>
    Reviewed by: Andy Fiddaman <andy@omnios.org>
    Approved by: Gordon Ross <gordon.w.ross@gmail.com>
