Bug #14475

closed

Recursive death in libfakekernel assfail after 12396

Added by Gordon Ross 6 months ago. Updated 6 months ago.

Status: Closed
Priority: Normal
Assignee:
Category: lib - userland libraries
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Bite-size
Tags:
Gerrit CR:

Description

Running "testoplock" when an assertion fails produces a stack trace
thousands of frames deep, ending with:

08046ab8 libfakekernel.so.1`fm_panic(fef58d58)
08046ad8 libfakekernel.so.1`assfail+0x34(fef69760, fef58d4e, 0)
08046b08 libfakekernel.so.1`vpanic+0x94(fef58d58, 8046b44)
08046b38 libfakekernel.so.1`fm_panic(fef58d58)
08046b58 libfakekernel.so.1`assfail+0x34(fef69760, fef58d4e, 0)
08046b88 libfakekernel.so.1`vpanic+0x94(fef58d58, 8046bc4)
08046bb8 libfakekernel.so.1`fm_panic(fef58d58)
08046bd8 libfakekernel.so.1`assfail+0x34(fef69760, fef58d4e, 0)
08046c08 libfakekernel.so.1`vpanic+0x94(fef58d58, 8046c44)
08046c38 libfakekernel.so.1`fm_panic(fef58d58)
08046c58 libfakekernel.so.1`assfail+0x34(8056b6c, 8056afc, e0f)
08046c88 smb_oplock_move+0x85(806e980, 806ec30, 806eba8)
08046cb8 do_move+0x8f(2, 806e7a7)
08046d08 main+0x403(8046d0c, fef135c8)
08046d48 _start_crt+0x9a(1, 8046d74, fefcffef, 0, 0, 0)
08046d68 _start+0x1a(1, 8046fe0, 0, 8047021, 804706b, 804708c)


Related issues

Related to illumos gate - Bug #12396: zdb -A does not work after 3006 (Closed, Yuri Pankov)

Actions #1

Updated by Electric Monk 6 months ago

  • Gerrit CR set to 2007
Actions #2

Updated by Gordon Ross 6 months ago

  • Related to Bug #12396: zdb -A does not work after 3006 added
Actions #3

Updated by Gordon Ross 6 months ago

The problem, of course, is that after #12396 added assfail (and assfail3) to libfakekernel,
the calls to panic in assfail recurse back into assfail via vpanic.
To fix this, I'll make vpanic call upanic (in libc) instead.

BTW, I'm not sure why that stack shows fm_panic. The function actually called was panic, but fm_panic is immediately after it.
I tried moving things around, and it seems the stack backtrace is confused by these functions and shows the PC as being in the following function in these cases. I wonder why. Not this bug, anyway.
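
To illustrate the shape of the fix described above, here is a minimal sketch, not the actual libfakekernel source. All names (sketch_assfail, sketch_panic, sketch_vpanic, terminal_panic) are hypothetical, and terminal_panic only records the message so the example can be run safely; in the real fix that role is played by upanic, which ends the process. The point is that the vpanic path ends in a terminal sink and never calls back into the assertion handler.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the terminal call (upanic in the real fix).
 * It records the message instead of terminating the process.
 */
static char last_panic_msg[256];
static int terminal_calls;

static void
terminal_panic(const char *msg)
{
	terminal_calls++;
	(void) snprintf(last_panic_msg, sizeof (last_panic_msg), "%s", msg);
}

/* Format the message and hand it to the terminal sink. */
static void
sketch_vpanic(const char *fmt, va_list adx)
{
	char buf[256];

	(void) vsnprintf(buf, sizeof (buf), fmt, adx);
	/* Must not call sketch_assfail() here: that was the recursion. */
	terminal_panic(buf);
}

static void
sketch_panic(const char *fmt, ...)
{
	va_list adx;

	va_start(adx, fmt);
	sketch_vpanic(fmt, adx);
	va_end(adx);
}

/* Assertion handler: reports through panic, which now terminates. */
static int
sketch_assfail(const char *a, const char *f, int l)
{
	sketch_panic("assertion failed: %s, file: %s, line: %d", a, f, l);
	return (0);
}
```

Calling sketch_assfail("x == 1", "demo.c", 42) reaches terminal_panic exactly once, which mirrors the single panic frame in the post-fix stack rather than the repeating assfail/vpanic frames in the original report.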

Actions #4

Updated by Gordon Ross 6 months ago

After the fix, looks fine:

gwr@oi-work$ mdb testoplock core
Loading modules: [ libumem.so.1 libc.so.1 ld.so.1 ]
> ::status
debugging core file of testoplock (32-bit) from oi-work
file: testoplock
initial argv: /tank/ws/illumos-rti/proto/root_i386/usr/lib/smbsrv/testoplock
threading model: native threads
status: process panicked
upanic message: assertion failed: MUTEX_HELD(&node->n_oplock.ol_mutex), file: ../../../uts/common/fs/smbsrv/smb_cmn_oplock.c, line: 3544
> $C
08046bd8 libc.so.1`syscall+0x13(fef6a760, 200, fef58d90, 8046c44, fef571d2, 8046c44)
08046c08 libfakekernel.so.1`panic(fef58d90)
08046c38 libfakekernel.so.1`fm_panic(fef58d90)
08046c58 libfakekernel.so.1`assfail+0x34(805692c, 80568bc, dd8)
08046c98 smb_oplock_move+0x85(806e980, 806ec38, 806ebac)
08046cc8 do_move+0x8f(2, 806e7a7)
08046d18 main+0x403(8046d1c, fef135c8)
08046d58 _start_crt+0x9a(1, 8046d7c, fefcffef, 0, 0, 0)
08046d70 _start+0x1a(1, 8046ff0, 0, 804702f, 8047075, 8047094)

Actions #5

Updated by Rich Lowe 6 months ago

We also have upanic(2) which you might like?

Actions #6

Updated by Gordon Ross 6 months ago

  • Status changed from New to In Progress
Actions #7

Updated by Electric Monk 6 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

git commit 0a34963c38fe21eee84ebab010996317731a5171

commit  0a34963c38fe21eee84ebab010996317731a5171
Author: Gordon Ross <gwr@racktopsystems.com>
Date:   2022-02-15T23:46:29.000Z

    14475 Recursive death in libfakekernel assfail after 12396
    Reviewed by: Toomas Soome <tsoome@me.com>
    Reviewed by: Matt Barden <mbarden@tintri.com>
    Approved by: Dan McDonald <danmcd@joyent.com>
