Bug #5780

Truncated coredumps

Added by Simon Klinkert over 4 years ago. Updated about 4 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: kernel
Start date: 2015-04-02
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage

Description

I’m debugging a problem with truncated coredumps produced by my application. It turns out that the application opens a file descriptor within a snapshot (.zfs/snapshot/…), then destroys the snapshot, and calls abort(). The core() function returns 1 and the resulting coredump is corrupt:

Log message: genunix: [ID 457380 kern.notice] NOTICE: core_log: corebug[2870] core dump failed, errno=5

# pstack /var/tmp/core.corebug.847
pstack: cannot examine /var/tmp/core.corebug.847: core file is corrupt or missing required data

# ls -la /var/tmp/core.bcorebug.847
-rw-------   1 root     root       83512 Mar 17 11:28 /var/tmp/core.bcorebug.847

I think this shouldn’t happen. I wrote a small reproducer (see attachment) and traced it with DTrace:

# dtrace -n 'core:return {trace(arg1);} zfs_getattr:return, zfs_lookup:return/arg1==5/{trace(arg1); stack();}'
dtrace: description 'core:return ' matched 3 probes
CPU     ID                    FUNCTION:NAME
 11  39346                zfs_lookup:return                 5
              genunix`fop_lookup+0xa2
              genunix`dirtopath+0x314
              genunix`vnodetopath_common+0x3cb
              genunix`vnodetopath+0x24
              elfexec`write_elfnotes32+0x64b
              elfexec`elf32core+0x9de
              genunix`do_core+0x18a
              genunix`dump_one_core+0x139
              genunix`core+0x360
              genunix`psig+0x594
              genunix`post_syscall+0x82d
              genunix`syscall_exit+0x68
              unix`0xfffffffffb800ed9

 11  39360               zfs_getattr:return                 5
              genunix`fop_getattr+0xa8
              elfexec`write_elfnotes32+0x662
              elfexec`elf32core+0x9de
              genunix`do_core+0x18a
              genunix`dump_one_core+0x139
              genunix`core+0x360
              genunix`psig+0x594
              genunix`post_syscall+0x82d
              genunix`syscall_exit+0x68
              unix`0xfffffffffb800ed9

 11  19157                      core:return                 1

write_elfnotes() iterates over all open fds and gets EIO: zfs_lookup() returns EIO immediately because z_sa_hdl is NULL. The failing zfs_getattr() call is also made from write_elfnotes(). A possible fix is to continue the fd loop in write_elfnotes() instead of returning from the function:

http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/exec/elf/elf_notes.c#383
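The suggestion to continue the loop can be illustrated with a small userland sketch. This is not the kernel code and not the committed fix. The names fd_note_t, fd_lookup_fn, collect_fd_notes and stale_fd3_lookup are hypothetical, and the lookup callback merely stands in for the per-fd calls that returned EIO in the trace above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-fd data written into a core note. */
typedef struct fd_note {
	int fn_fd;
} fd_note_t;

/* Hypothetical per-fd lookup: returns 0 on success or an errno value. */
typedef int (*fd_lookup_fn)(int fd);

/*
 * Walk nfds descriptors and record a note for each one whose lookup
 * succeeds. A failing descriptor is counted and skipped; it does not
 * end the walk, so later descriptors are still recorded.
 * Returns the number of notes written to out.
 */
static size_t
collect_fd_notes(const int *fds, size_t nfds, fd_lookup_fn lookup,
    fd_note_t *out, size_t *nskipped)
{
	size_t i, nwritten = 0;

	assert(fds != NULL && out != NULL && nskipped != NULL);

	*nskipped = 0;
	for (i = 0; i < nfds; i++) {
		if (lookup(fds[i]) != 0) {
			/* Skip this fd instead of returning the error. */
			(*nskipped)++;
			continue;
		}
		out[nwritten].fn_fd = fds[i];
		nwritten++;
	}
	return (nwritten);
}

/* Example lookup: pretend fd 3 refers to a file in a destroyed snapshot. */
static int
stale_fd3_lookup(int fd)
{
	return (fd == 3 ? EIO : 0);
}
```

Calling collect_fd_notes() on the descriptors 0 through 4 with stale_fd3_lookup records four notes and reports one skipped descriptor. A loop that returned on the first error would stop at fd 3 and lose fd 4 as well. Whether a skipped fd should be recorded in the core file in some reduced form is a separate design question that this sketch does not address.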


Files

corebug.c (1.41 KB) - Reproducer - Simon Klinkert, 2015-04-02 12:31 PM

Related issues

Related to illumos gate - Bug #6320: Garbage flag for core dumps (New, 2015-10-12)
Related to illumos gate - Bug #7307: Fixing 5780 introduced a regression (Closed, 2016-08-16)

History

#1

Updated by Matthew Ahrens about 4 years ago

  • Subject changed from Truncated coredumps due to ZFS destroy to Truncated coredumps
#2

Updated by Simon Klinkert about 4 years ago

  • Related to Bug #6320: Garbage flag for core dumps added
#3

Updated by Electric Monk about 4 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit 33d794d10eebfa2727ac1fc98fe0dd6c68f627dc

commit  33d794d10eebfa2727ac1fc98fe0dd6c68f627dc
Author: Simon Klinkert <simon.klinkert@gmail.com>
Date:   2015-10-19T14:16:12.000Z

    5780 Truncated coredumps
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
    Approved by: Dan McDonald <danmcd@omniti.com>

#4

Updated by Electric Monk about 4 years ago

git commit f2f1e74250739faac0cdf175c8a7ae4480770789

commit  f2f1e74250739faac0cdf175c8a7ae4480770789
Author: Dan McDonald <danmcd@omniti.com>
Date:   2015-10-19T18:17:06.000Z

    5780 Truncated coredumps (fix lint)

#5

Updated by Simon Klinkert about 3 years ago

  • Related to Bug #7307: Fixing 5780 introduced a regression added
