Bug #12895

zfs_onexit_fd_hold fails to release non-zfs fds

Added by Jason King 4 months ago. Updated 4 months ago.

Status: Closed
Priority: Normal
Assignee:
Category: zfs - Zettabyte File System
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

While testing #12877, I encountered the following panic:

panic[cpu0]/thread=fffffe171614cc40:
assertion failed: afd->a_fd[i] == -1, file: ../../common/os/fio.c, line: 465

fffffe001f3c7de0 genunix:process_type+199d2d ()
fffffe001f3c7e10 genunix:clear_stale_fd+53 ()
fffffe001f3c7eb0 genunix:post_syscall+241 ()
fffffe001f3c7ef0 genunix:syscall_exit+68 ()
fffffe001f3c7f00 unix:brand_sys_syscall32+357 ()

Using truss, I determined that this occurred during the ZFS_IOC_HOLD call, which returned EBADF immediately before the panic.

After looking at the source, I believe the relevant detail is that the ZFS_IOC_HOLD call can take an optional cleanup_fd parameter. The cleanup fd should be an fd opened on the zfs device.

Adding a bit of dtrace:

# dtrace -n 'zfs_onexit_fd_hold:entry { trace(arg0) } zfs_onexit_minor_to_state:return { trace(arg1) }'

And examining the output from the resulting panic:

> ::dtrace_state
            ADDR MINOR             PROC NAME                         FILE
fffffe1719f99a80     2 fffffe1783b0e038 dtrace           fffffe1728f58a08
> fffffe1719f99a80::dtrace
CPU     ID                    FUNCTION:NAME
  0  39064         zfs_onexit_fd_hold:entry                 4
  0  37917 zfs_onexit_minor_to_state:return                 9
>

The value 9 returned by zfs_onexit_minor_to_state is EBADF, which confirms this.
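As a quick cross-check of that errno mapping, here is a minimal userland sketch. The helper name sketch_describe_errno is hypothetical and not part of illumos; it only formats an errno value with its strerror() text, and the exact message text varies by platform.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the illumos source): format an errno
 * value as "<number> (<message>)" into the caller-supplied buffer.
 */
const char *
sketch_describe_errno(int err, char *buf, size_t len)
{
	(void) snprintf(buf, len, "%d (%s)", err, strerror(err));
	return (buf);
}
```

Calling sketch_describe_errno(EBADF, buf, sizeof (buf)) yields a string beginning with "9 (" on systems where EBADF is defined as 9, which includes illumos.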

Looking at the source, zfs_onexit_fd_hold calls getf() on the cleanup fd, but if zfs_onexit_minor_to_state() fails, it never calls releasef(). That leaked hold is almost certainly the cause of the panic.
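The leak can be illustrated with a small userland sketch. This is not the illumos code: every sketch_* name is a hypothetical stand-in (sketch_getf and sketch_releasef mimic getf()/releasef() with a simple per-fd hold counter, and sketch_minor_to_state mimics zfs_onexit_minor_to_state() failing for a non-zfs fd). It assumes the fix is simply to drop the hold on the error path; the actual change may differ in detail.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define	SKETCH_NFD	8

int sketch_holds[SKETCH_NFD];	/* outstanding holds per fd (stand-in) */
int sketch_is_zfs[SKETCH_NFD];	/* nonzero if fd is open on the zfs device */

/* Stand-in for getf(): take a hold on the fd, or fail if out of range. */
int *
sketch_getf(int fd)
{
	if (fd < 0 || fd >= SKETCH_NFD)
		return (NULL);
	sketch_holds[fd]++;
	return (&sketch_holds[fd]);
}

/* Stand-in for releasef(): drop a hold taken by sketch_getf(). */
void
sketch_releasef(int fd)
{
	assert(sketch_holds[fd] > 0);
	sketch_holds[fd]--;
}

/* Stand-in for zfs_onexit_minor_to_state(): EBADF for non-zfs fds. */
int
sketch_minor_to_state(int fd)
{
	return (sketch_is_zfs[fd] ? 0 : EBADF);
}

/* Pattern described in the report: the hold leaks on the error path. */
int
sketch_fd_hold_leaky(int fd)
{
	if (sketch_getf(fd) == NULL)
		return (EBADF);
	return (sketch_minor_to_state(fd));
}

/* Assumed shape of the fix: release the hold before returning an error. */
int
sketch_fd_hold_fixed(int fd)
{
	int error;

	if (sketch_getf(fd) == NULL)
		return (EBADF);
	error = sketch_minor_to_state(fd);
	if (error != 0)
		sketch_releasef(fd);
	return (error);
}
```

With a non-zfs fd, sketch_fd_hold_leaky returns EBADF and leaves one hold outstanding, while sketch_fd_hold_fixed returns EBADF with no holds left. On success both keep the hold, on the assumption that the caller releases it later.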


Related issues

Related to illumos gate - Bug #12877: Port OpenZFS #7780 - Add basic zfs ioc input nvpair validation (Closed)

History

#1

Updated by Electric Monk 4 months ago

  • Gerrit CR set to 773

#2

Updated by Jason King 4 months ago

I tested this with #12877 by running the zfs test suite (all expected tests pass). Without this fix, the new test added by #12895 triggers the ASSERT described in the synopsis on DEBUG kernels. With this fix applied, the test succeeds without a panic.

#3

Updated by Dan McDonald 4 months ago

  • Related to Bug #12877: Port OpenZFS #7780 - Add basic zfs ioc input nvpair validation added

#4

Updated by Electric Monk 4 months ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit 7ac89354c798225fea6296348415955ccd95fb80

commit  7ac89354c798225fea6296348415955ccd95fb80
Author: Don Brady <don.brady@delphix.com>
Date:   2020-07-07T17:55:10.000Z

    12877 Port OpenZFS #7780 - Add basic zfs ioc input nvpair validation
    12895 zfs_onexit_fd_hold fails to release non-zfs fds
    Portions contributed by: Brian Behlendorf <behlendorf1@llnl.gov>
    Portions contributed by: George Wilson <george.wilson@delphix.com>
    Portions contributed by: Simon Klinkert <simon.klinkert@gmail.com>
    Portions contributed by: Jason King <jason.king@joyent.com>
    Reviewed by: Matthew Ahrens <mahrens@delphix.com>
    Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
    Reviewed by: Toomas Soome <tsoome@me.com>
    Approved by: Dan McDonald <danmcd@joyent.com>
