Project

General

Profile

Bug #6499

panic trying to write to the mountpoint of FS with nbmand=on

Added by Yuri Pankov almost 4 years ago. Updated about 3 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
-
Start date:
2015-12-04
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
needs-triage

Description

Getting the following panic trying to write (using echo, dd, ...) to the mounpoint (itself, not a file or dir inside) of the FS with nbmand=on:

> ::panicinfo
             cpu                5
          thread ffffff02f0136840
         message assertion failed: in_crit == 0, file: ../../common/fs/vnode.c, line: 1494
             rdi fffffffffbf592a8
             rsi ffffff00109d98d0
             rdx fffffffffbf46646
             rcx              5d6
              r8               80
              r9 ffffff00109d9c78
             rax ffffff00109d98f0
             rbx ffffff030e7c6900
             rbp ffffff00109d9930
             r10                1
             r11                2
             r12 ffffff00109d9c78
             r13                1
             r14 ffffff00109d9aa0
             r15                0
          fsbase                0
          gsbase ffffff02ee350580
              ds               4b
              es               4b
              fs                0
              gs              1c3
          trapno                0
             err                0
             rip fffffffffb86c1c0
              cs               30
          rflags              282
             rsp ffffff00109d98c8
              ss               38
          gdt_hi                0
          gdt_lo         400001ef
          idt_hi                0
          idt_lo         60000fff
             ldt                0
            task               70
             cr0         8005003b
             cr2          80e7590
             cr3        1d23ec000
             cr4            406b8
> ::stack
vpanic()
0xfffffffffbe0ef48()
vn_createat+0x7a3(81499a8, 0, ffffff00109d9bc0, 0, 80, ffffff00109d9c78)
vn_openat+0x1be(81499a8, 0, 2302, 1b6, ffffff00109d9df0, 0)
copen+0x20c(ffd19553, 81499a8, 2302, 1b6)
openat64+0x2a(ffd19553, 81499a8, 301, 1b6)
open64+0x25(81499a8, 301, 1b6)
sys_syscall32+0x1f7()
>

Steps to reproduce:

# zfs create -o mountpoint=/nbmand -o nbmand=on rpool/nbmand
# echo 1 > /nbmand

History

#1

Updated by Gordon Ross about 3 years ago

Analysis (and fix) from Kevin Crowe

With a debug kernel this is panicing in vnode.c in an assertion on line 1494: assertion failed in_crit == 0
That's in the function vn_createat()
Specifically, in this section of code that would bother the code review nazi's:

        /*
         * In general we want to generate EROFS if the file system is
         * readonly.  However, POSIX (IEEE Std. 1003.1) section 5.3.1
         * documents the open system call, and it says that O_CREAT has no
         * effect if the file already exists.  Bug 1119649 states
         * that open(path, O_CREAT, ...) fails when attempting to open an
         * existing file on a read only file system.  Thus, the first part
         * of the following if statement has 3 checks:
         *      if the file exists &&
         *              it is being open with write access &&
         *              the file system is read only
         *      then generate EROFS
         */
        if ((*vpp != NULL && (mode & VWRITE) && ISROFILE(*vpp)) ||
            (*vpp == NULL && dvp->v_vfsp->vfs_flag & VFS_RDONLY)) {
                if (*vpp)
                        VN_RELE(*vpp);
                error = EROFS;
        } else if (excl == NONEXCL && *vpp != NULL) {

it's the "else if" branch. Interesting, there's spots for it to bail if nbmand locks are taken:
                /*
                 * File already exists.  If a mandatory lock has been
                 * applied, return error.
                 */
                vp = *vpp;
                if (VOP_REALVP(vp, &rvp, NULL) != 0)
                        rvp = vp;
                if ((vap->va_mask & AT_SIZE) && nbl_need_check(vp)) {
                        nbl_start_crit(vp, RW_READER);
                        in_crit = 1;                    <<<<<  not sure if in_crit is getting set here or elsewhere yet....
                }
                if (rvp->v_filocks != NULL || rvp->v_shrlocks != NULL) {
                        vattr.va_mask = AT_MODE|AT_SIZE;
                        if (error = VOP_GETATTR(vp, &vattr, 0, CRED(), NULL)) {
                                goto out;
                        }
                        if (MANDLOCK(vp, vattr.va_mode)) {
                                error = EAGAIN;
                                goto out;
                        }
                        /*
                         * File cannot be truncated if non-blocking mandatory
                         * locks are currently on the file.
                         */
                        if ((vap->va_mask & AT_SIZE) && in_crit) {
                                u_offset_t offset;
                                ssize_t length;

                                offset = vap->va_size > vattr.va_size ?
                                    vattr.va_size : vap->va_size;
                                length = vap->va_size > vattr.va_size ?
                                    vap->va_size - vattr.va_size :
                                    vattr.va_size - vap->va_size;
                                if (nbl_conflict(vp, NBL_WRITE, offset,
                                    length, 0, NULL)) {
                                        error = EACCES;
                                        goto out;
                                }
                        }
                }

Regardless it hits the 2nd assert in this conditional block:
                /*
                 * If the file is the root of a VFS, we've crossed a
                 * mount point and the "containing" directory that we
                 * acquired above (dvp) is irrelevant because it's in
                 * a different file system.  We apply VOP_CREATE to the
                 * target itself instead of to the containing directory
                 * and supply a null path name to indicate (conventionally)
                 * the node itself as the "component" of interest.
                 *
                 * The intercession of the file system is necessary to
                 * ensure that the appropriate permission checks are
                 * done.
                 */
                if (vp->v_flag & VROOT) {
                        ASSERT(why != CRMKDIR);
                        error = VOP_CREATE(vp, "", vap, excl, mode, vpp,
                            CRED(), flag, NULL, NULL);
                        /*
                         * If the create succeeded, it will have created
                         * a new reference to the vnode.  Give up the
                         * original reference.  The assertion should not
                         * get triggered because NBMAND locks only apply to
                         * VREG files.  And if in_crit is non-zero for some
                         * reason, detect that here, rather than when we
                         * deference a null vp.
                         */
                        ASSERT(in_crit == 0);
                        VN_RELE(vp);
                        vp = NULL;
                        goto out;
                }

#2

Updated by Gordon Ross about 3 years ago

More from Kevin:

OK, the non-debug panic makes sense too.
In vn_createat() it's in this code where it tried to create the file (VOP_CREATE) and because of earlier code in vn_createat() in_crit is set to 1.
When it tries to create the file - which just so happens to be the mount point of the zfs fs an error is returned. Ultimately the vnode is
released (VN_RELE) and vp gets set to NULL and it does a 'goto out':

                if (vp->v_flag & VROOT) {
                        ASSERT(why != CRMKDIR);
                        error = VOP_CREATE(vp, "", vap, excl, mode, vpp,
                            CRED(), flag, NULL, NULL);
                        /*
                         * If the create succeeded, it will have created
                         * a new reference to the vnode.  Give up the
                         * original reference.  The assertion should not
                         * get triggered because NBMAND locks only apply to
                         * VREG files.  And if in_crit is non-zero for some
                         * reason, detect that here, rather than when we
                         * deference a null vp.
                         */
                        ASSERT(in_crit == 0);
                        VN_RELE(vp);
                        vp = NULL;
                        goto out;
                }

the code at the out label does this, the pc is just past where things went south, specifically, it took the if(in_crit) condition and called nbl_end_crit(vp):
out:
        if (auditing)
                audit_vncreate_finish(*vpp, error);
        if (in_crit) {
                nbl_end_crit(vp);    <<<<<<<<<<<<<<<<<<<<<  (panic)
                in_crit = 0;
        }
        if (vp != NULL) {
                VN_RELE(vp);
                vp = NULL;
        }

and in nbl_end_crit() the code tries to exit the rw lock and we panic with addr = 60 (see description field of this bug), offset 60 in a vnode_t is the rwlock: v_nbllock
So, both the debug & non-debug panic's make sense.

Virtually the same code path is executed with a zfs fs that does not have nbmand set on it, the only difference is that in_crit is not set to 1, and at the out label, the code does not execute the nbl_end_crit(vp) call that panics things due to vp being NULL.

#3

Updated by Gordon Ross about 3 years ago

  • Status changed from New to In Progress
  • Assignee set to Gordon Ross
#4

Updated by Gordon Ross about 3 years ago

  • Status changed from In Progress to Pending RTI
#5

Updated by Electric Monk about 3 years ago

  • Status changed from Pending RTI to Closed
  • % Done changed from 0 to 100

git commit b3286c381cbd4e3f42795916faab84945132bbc6

commit  b3286c381cbd4e3f42795916faab84945132bbc6
Author: Kevin Crowe <kevin.crowe@nexenta.com>
Date:   2016-10-28T18:16:09.000Z

    6499 panic trying to write to the mountpoint of FS with nbmand=on
    Reviewed by: Gordon Ross <gwr@nexenta.com>
    Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
    Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
    Reviewed by: Toomas Soome <tsoome@me.com>
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Approved by: Matthew Ahrens <mahrens@delphix.com>

Also available in: Atom PDF