Bug #6499
closedpanic trying to write to the mountpoint of FS with nbmand=on
100%
Description
Getting the following panic trying to write (using echo, dd, ...) to the mounpoint (itself, not a file or dir inside) of the FS with nbmand=on:
> ::panicinfo cpu 5 thread ffffff02f0136840 message assertion failed: in_crit == 0, file: ../../common/fs/vnode.c, line: 1494 rdi fffffffffbf592a8 rsi ffffff00109d98d0 rdx fffffffffbf46646 rcx 5d6 r8 80 r9 ffffff00109d9c78 rax ffffff00109d98f0 rbx ffffff030e7c6900 rbp ffffff00109d9930 r10 1 r11 2 r12 ffffff00109d9c78 r13 1 r14 ffffff00109d9aa0 r15 0 fsbase 0 gsbase ffffff02ee350580 ds 4b es 4b fs 0 gs 1c3 trapno 0 err 0 rip fffffffffb86c1c0 cs 30 rflags 282 rsp ffffff00109d98c8 ss 38 gdt_hi 0 gdt_lo 400001ef idt_hi 0 idt_lo 60000fff ldt 0 task 70 cr0 8005003b cr2 80e7590 cr3 1d23ec000 cr4 406b8 > ::stack vpanic() 0xfffffffffbe0ef48() vn_createat+0x7a3(81499a8, 0, ffffff00109d9bc0, 0, 80, ffffff00109d9c78) vn_openat+0x1be(81499a8, 0, 2302, 1b6, ffffff00109d9df0, 0) copen+0x20c(ffd19553, 81499a8, 2302, 1b6) openat64+0x2a(ffd19553, 81499a8, 301, 1b6) open64+0x25(81499a8, 301, 1b6) sys_syscall32+0x1f7() >
Steps to reproduce:
# zfs create -o mountpoint=/nbmand -o nbmand=on rpool/nbmand # echo 1 > /nbmand
Updated by Gordon Ross almost 7 years ago
Analysis (and fix) from Kevin Crowe
With a debug kernel this is panicing in vnode.c in an assertion on line 1494: assertion failed in_crit == 0
That's in the function vn_createat()
Specifically, in this section of code that would bother the code review nazi's:
/* * In general we want to generate EROFS if the file system is * readonly. However, POSIX (IEEE Std. 1003.1) section 5.3.1 * documents the open system call, and it says that O_CREAT has no * effect if the file already exists. Bug 1119649 states * that open(path, O_CREAT, ...) fails when attempting to open an * existing file on a read only file system. Thus, the first part * of the following if statement has 3 checks: * if the file exists && * it is being open with write access && * the file system is read only * then generate EROFS */ if ((*vpp != NULL && (mode & VWRITE) && ISROFILE(*vpp)) || (*vpp == NULL && dvp->v_vfsp->vfs_flag & VFS_RDONLY)) { if (*vpp) VN_RELE(*vpp); error = EROFS; } else if (excl == NONEXCL && *vpp != NULL) {
it's the "else if" branch. Interesting, there's spots for it to bail if nbmand locks are taken:
/* * File already exists. If a mandatory lock has been * applied, return error. */ vp = *vpp; if (VOP_REALVP(vp, &rvp, NULL) != 0) rvp = vp; if ((vap->va_mask & AT_SIZE) && nbl_need_check(vp)) { nbl_start_crit(vp, RW_READER); in_crit = 1; <<<<< not sure if in_crit is getting set here or elsewhere yet.... } if (rvp->v_filocks != NULL || rvp->v_shrlocks != NULL) { vattr.va_mask = AT_MODE|AT_SIZE; if (error = VOP_GETATTR(vp, &vattr, 0, CRED(), NULL)) { goto out; } if (MANDLOCK(vp, vattr.va_mode)) { error = EAGAIN; goto out; } /* * File cannot be truncated if non-blocking mandatory * locks are currently on the file. */ if ((vap->va_mask & AT_SIZE) && in_crit) { u_offset_t offset; ssize_t length; offset = vap->va_size > vattr.va_size ? vattr.va_size : vap->va_size; length = vap->va_size > vattr.va_size ? vap->va_size - vattr.va_size : vattr.va_size - vap->va_size; if (nbl_conflict(vp, NBL_WRITE, offset, length, 0, NULL)) { error = EACCES; goto out; } } }
Regardless it hits the 2nd assert in this conditional block:
/* * If the file is the root of a VFS, we've crossed a * mount point and the "containing" directory that we * acquired above (dvp) is irrelevant because it's in * a different file system. We apply VOP_CREATE to the * target itself instead of to the containing directory * and supply a null path name to indicate (conventionally) * the node itself as the "component" of interest. * * The intercession of the file system is necessary to * ensure that the appropriate permission checks are * done. */ if (vp->v_flag & VROOT) { ASSERT(why != CRMKDIR); error = VOP_CREATE(vp, "", vap, excl, mode, vpp, CRED(), flag, NULL, NULL); /* * If the create succeeded, it will have created * a new reference to the vnode. Give up the * original reference. The assertion should not * get triggered because NBMAND locks only apply to * VREG files. And if in_crit is non-zero for some * reason, detect that here, rather than when we * deference a null vp. */ ASSERT(in_crit == 0); VN_RELE(vp); vp = NULL; goto out; }
Updated by Gordon Ross almost 7 years ago
More from Kevin:
OK, the non-debug panic makes sense too.
In vn_createat() it's in this code where it tried to create the file (VOP_CREATE) and because of earlier code in vn_createat() in_crit is set to 1.
When it tries to create the file - which just so happens to be the mount point of the zfs fs an error is returned. Ultimately the vnode is
released (VN_RELE) and vp gets set to NULL and it does a 'goto out':
if (vp->v_flag & VROOT) { ASSERT(why != CRMKDIR); error = VOP_CREATE(vp, "", vap, excl, mode, vpp, CRED(), flag, NULL, NULL); /* * If the create succeeded, it will have created * a new reference to the vnode. Give up the * original reference. The assertion should not * get triggered because NBMAND locks only apply to * VREG files. And if in_crit is non-zero for some * reason, detect that here, rather than when we * deference a null vp. */ ASSERT(in_crit == 0); VN_RELE(vp); vp = NULL; goto out; }
the code at the out label does this, the pc is just past where things went south, specifically, it took the if(in_crit) condition and called nbl_end_crit(vp):
out: if (auditing) audit_vncreate_finish(*vpp, error); if (in_crit) { nbl_end_crit(vp); <<<<<<<<<<<<<<<<<<<<< (panic) in_crit = 0; } if (vp != NULL) { VN_RELE(vp); vp = NULL; }
and in nbl_end_crit() the code tries to exit the rw lock and we panic with addr = 60 (see description field of this bug), offset 60 in a vnode_t is the rwlock: v_nbllock
So, both the debug & non-debug panic's make sense.
Virtually the same code path is executed with a zfs fs that does not have nbmand set on it, the only difference is that in_crit is not set to 1, and at the out label, the code does not execute the nbl_end_crit(vp) call that panics things due to vp being NULL.
Updated by Gordon Ross almost 7 years ago
- Status changed from New to In Progress
- Assignee set to Gordon Ross
Updated by Gordon Ross almost 7 years ago
- Status changed from In Progress to Pending RTI
Updated by Electric Monk almost 7 years ago
- Status changed from Pending RTI to Closed
- % Done changed from 0 to 100
git commit b3286c381cbd4e3f42795916faab84945132bbc6
commit b3286c381cbd4e3f42795916faab84945132bbc6 Author: Kevin Crowe <kevin.crowe@nexenta.com> Date: 2016-10-28T18:16:09.000Z 6499 panic trying to write to the mountpoint of FS with nbmand=on Reviewed by: Gordon Ross <gwr@nexenta.com> Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: Robert Mustacchi <rm@joyent.com> Approved by: Matthew Ahrens <mahrens@delphix.com>