Bug #6287

Panic destroying the pool using file backing store on FS with nbmand=on

Added by Jan Kryl almost 5 years ago. Updated almost 5 years ago.

Status: In Progress
Priority: Normal
Assignee:
Category: kernel
Start date: 2015-10-06
Due date:
% Done: 100%
Estimated time: 16.00 h
Difficulty: Medium
Tags:
Gerrit CR:

Description

There is a specific scenario that works with nbmand=off but panics a debug kernel with nbmand=on.

> ::status
debugging crash dump vmcore.1 (64-bit) from atomic1
operating system: 5.11 NexentaStor_5.0.0.19:fcb4e039c5:debug (i86pc)
image uuid: ae03b3b7-dfaa-6b91-8144-ac70c0d8a6fe
panic message: assertion failed: vp->v_shrlocks == NULL, file: ../../common/fs/vnode.c, line: 2420
dump content: kernel pages only

Panic stack:

> ::stack
vpanic()
0xfffffffffbe0c7c8()
vn_free+0x36(ffffff2d37654e00)
zfs_znode_cache_destructor+0x4b(ffffff23f4befde0, 0)
kmem_cache_free_debug+0x214(ffffff2312cd8648, ffffff23f4befde0, fffffffff78d6edb)
kmem_cache_free+0x153(ffffff2312cd8648, ffffff23f4befde0)
zfs_znode_free+0x8b(ffffff23f4befde0)
zfs_zinactive+0xd6(ffffff23f4befde0)
zfs_inactive+0x75(ffffff2d37654e00, ffffff235f2252b0, 0)
fop_inactive+0x76(ffffff2d37654e00, ffffff235f2252b0, 0)
vn_rele_dnlc+0xa2(ffffff2d37654e00)
dnlc_purge_vfsp+0x1a2(ffffff2357995678, 0)
dounmount+0x50(ffffff2357995678, 400, ffffff235f2252b0)
umount2_engine+0x96(ffffff2357995678, 400, ffffff235f2252b0, 1)
umount2+0x163(9829608, 400)
_sys_sysenter_post_swapgs+0x237()

Steps to reproduce:

zpool create -O nbmand=on tpool1 c2t5000C50020E4989Bd0
mkfile 128M /tpool1/a
mkfile 128M /tpool1/b
zpool create tpool1_file mirror /tpool1/a /tpool1/b
zpool export tpool1_file
zpool destroy tpool1

History

#1

Updated by Jan Kryl almost 5 years ago

  • Tags deleted (needs-triage)
  • % Done changed from 0 to 100
  • Status changed from New to In Progress

Background:

When a filesystem is mounted with the nbmand=on option, each open of a file creates a so-called "share reservation" on the file. Share reservations enforce Windows-like filesystem semantics, which use mandatory locking rather than advisory locking by default. Each open of a file therefore places a reservation on that file, which prevents conflicting mandatory locks from being acquired. When the file is closed, the share reservation is removed. The system recognises which share reservation to remove on close by looking it up by PID and SYSID. SYSID is used if the lock is acquired on behalf of a remote host; for local processes the SYSID is set to zero. When a vnode is destroyed in the kernel (its reference count drops to zero), it is expected to have an empty list of reservations: a v_count of zero implies that the file is not open, and a file that is not open should have no reservations.
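The bookkeeping described above can be illustrated with a small user-space model. This is only a sketch: the `toy_*` names and structures are hypothetical stand-ins, not the kernel's own types or functions. It models a per-vnode list of reservations keyed by (PID, SYSID), with one entry added on open and matching entries removed on close.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a share reservation; not the kernel struct. */
struct toy_shr {
	int pid;		/* owning process; 0 for kernel threads */
	int sysid;		/* 0 for local owners */
	struct toy_shr *next;
};

/* Hypothetical stand-in for a vnode; only the reservation list is modelled. */
struct toy_vnode {
	struct toy_shr *shrlocks;
};

/* Open path: record a reservation owned by (pid, sysid). */
bool
toy_add_share(struct toy_vnode *vp, int pid, int sysid)
{
	struct toy_shr *s;

	assert(vp != NULL);
	s = malloc(sizeof (*s));
	if (s == NULL)
		return (false);
	s->pid = pid;
	s->sysid = sysid;
	s->next = vp->shrlocks;
	vp->shrlocks = s;
	return (true);
}

/* Close path: drop every reservation owned by (pid, sysid). */
int
toy_clean_shares(struct toy_vnode *vp, int pid, int sysid)
{
	struct toy_shr **pp = &vp->shrlocks;
	int removed = 0;

	while (*pp != NULL) {
		struct toy_shr *s = *pp;

		if (s->pid == pid && s->sysid == sysid) {
			*pp = s->next;
			free(s);
			removed++;
		} else {
			pp = &s->next;
		}
	}
	return (removed);
}

/* The condition a vnode is expected to satisfy when it is destroyed. */
bool
toy_vnode_is_clean(const struct toy_vnode *vp)
{
	return (vp->shrlocks == NULL);
}
```

In this model, a close issued with the same (PID, SYSID) pair as the open leaves the list empty, which is the state the assertion in the panic message expects.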

Problem:

When a file on a filesystem with nbmand=on is used for a vdev, a new reservation is created on the file. Since the underlying file for the vdev is opened on behalf of a ZFS taskq thread, the PID recorded in the share reservation is zero. When the pool is destroyed, the file vdev is destroyed too, which results in a call to fop_close() and vn_rele on the underlying vnode of the vdev. From fop_close we get to zfs_close, which finally calls cleanshares to clean the reservation. In this case, however, zfs_close is called on behalf of the user process that runs zpool destroy, so the PID is not zero. Hence the matching share reservation is not found on the vnode, and the orphaned reservation triggers the assertion panic when the vnode is destroyed later on.
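The mismatch can be replayed in a few lines of user-space C. This is a hedged sketch with hypothetical `toy_*` names, not kernel code. SYSID is left out because both parties are local, and the PID 1234 used below is an arbitrary example value for the zpool process.

```c
#include <assert.h>

#define	TOY_MAX_SHR	8

/* Hypothetical model of a file that tracks reservation owners by PID. */
struct toy_file {
	int owner_pid[TOY_MAX_SHR];
	int nshr;
};

/* Open: record a reservation under the opener's PID. */
static void
toy_open(struct toy_file *f, int pid)
{
	assert(f->nshr < TOY_MAX_SHR);
	f->owner_pid[f->nshr++] = pid;
}

/* Close: remove only the reservations whose PID matches the closer. */
static void
toy_close(struct toy_file *f, int pid)
{
	int i, kept = 0;

	for (i = 0; i < f->nshr; i++) {
		if (f->owner_pid[i] != pid)
			f->owner_pid[kept++] = f->owner_pid[i];
	}
	f->nshr = kept;
}

/* Open as one PID, close as another, and count what is left behind. */
int
toy_orphans_after_close(int opener_pid, int closer_pid)
{
	struct toy_file f = { {0}, 0 };

	toy_open(&f, opener_pid);
	toy_close(&f, closer_pid);
	return (f.nshr);
}
```

Calling `toy_orphans_after_close(0, 1234)` models the taskq thread opening the file and the user process closing it, and leaves one orphaned reservation. Calling `toy_orphans_after_close(0, 0)` leaves none.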

Solution:

A possible solution, which I have successfully tested, is to pass a caller context structure with the PID set to zero to fop_close from vdev_file_close(). zfs_close then reads this structure and overrides the PID of the current process with the PID from the caller context structure.
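The idea can be sketched as follows. This is not the tested patch: `struct toy_caller_ctx` and the `toy_*` functions are hypothetical simplifications, and the sketch assumes that a missing context means "use the current process PID".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced caller context: only the PID is modelled. */
struct toy_caller_ctx {
	int pid;
};

/*
 * Pick the PID used to look up the reservation on close. With no caller
 * context the current process PID is used; otherwise the context wins.
 */
int
toy_close_pid(const struct toy_caller_ctx *ct, int curproc_pid)
{
	assert(curproc_pid >= 0);
	return (ct != NULL ? ct->pid : curproc_pid);
}

/* True if the close would find a reservation recorded under reserved_pid. */
bool
toy_close_finds_reservation(int reserved_pid,
    const struct toy_caller_ctx *ct, int curproc_pid)
{
	return (toy_close_pid(ct, curproc_pid) == reserved_pid);
}
```

With a reservation recorded under PID zero, a close from an example user process with PID 1234 and no context misses it, while the same close with a context carrying PID zero finds it.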
