Bug #8648

closed

Fix range locking in ZIL commit codepath

Added by Prakash Surya about 6 years ago. Updated over 5 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: zfs - Zettabyte File System
Start date: 2017-09-11
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:
External Bug:

Description

I'm opening this bug to track integration of the following ZFS on Linux commit into illumos:

commit f763c3d1df569a8d6b60bcb5e95cf07aa7a189e6
Author: LOLi <loli10K@users.noreply.github.com>
Date:   Mon Aug 21 17:59:48 2017 +0200

    Fix range locking in ZIL commit codepath

    Since OpenZFS 7578 (1b7c1e5) if we have a ZVOL with logbias=throughput
    we will force WR_INDIRECT itxs in zvol_log_write() setting itx->itx_lr
    offset and length to the offset and length of the BIO from
    zvol_write()->zvol_log_write(): these offset and length are later used
    to take a range lock in zillog->zl_get_data function: zvol_get_data().

    Now suppose we have a ZVOL with blocksize=8K and push 4K writes to
    offset 0: we will only be range-locking 0-4096. This means the
    ASSERTion we make in dbuf_unoverride() is no longer valid because now
    dmu_sync() is called from zilog's get_data functions holding a partial
    lock on the dbuf.

    Fix this by taking a range lock on the whole block in zvol_get_data().

    Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
    Closes #6238
    Closes #6315
    Closes #6356
    Closes #6477
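
To illustrate the arithmetic behind the fix, here is a minimal, self-contained C sketch. It is not the zvol_get_data() code: block_start() and ranges_overlap() are hypothetical helpers introduced only for this example, and a power-of-two block size is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: round a byte offset down to the start of the
 * block that contains it. Assumes blocksize is a power of two.
 */
static uint64_t
block_start(uint64_t offset, uint64_t blocksize)
{
	return (offset & ~(blocksize - 1));
}

/*
 * Hypothetical helper: report whether two half-open byte ranges
 * [off, off + len) share at least one byte. Two range locks only
 * conflict with each other when their ranges overlap.
 */
static bool
ranges_overlap(uint64_t a_off, uint64_t a_len, uint64_t b_off, uint64_t b_len)
{
	return (a_off < b_off + b_len && b_off < a_off + a_len);
}
```

With an 8K block, a lock covering only bytes 0-1023 does not overlap a 1K write at offset 1024, so that write can dirty the same block while it is being synced. A lock covering block_start(0, 8192) through the full 8192 bytes does overlap it, which is the behaviour the commit message describes.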

Ezomori Nozomu provided the following reproducer that will trigger a panic:

$ sudo zpool create tank c2t{1,2,3}d0
$ sudo zfs create -o logbias=throughput -o sync=always -V 64M tank/zvol
$ sudo fio --filename=/dev/zvol/rdsk/tank/zvol --sync=1 --rw=write --bs=1K --numjobs=8 --iodepth=1 --size=10MB --name=panic

DEBUG disabled:

> ::status
debugging crash dump vmcore.0 (64-bit) from ps-trunk.dcenter
operating system: 5.11 dlpx-trunk_2017-09-09-02-00-41dacd64be (i86pc)
image uuid: 2fb6a759-0230-e9bc-b411-bd8fa8a3e7bf
panic message: BAD TRAP: type=e (#pf Page fault) rp=ffffff000e23c880 addr=20 occurred in module "zfs" due to a NULL pointer dereference
dump content: kernel pages only
> ::stack
buf_hash_remove+0x4c(ffffff03900654a8)
arc_change_state+0x32e(fffffffffbd29220, ffffff03900654a8, fffffffffbd2c1d0)
arc_release+0x288(ffffff0377cc0010, ffffff03792a88d8)
dbuf_unoverride+0x71(ffffff0398dd2640)
dbuf_redirty+0x26(ffffff0398dd2640)
dmu_buf_will_dirty+0x75(ffffff03792a88d8, ffffff039131b340)
dmu_write_uio_dnode+0x90(ffffff03911ed290, ffffff000e23ce60, 400, ffffff039131b340)
dmu_write_uio_dbuf+0x5d(ffffff038c89d108, ffffff000e23ce60, 400, ffffff039131b340)
zvol_write+0x15a(11d00000004, ffffff000e23ce60, ffffff039d3e88e0)
cdev_write+0x2d(11d00000004, ffffff000e23ce60, ffffff039d3e88e0)
spec_write+0x4c1(ffffff039d56c180, ffffff000e23ce60, 10, ffffff039d3e88e0, 0)
fop_write+0x5b(ffffff039d56c180, ffffff000e23ce60, 10, ffffff039d3e88e0, 0)
pwrite+0x194(3, 55f940, 400, ac00)
sys_syscall+0x177()

DEBUG enabled:

> ::status
debugging crash dump vmcore.1 (64-bit) from ps-trunk.dcenter
operating system: 5.11 dlpx-trunk_2017-09-09-02-00-41dacd64be (i86pc)
image uuid: 2abbd5a7-7ab1-ca9e-c6fa-b03b8546abcb
panic message: assertion failed: dr->dt.dl.dr_override_state != DR_IN_DMU_SYNC, file: ../../common/fs/zfs/dbuf.c, line: 1457
dump content: kernel pages only
> ::stack
vpanic()
0xfffffffffba99724()
dbuf_unoverride+0x11d(ffffff03877987c0)
dbuf_redirty+0x3d(ffffff03877987c0)
dmu_buf_will_dirty+0x8d(ffffff0386ad5c28, ffffff0378a79880)
dmu_write_uio_dnode+0xff(ffffff038e18f050, ffffff001020be60, 400, ffffff0378a79880)
dmu_write_uio_dbuf+0x5d(ffffff038cf0f260, ffffff001020be60, 400, ffffff0378a79880)
zvol_write+0x15a(11d00000004, ffffff001020be60, ffffff038ae196a8)
cdev_write+0x2d(11d00000004, ffffff001020be60, ffffff038ae196a8)
spec_write+0x4c1(ffffff0399847800, ffffff001020be60, 10, ffffff038ae196a8, 0)
fop_write+0x5b(ffffff0399847800, ffffff001020be60, 10, ffffff038ae196a8, 0)
pwrite+0x194(3, 55fac0, 400, 400)
sys_syscall+0x177()

#1

Updated by Prakash Surya about 6 years ago

  • Description updated
#2

Updated by Electric Monk about 6 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit 42b14111721da2ebd5159e7b45012a3eb0e3384c

commit  42b14111721da2ebd5159e7b45012a3eb0e3384c
Author: LOLi <loli10K@users.noreply.github.com>
Date:   2017-09-22T02:19:23.000Z

    8648 Fix range locking in ZIL commit codepath
    Reviewed by: Igor Kozhukhov <igor@dilos.org>
    Reviewed by: Matt Ahrens <mahrens@delphix.com>
    Reviewed by: Andriy Gapon <avg@FreeBSD.org>
    Reviewed by: Alexander Motin <mav@FreeBSD.org>
    Approved by: Robert Mustacchi <rm@joyent.com>
