Bug #3189

kernel panic in ZFS test suite during hotspare_onoffline_004_neg

Added by Christopher Siden about 7 years ago. Updated almost 7 years ago.

Status:
Resolved
Priority:
Normal
Category:
zfs - Zettabyte File System
Start date:
2012-09-14
Due date:
% Done:

100%

Estimated time:
Difficulty:
Tags:
needs-triage

Description

panic[cpu1]/thread=ffffff01fe5d3860:  
assertion failed: refcount_count(&tx->tx_space_written) + delta <= tx->tx_space_towrite (0x360000 <= 0x300000), file: ../../common/fs/zfs/dmu_tx.c, line: 1127

ffffff00094c8d20 genunix:assfail3+b0 ()
ffffff00094c8d60 zfs:dmu_tx_willuse_space+a4 ()
ffffff00094c8da0 zfs:dnode_willuse_space+5a ()
ffffff00094c8e40 zfs:dbuf_dirty+3fa ()
ffffff00094c8ee0 zfs:dbuf_dirty+6e7 ()
ffffff00094c8f80 zfs:dbuf_dirty+6e7 ()
ffffff00094c9020 zfs:dbuf_dirty+6e7 ()
ffffff00094c90c0 zfs:dbuf_dirty+6e7 ()
ffffff00094c9160 zfs:dbuf_dirty+6e7 ()
ffffff00094c9200 zfs:dbuf_dirty+6e7 ()
ffffff00094c9260 zfs:dnode_setdirty+1ff ()
ffffff00094c9300 zfs:dbuf_dirty+7d9 ()
ffffff00094c93a0 zfs:dbuf_dirty+6e7 ()
ffffff00094c93e0 zfs:dbuf_will_dirty+84 ()
ffffff00094c94b0 zfs:dnode_free_range+310 ()
ffffff00094c9540 zfs:dmu_free_long_range_impl+16e ()
ffffff00094c95a0 zfs:dmu_free_long_range+5a ()
ffffff00094c9650 zfs:zfs_trunc+73 ()  
ffffff00094c9780 zfs:zfs_freesp+fc () 
ffffff00094c9890 zfs:zfs_create+682 ()
ffffff00094c9940 genunix:fop_create+fc ()
ffffff00094c9b10 genunix:vn_createat+625 ()
ffffff00094c9ce0 genunix:vn_openat+21f ()
ffffff00094c9e50 genunix:copen+49e () 
ffffff00094c9e80 genunix:openat64+2d ()
ffffff00094c9eb0 genunix:open64+2e () 
ffffff00094c9f00 unix:brand_sys_sysenter+2b7 ()
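
For readers unfamiliar with the check, the comparison in the assertion message above can be sketched as follows. This is only an illustration of the arithmetic, not the code in dmu_tx.c; the function names `tx_charge_fits` and `tx_charge_overrun` are hypothetical.

```c
#include <stdint.h>

/*
 * Illustrative only: mirrors the shape of the failed check, not the
 * actual dmu_tx.c code. "charged" stands for
 * refcount_count(&tx->tx_space_written) + delta, and "reserved" stands
 * for tx->tx_space_towrite.
 */
static int
tx_charge_fits(uint64_t charged, uint64_t reserved)
{
	return (charged <= reserved);
}

/* Bytes by which the charge exceeds the reservation, or 0 if it fits. */
static uint64_t
tx_charge_overrun(uint64_t charged, uint64_t reserved)
{
	return (charged > reserved ? charged - reserved : 0);
}
```

With the values from the panic, `tx_charge_fits(0x360000, 0x300000)` is false and `tx_charge_overrun(0x360000, 0x300000)` is `0x60000`, so the transaction tried to dirty 384 KiB more than it had reserved.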

This failure occurs consistently and first appeared in:

commit 31495a1e56860f4575614774a592fe33fc9c71f2
Author: Arne Jansen <sensille@gmx.net>
Date:   Thu Aug 30 03:32:10 2012 -0700

    1862 incremental zfs receive fails for sparse file  > 8PB
    Reviewed by: Matt Ahrens <matthew.ahrens@delphix.com>
    Reviewed by: Simon Klinkert <klinkert@webgods.de>
    Approved by: Eric Schrock <eric.schrock@delphix.com>

I am still looking into the root cause.


Files

panic1.jpg (108 KB), Yuri Pankov, 2012-09-16 01:20 AM
panic2.jpg (91.1 KB), Yuri Pankov, 2012-09-16 01:20 AM

History

#1

Updated by Matthew Ahrens about 7 years ago

diffs for 1862: http://cr.illumos.org/~webrev/sensille/count_free/usr/src/uts/common/fs/zfs/dmu_tx.c.sdiff.html

I don't understand why we are ">> epbs" here:

                        txh->txh_memory_tohold += MIN(blkcnt, (nl1blks >> epbs))
                            << dn->dn_indblkshift;

It seems like the worst case should be the lesser of (the blocks spanned at this level (blkcnt)) and (the number of level-1 blocks actually touched (nl1blks)). Doing "nl1blks >> epbs" assumes that the level-1 blocks are all packed tightly into level-2's.
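
A small sketch of the difference between the two bounds, using hypothetical helper names (`indirects_packed`, `indirects_scattered`) and made-up inputs. It only reproduces the arithmetic of the quoted expression, not the surrounding loop in dmu_tx_count_free().

```c
#include <stdint.h>

static uint64_t
min_u64(uint64_t a, uint64_t b)
{
	return (a < b ? a : b);
}

/*
 * Bound as written in the 1862 change: dividing by the number of
 * entries per indirect block (1 << epbs) assumes the touched level-1
 * blocks are packed densely under their level-2 parents.
 */
static uint64_t
indirects_packed(uint64_t blkcnt, uint64_t nl1blks, int epbs)
{
	return (min_u64(blkcnt, nl1blks >> epbs));
}

/*
 * Bound suggested above: in the worst case every touched level-1
 * block sits under a different parent, so no division is applied.
 */
static uint64_t
indirects_scattered(uint64_t blkcnt, uint64_t nl1blks)
{
	return (min_u64(blkcnt, nl1blks));
}
```

For example, assuming epbs = 7, blkcnt = 10 and nl1blks = 3, `indirects_packed(10, 3, 7)` yields 0 while `indirects_scattered(10, 3)` yields 3, so the packed variant reserves nothing for three level-1 blocks that may each have a separate parent.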

#2

Updated by Arne Jansen about 7 years ago

Matthew Ahrens wrote:

diffs for 1862: http://cr.illumos.org/~webrev/sensille/count_free/usr/src/uts/common/fs/zfs/dmu_tx.c.sdiff.html

I don't understand why we are ">> epbs" here:

[...]

It seems like the worst case should be the lesser of (the blocks spanned at this level (blkcnt)) and (the number of level-1 blocks actually touched (nl1blks)). Doing "nl1blks >> epbs" assumes that the level-1 blocks are all packed tightly into level-2's.

I'd second that. Christopher, can you easily test if taking out ">> epbs" fixes the problem?

#3

Updated by Christopher Siden about 7 years ago

That fix lets it go further, but it still fails the same assertion during a later test:

Running root test case:  zvol_misc_003_neg | PASS
Running root test case:  zvol_misc_004_pos | 
panic[cpu2]/thread=ffffff02015d47a0: assertion failed: refcount_count(&tx->tx_space_written) + delta <= tx->tx_space_towrite (0x3c0000 <= 0x360000), file: ../../common/fs/zfs/dmu_tx.c, line: 1127

ffffff00096a3df0 genunix:assfail3+b0 ()
ffffff00096a3e30 zfs:dmu_tx_willuse_space+a4 ()
ffffff00096a3e70 zfs:dnode_willuse_space+5a ()
ffffff00096a3f10 zfs:dbuf_dirty+3fa ()
ffffff00096a3fb0 zfs:dbuf_dirty+6e7 ()
ffffff00096a4050 zfs:dbuf_dirty+6e7 ()
ffffff00096a40f0 zfs:dbuf_dirty+6e7 ()
ffffff00096a4190 zfs:dbuf_dirty+6e7 ()
ffffff00096a4230 zfs:dbuf_dirty+6e7 ()
ffffff00096a42d0 zfs:dbuf_dirty+6e7 ()
ffffff00096a4330 zfs:dnode_setdirty+1ff ()
ffffff00096a43d0 zfs:dbuf_dirty+7d9 ()
ffffff00096a4470 zfs:dbuf_dirty+6e7 ()
ffffff00096a4510 zfs:dbuf_dirty+6e7 ()
ffffff00096a4550 zfs:dbuf_will_dirty+84 ()
ffffff00096a4620 zfs:dnode_free_range+310 ()
ffffff00096a46b0 zfs:dmu_free_long_range_impl+16e ()
ffffff00096a4710 zfs:dmu_free_long_range+5a ()
ffffff00096a47a0 zfs:zvol_dump_init+7c ()
ffffff00096a47f0 zfs:zvol_dumpify+157 ()
ffffff00096a48f0 zfs:zvol_ioctl+22f ()
ffffff00096a4990 zfs:zfsdev_ioctl+42e ()
ffffff00096a49d0 genunix:cdev_ioctl+45 ()
ffffff00096a4a10 specfs:spec_ioctl+5a ()
ffffff00096a4a90 genunix:fop_ioctl+7b ()
ffffff00096a4bf0 genunix:dumpinit+3aa ()
ffffff00096a4cb0 dump:dump_ioctl+36a ()
ffffff00096a4cf0 genunix:cdev_ioctl+45 ()
ffffff00096a4d30 specfs:spec_ioctl+5a ()
ffffff00096a4db0 genunix:fop_ioctl+7b ()
ffffff00096a4eb0 genunix:ioctl+18e ()
ffffff00096a4f00 unix:brand_sys_sysenter+2b7 ()

#4

Updated by Yuri Pankov about 7 years ago

I hope the attached images (panic1.jpg, panic2.jpg) will be useful.

#5

Updated by Matthew Ahrens almost 7 years ago

In the latest panic we are in this code in dnode_free_range():

    /*
     * Always dirty the first and last indirect to make sure
     * we dirty all the partial indirects.
     */
    if (dn->dn_nlevels > 1) {
        uint64_t i, first, last;
        int shift = epbs + dn->dn_datablkshift;

        first = blkid >> epbs;
        if (db = dbuf_hold_level(dn, 1, first, FTAG)) {
            dbuf_will_dirty(db, tx);
            dbuf_rele(db, FTAG);
        }

In this case the dnode has dn_nlevels=4 but dn_maxblkid=0 (it must have been previously cleared out with the same dnode_free_range(everything) that we are doing now).

The old dmu_tx_count_free() handled this because it always assumed that at least one indirect block on every level will be dirtied (which is what the code in dnode_free_range() actually does). But the new dmu_tx_count_free() only counts indirects if it finds some that will need to be modified (i.e. nl1blks > 0).

#6

Updated by Arne Jansen almost 7 years ago

So we arrive at something like

                        txh->txh_memory_tohold += MIN(MIN(blkcnt, nl1blks), 1)
                            << dn->dn_indblkshift;

2 bugs in one line...

#7

Updated by Arne Jansen almost 7 years ago

                        txh->txh_memory_tohold += MAX(MIN(blkcnt, nl1blks), 1)
                            << dn->dn_indblkshift;

is of course what I meant. I really need to get the zfs test suite running here...
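
To make the difference between the expressions in comments #6 and #7 concrete, here is a minimal sketch. The helper names are hypothetical, and the indirect block shift of 14 (16 KiB) used in the example values is an assumption for illustration only.

```c
#include <stdint.h>

static uint64_t
min_u64(uint64_t a, uint64_t b)
{
	return (a < b ? a : b);
}

static uint64_t
max_u64(uint64_t a, uint64_t b)
{
	return (a > b ? a : b);
}

/*
 * Expression as typed in comment #6: the outer MIN caps the count at
 * one block and still yields 0 when nl1blks is 0.
 */
static uint64_t
hold_blocks_comment6(uint64_t blkcnt, uint64_t nl1blks)
{
	return (min_u64(min_u64(blkcnt, nl1blks), 1));
}

/*
 * Corrected expression from comment #7: at least one indirect block
 * per level is always counted, matching what dnode_free_range()
 * dirties even when no level-1 blocks are found.
 */
static uint64_t
hold_blocks_comment7(uint64_t blkcnt, uint64_t nl1blks)
{
	return (max_u64(min_u64(blkcnt, nl1blks), 1));
}

/* Bytes added to the hold estimate for one level of indirection. */
static uint64_t
hold_bytes(uint64_t nblks, int indblkshift)
{
	return (nblks << indblkshift);
}
```

With blkcnt = 10 and nl1blks = 0 (the dn_maxblkid=0 case described in comment #5), `hold_blocks_comment6` gives 0 and `hold_blocks_comment7` gives 1, which is 16384 bytes at a shift of 14. With nl1blks = 3, the comment #6 form gives 1 and the corrected form gives 3.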

#8

Updated by Christopher Siden almost 7 years ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100

changeset: 13818:e9ad0a945d45
tag: tip
user: Christopher Siden
date: Wed Sep 19 15:53:16 2012 -0400

description:
3189 kernel panic in ZFS test suite during hotspare_onoffline_004_neg
Reviewed by: Matthew Ahrens
Reviewed by: Arne Jansen
Approved by: Dan McDonald

modified:
usr/src/lib/libzfs/common/llib-lzfs
usr/src/uts/common/fs/zfs/dmu_tx.c
