Bug #10572

Fix race in dnode_check_slots_free()

Added by Toomas Soome 6 months ago. Updated 6 months ago.

Status: Closed
Priority: Normal
Assignee:
Category: zfs - Zettabyte File System
Start date: 2019-03-20
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:

Description

The fix from ZoL:

Currently, dnode_check_slots_free() works by checking dn->dn_type
in the dnode to determine if the dnode is reclaimable. However,
there is a small window of time between dnode_free_sync() in the
first call to dsl_dataset_sync() and when the useraccounting code
is run when the type is set DMU_OT_NONE, but the dnode is not yet
evictable, leading to crashes. This patch adds the ability for
dnodes to track which txg they were last dirtied in and adds a
check for this before performing the reclaim.

This patch also corrects several instances when dn_dirty_link was
treated as a list_node_t when it is technically a multilist_node_t.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #7147
Closes #7388
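
The check described in the commit message can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the code from the patch: `toy_dnode_t`, `last_dirtied_txg`, `toy_dnode_set_dirty()` and both `toy_slot_free_*()` functions are hypothetical names, and the real dnode structure and its locking are omitted. It contrasts a type-only test with one that also refuses to reclaim a dnode dirtied in the txg that is still syncing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for object types; not the real DMU_OT_* values. */
enum { TOY_OT_NONE = 0, TOY_OT_PLAIN_FILE = 1 };

/* Hypothetical, heavily reduced dnode: only the two fields the check needs. */
typedef struct toy_dnode {
	int		type;			/* TOY_OT_NONE once freed */
	uint64_t	last_dirtied_txg;	/* txg of the last dirtying */
} toy_dnode_t;

/* Record the txg every time the dnode is dirtied. */
void
toy_dnode_set_dirty(toy_dnode_t *dn, uint64_t txg)
{
	assert(dn != NULL);
	dn->last_dirtied_txg = txg;
}

/* Old behaviour: a slot counts as free as soon as the type is cleared. */
bool
toy_slot_free_type_only(const toy_dnode_t *dn)
{
	assert(dn != NULL);
	return (dn->type == TOY_OT_NONE);
}

/*
 * New behaviour: the type must be cleared AND the dnode must not have
 * been dirtied in the txg that is currently syncing (or a later one).
 */
bool
toy_slot_free_with_txg(const toy_dnode_t *dn, uint64_t syncing_txg)
{
	assert(dn != NULL);
	return (dn->type == TOY_OT_NONE &&
	    dn->last_dirtied_txg < syncing_txg);
}
```

In this model, a dnode dirtied in txg 10 and then freed (type cleared) while txg 10 is still syncing passes the type-only test but fails the txg-aware one; once the syncing txg advances to 11, both tests agree that the slot is free.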

This issue also seems to affect zil replay:

panic[cpu1]/thread=ffffff0272a28440:
assertion failed: dmu_object_claim_dnsize(zfsvfs->z_os, obj, DMU_OT_PLAIN_FILE_CONTENTS, 0, obj_type, bonuslen, dnodesize, tx) == 0 (0x1c == 0x0), file: ../../common/fs/zfs/zfs_znode.c, line: 854

ffffff00085e4280 genunix:process_type+168005 ()
ffffff00085e43f0 zfs:zfs_mknode+6db ()
ffffff00085e4510 zfs:zfs_create+6b4 ()
ffffff00085e45b0 genunix:fop_create+c7 ()
ffffff00085e4770 zfs:zfs_replay_create+35f ()
ffffff00085e47d0 zfs:zil_replay_log_record+e2 ()
ffffff00085e49d0 zfs:zil_parse+23c ()
ffffff00085e4a40 zfs:zil_replay+b1 ()
ffffff00085e4a80 zfs:zfsvfs_setup+10d ()
ffffff00085e4af0 zfs:zfs_domount+19f ()
ffffff00085e4c20 zfs:zfs_mount+24f ()
ffffff00085e4c50 genunix:fsop_mount+1e ()
ffffff00085e4de0 genunix:domount+9d8 ()
ffffff00085e4e70 genunix:mount+167 ()
ffffff00085e4eb0 genunix:syscall_ap+8e ()
ffffff00085e4f00 unix:brand_sys_sysenter+2c6 ()

In the panic message above, the return code is 0x1c (ENOSPC) where 0 was expected; the error comes from dnode_hold_impl().
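
As a quick cross-check of the error code, the following sketch formats an errno value the way the assertion message prints it. The helper name `errno_to_hex()` is hypothetical, and the sketch assumes the platform defines ENOSPC as 28, as illumos and Linux do.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Format an errno value as lowercase hex with a 0x prefix, matching the
 * "(0x1c == 0x0)" style of the assertion message. Returns the value
 * returned by snprintf().
 */
int
errno_to_hex(int err, char *buf, size_t len)
{
	return (snprintf(buf, len, "0x%x", err));
}
```

Calling `errno_to_hex(ENOSPC, buf, sizeof (buf))` on such a platform fills `buf` with the same `0x1c` seen in the panic.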


Related issues

Related to illumos gate - Bug #10579: Don't allow dnode allocation if dn_holds != 0 (Closed, 2019-03-22)

History

#1

Updated by Toomas Soome 6 months ago

  • Description updated

#2

Updated by Dan McDonald 6 months ago

  • Related to Bug #10579: Don't allow dnode allocation if dn_holds != 0 added

#4

Updated by Electric Monk 6 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 90 to 100

git commit aa02ea01948372a32cbf08bfc31c72c32e3fc81e

commit  aa02ea01948372a32cbf08bfc31c72c32e3fc81e
Author: Tom Caputi <tcaputi@datto.com>
Date:   2019-03-27T13:27:22.000Z

    10572 Fix race in dnode_check_slots_free()
    10579 Don't allow dnode allocation if dn_holds != 0
    Reviewed by: Kody Kantor <kody.kantor@joyent.com>
    Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
    Approved by: Dan McDonald <danmcd@joyent.com>
