Bug #10572

Updated by Toomas Soome over 3 years ago

The fix from ZoL:
 Currently, dnode_check_slots_free() works by checking dn->dn_type 
 in the dnode to determine if the dnode is reclaimable. However, 
 there is a small window of time between dnode_free_sync() in the 
 first call to dsl_dataset_sync() and when the useraccounting code 
 is run when the type is set DMU_OT_NONE, but the dnode is not yet 
 evictable, leading to crashes. This patch adds the ability for 
 dnodes to track which txg they were last dirtied in and adds a 
 check for this before performing the reclaim. 

 This patch also corrects several instances when dn_dirty_link was 
 treated as a list_node_t when it is technically a multilist_node_t. 

 Reviewed-by: Brian Behlendorf <> 
 Signed-off-by: Tom Caputi <> 
 Closes #7147 
 Closes #7388 
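
 The check described in the commit message can be modeled roughly as follows. This is only a sketch under assumptions: the struct, the enum and the function names are hypothetical stand-ins, not the real ZFS definitions, and the exact comparison used by the actual patch may differ. It only illustrates why a slot whose type is already DMU_OT_NONE must still be refused while the txg it was last dirtied in has not finished syncing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the real ZFS types. */
enum model_ot { MODEL_OT_NONE = 0, MODEL_OT_PLAIN_FILE = 1 };

struct model_dnode {
	enum model_ot	type;		/* models dn->dn_type */
	uint64_t	dirty_txg;	/* txg of the last dirtying, 0 = never */
};

/* Record the txg in which the dnode was last dirtied. */
static void
model_dnode_set_dirty(struct model_dnode *dn, uint64_t txg)
{
	dn->dirty_txg = txg;
}

/*
 * Return true only if the slot may be reclaimed. Checking the type alone
 * is not enough: a freed dnode already has type NONE while the txg that
 * freed it is still syncing, so the dirty txg is compared as well.
 */
static bool
model_slot_is_reclaimable(const struct model_dnode *dn, uint64_t syncing_txg)
{
	if (dn->type != MODEL_OT_NONE)
		return (false);
	if (dn->dirty_txg >= syncing_txg)
		return (false);
	return (true);
}
```

 In this model, a dnode freed in txg 10 is not reclaimable while txg 10 is syncing, and becomes reclaimable once the syncing txg has advanced to 11.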

 This issue also seems to affect ZIL replay:
 assertion failed: dmu_object_claim_dnsize(zfsvfs->z_os, obj, DMU_OT_PLAIN_FILE_CONTENTS, 0, obj_type, bonuslen, dnodesize, tx) == 0 (0x1c == 0x0), file: ../../common/fs/zfs/zfs_znode.c, line: 854

 ffffff00085e4280 genunix:process_type+168005 () 
 ffffff00085e43f0 zfs:zfs_mknode+6db () 
 ffffff00085e4510 zfs:zfs_create+6b4 () 
 ffffff00085e45b0 genunix:fop_create+c7 () 
 ffffff00085e4770 zfs:zfs_replay_create+35f () 
 ffffff00085e47d0 zfs:zil_replay_log_record+e2 () 
 ffffff00085e49d0 zfs:zil_parse+23c () 
 ffffff00085e4a40 zfs:zil_replay+b1 () 
 ffffff00085e4a80 zfs:zfsvfs_setup+10d () 
 ffffff00085e4af0 zfs:zfs_domount+19f () 
 ffffff00085e4c20 zfs:zfs_mount+24f () 
 ffffff00085e4c50 genunix:fsop_mount+1e () 
 ffffff00085e4de0 genunix:domount+9d8 () 
 ffffff00085e4e70 genunix:mount+167 () 
 ffffff00085e4eb0 genunix:syscall_ap+8e () 
 ffffff00085e4f00 unix:brand_sys_sysenter+2c6 () 

 In the panic message above, the return code is 0x1c where 0 was expected. 0x1c is ENOSPC, and it is returned by dnode_hold_impl().
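
 To double-check the decoding of the return code, a small userland helper like the one below can be used. It is a sketch, not part of the fix: the helper names are hypothetical, and it assumes the errno numbering of the illumos and Linux headers, where ENOSPC is 28 (0x1c).

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: map a few numeric return codes to errno names. */
static const char *
model_errno_name(int err)
{
	switch (err) {
	case 0:
		return ("0 (success)");
	case ENOENT:
		return ("ENOENT");
	case EEXIST:
		return ("EEXIST");
	case ENOSPC:
		return ("ENOSPC");
	default:
		return ("unknown");
	}
}

/* Human-readable text for the same code, as provided by libc. */
static const char *
model_errno_text(int err)
{
	return (strerror(err));
}
```

 Passing 0x1c, the value from the assertion, to model_errno_name() yields "ENOSPC" on these platforms.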