Project

General

Profile

Actions

Bug #11950

closed

diff_cb() does not handle large dnodes

Added by Jason King over 2 years ago. Updated over 2 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
zfs - Zettabyte File System
Start date:
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
Gerrit CR:

Description

While working on #11947, I tried to port and run the existing ZoL zfs diff tests to illumos. With a few minor tweaks, they all ran successfully except one test (distilled here):

zfs create -o dnodesize=4k -o encryption=on -o keyformat=passphrase test/testfs
...enter passphrase...
zfs snapshot test/testfs@1
touch /test/testfs/foo
zfs snapshot test/testfs@2
zfs diff -Ft test/testfs@1 test/testfs@2

The zfs diff command fails with something similar to:

1573328760.330576842    +    F    /test/testfs/foo
Unable to determine path or stats for object 3 in test/testfs@2: File exists

When the same sequence is run on a non-encrypted dataset:

zfs create -o dnodesize=4k test/testfs
zfs snapshot test/testfs@1
touch /test/testfs/foo
zfs snapshot test/testfs@2
zfs diff -Ft test/testfs@1 test/testfs@2

The zfs diff command succeeds:

1573328843.769111672    +    F    /test/testfs/foo
1573328843.768976616    M    /    /test/testfs/

Using a bit of dtrace -- dtrace -n 'zfs::set-error /arg0==17/ { stack() }' -- showed that dnode_hold_impl was the source of EEXIST (17). Looking at the code, the following bit of code is responsible:

    if (flag & DNODE_MUST_BE_ALLOCATED) {
        slots = 1;

        while (dn == DN_SLOT_UNINIT) {
            dnode_slots_hold(dnc, idx, slots);
            dnh = &dnc->dnc_children[idx];

            if (DN_SLOT_IS_PTR(dnh->dnh_dnode)) {
                dn = dnh->dnh_dnode;
                break;
            } else if (dnh->dnh_dnode == DN_SLOT_INTERIOR) {
                DNODE_STAT_BUMP(dnode_hold_alloc_interior);
                dnode_slots_rele(dnc, idx, slots);
                dbuf_rele(db, FTAG);
                return (SET_ERROR(EEXIST));

This can further be confirmed by examining the dnode_hold_alloc_interior zfs kstat and seeing that it is incrementing after the error.

Also interesting is that running zdb shows that there isn't an object 3 on the dataset or either snapshot (though perhaps it's a detail of the zfs diff implementation I'm just not familiar enough with).

Actions #1

Updated by Jason King over 2 years ago

  • Subject changed from zfs diff can fail with EEXIST on encrypted datasets with large nodes to zfs diff can fail with EEXIST on encrypted datasets with large dnodes
Actions #2

Updated by Jason King over 2 years ago

This is the same issue as ZoL PR diff_cb() does not handle encrypted large dnodes #9343.
The same fix also fixes the issue here.
(hat tip to Brian Belendorf for the pointer).

Actions #3

Updated by Jason King over 2 years ago

From ZoL PR#9343

Trying to 'zfs diff' a snapshot with large dnodes will incorrectly try
to access its interior slots when dnodesize > sizeof(dnode_phys_t).
This is normally not an issue because the interior slots are
zero-filled, which report_dnode() handles calling
report_free_dnode_range(). However this is not the case for encrypted
large dnodes or filesystem using many SA based xattrs where the extra
data past the legacy dnode size boundary is interpreted as a
dnode_phys_t.
Reviewed-by: Brian Behlendorf <>
Reviewed-by: Tom Caputi <>
Reviewed-by: Ryan Moeller <>
Signed-off-by: loli10K <>
Actions #4

Updated by Jason King over 2 years ago

  • Subject changed from zfs diff can fail with EEXIST on encrypted datasets with large dnodes to diff_cb() does not handle large dnodes
Actions #5

Updated by Jason King over 2 years ago

  • Assignee set to Jason King
Actions #6

Updated by Jason King over 2 years ago

To test, I built a BE with this change and ran the zfs test suite. All the expected tests (including the one updated by this change) pass. I had previously run the updated test on a SmartOS system without this change and it failed.

Actions #7

Updated by Electric Monk over 2 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit 99d3b4e271d47a93935645d0c2d348d161f90c80

commit  99d3b4e271d47a93935645d0c2d348d161f90c80
Author: loli10K <loli10K@users.noreply.github.com>
Date:   2019-12-03T20:20:45.000Z

    11950 diff_cb() does not handle large dnodes
    Portions contributed by: Jason King <jason.king@joyent.com>
    Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
    Reviewed by: Tom Caputi <tcaputi@datto.com>
    Reviewed by: Ryan Moeller <ryan@ixsystems.com>
    Reviewed by: Kody Kantor <kody.kantor@joyent.com>
    Reviewed by: Matthias Scheler <mscheler@tintri.com>
    Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
    Reviewed by: Andy Fiddaman <andy@omniosce.org>
    Approved by: Dan McDonald <danmcd@joyent.com>

Actions

Also available in: Atom PDF