Bug #5273

Incremental zfs send unmounts underlying snapshots, causing applications that have files open from the snapshots to receive I/O errors.

Added by Youzhong Yang over 5 years ago. Updated about 4 years ago.

Status: New
Priority: High
Assignee: -
Category: zfs - Zettabyte File System
Start date: 2014-10-29
Due date: -
% Done: 0%
Estimated time: -
Difficulty: Medium
Tags: needs-triage

Description

This issue was probably introduced by the fix for issue 3740 (https://www.illumos.org/issues/3740).

Here is how to reproduce:

1. Suppose there is a zpool named zp01. In terminal window A, run the following commands:

  zfs create -o mountpoint=/zp01/src zp01/src
  echo abcdefghijklmnopqrstuvwxyz > /zp01/src/file
  zfs snapshot zp01/src@snap01
  zfs snapshot zp01/src@snap02
  zfs snapshot zp01/src@snap03
  zfs snapshot zp01/src@snap04
  ls -l /zp01/src/.zfs/snapshot/* > /dev/null
  zfs create zp01/dst

2. In terminal window B, run the following dtrace command:

dtrace -n 'fbt:zfs:zfsvfs_teardown:entry /strstr(stringof(((struct zfsvfs *)arg0)->z_vfs->vfs_mntpt->rs_string), "snapshot") != 0/ { printf("pid %d, execname %s, unmounting %d, %s, %Y", pid, execname != 0 ? execname : "NULL", arg1, stringof(((struct zfsvfs *)arg0)->z_vfs->vfs_mntpt->rs_string), walltimestamp ); stack(); ustack(); }'

3. In terminal window C, compile the attached C file, and then run it to open and read /zp01/src/.zfs/snapshot/snap02/file

-> cat read-file.c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    char *filename = NULL;
    int fd;
    int ret;
    char buf[16];
    int seconds = 300;

    if(argc >= 2) {
       filename = argv[1];
       if(argc >= 3) seconds = atoi(argv[2]);
    }
    else {
       printf("%s <file name>\\n", argv[0]);
       exit(1);
    }

    fd = open(filename, O_RDONLY);
    if(fd == -1) {
       perror("open");
       exit(1);
    }

    while(seconds--) {
       ret = pread(fd, buf, 16, 0);
       if(ret == -1) {
          perror("pread");
          break;
       }
       sleep(1);
    }

    ret = close(fd);
    if(ret == -1) {
       perror("close");
       exit(1);
    }

    return(0);
}

-> gcc -o read-file read-file.c
-> ./read-file /zp01/src/.zfs/snapshot/snap02/file

4. Go to terminal window A, run the following commands:
-> zfs send zp01/src@snap01 | zfs recv -e zp01/dst
-> zfs send -I zp01/src@snap01 zp01/src@snap04 | zfs recv -e -F zp01/dst

You will see the following error messages in terminal window C:
pread: I/O error
close: I/O error

and in terminal window B, you will see something like the following:

CPU     ID                    FUNCTION:NAME
  5  49994            zfsvfs_teardown:entry pid 6058, execname zfs, unmounting 1, /zp01/src/.zfs/snapshot/snap01, 2014 Oct 28 23:51:14
              zfs`zfs_umount+0xfc
              genunix`fsop_unmount+0x1b
              genunix`dounmount+0x57
              zfs`zfs_unmount_snap+0x63
              zfs`dsl_dataset_user_release_impl+0xc3
              zfs`dsl_dataset_user_release_tmp+0x1d
              zfs`dsl_dataset_user_release_onexit+0x8b
              zfs`zfs_onexit_destroy+0x43
              zfs`zfs_ctldev_destroy+0x18
              zfs`zfsdev_close+0x89
              genunix`dev_close+0x31
              specfs`device_close+0xd8
              specfs`spec_close+0x17b
              genunix`fop_close+0x61
              genunix`closef+0x5e
              genunix`closeandsetf+0x398
              genunix`close+0x13
              unix`_sys_sysenter_post_swapgs+0x149

              libc.so.1`__close+0x15
              libzfs.so.1`zfs_send+0x96a
              zfs`zfs_do_send+0x598
              zfs`main+0x22c
              zfs`_start+0x83

  5  49994            zfsvfs_teardown:entry pid 6058, execname zfs, unmounting 1, /zp01/src/.zfs/snapshot/snap02, 2014 Oct 28 23:51:14
              zfs`zfs_umount+0xfc
              genunix`fsop_unmount+0x1b
              genunix`dounmount+0x57
              zfs`zfs_unmount_snap+0x63
              zfs`dsl_dataset_user_release_impl+0xc3
              zfs`dsl_dataset_user_release_tmp+0x1d
              zfs`dsl_dataset_user_release_onexit+0x8b
              zfs`zfs_onexit_destroy+0x43
              zfs`zfs_ctldev_destroy+0x18
              zfs`zfsdev_close+0x89
              genunix`dev_close+0x31
              specfs`device_close+0xd8
              specfs`spec_close+0x17b
              genunix`fop_close+0x61
              genunix`closef+0x5e
              genunix`closeandsetf+0x398
              genunix`close+0x13
              unix`_sys_sysenter_post_swapgs+0x149

              libc.so.1`__close+0x15
              libzfs.so.1`zfs_send+0x96a
              zfs`zfs_do_send+0x598
              zfs`main+0x22c
              zfs`_start+0x83

  5  49994            zfsvfs_teardown:entry pid 6058, execname zfs, unmounting 1, /zp01/src/.zfs/snapshot/snap03, 2014 Oct 28 23:51:14
              zfs`zfs_umount+0xfc
              genunix`fsop_unmount+0x1b
              genunix`dounmount+0x57
              zfs`zfs_unmount_snap+0x63
              zfs`dsl_dataset_user_release_impl+0xc3
              zfs`dsl_dataset_user_release_tmp+0x1d
              zfs`dsl_dataset_user_release_onexit+0x8b
              zfs`zfs_onexit_destroy+0x43
              zfs`zfs_ctldev_destroy+0x18
              zfs`zfsdev_close+0x89
              genunix`dev_close+0x31
              specfs`device_close+0xd8
              specfs`spec_close+0x17b
              genunix`fop_close+0x61
              genunix`closef+0x5e
              genunix`closeandsetf+0x398
              genunix`close+0x13
              unix`_sys_sysenter_post_swapgs+0x149

              libc.so.1`__close+0x15
              libzfs.so.1`zfs_send+0x96a
              zfs`zfs_do_send+0x598
              zfs`main+0x22c
              zfs`_start+0x83

  5  49994            zfsvfs_teardown:entry pid 6058, execname zfs, unmounting 1, /zp01/src/.zfs/snapshot/snap04, 2014 Oct 28 23:51:14
              zfs`zfs_umount+0xfc
              genunix`fsop_unmount+0x1b
              genunix`dounmount+0x57
              zfs`zfs_unmount_snap+0x63
              zfs`dsl_dataset_user_release_impl+0xc3
              zfs`dsl_dataset_user_release_tmp+0x1d
              zfs`dsl_dataset_user_release_onexit+0x8b
              zfs`zfs_onexit_destroy+0x43
              zfs`zfs_ctldev_destroy+0x18
              zfs`zfsdev_close+0x89
              genunix`dev_close+0x31
              specfs`device_close+0xd8
              specfs`spec_close+0x17b
              genunix`fop_close+0x61
              genunix`closef+0x5e
              genunix`closeandsetf+0x398
              genunix`close+0x13
              unix`_sys_sysenter_post_swapgs+0x149

              libc.so.1`__close+0x15
              libzfs.so.1`zfs_send+0x96a
              zfs`zfs_do_send+0x598
              zfs`main+0x22c
              zfs`_start+0x83

Files

read-file.c (794 Bytes), Youzhong Yang, 2014-10-29 04:16 AM

History

#1

Updated by Jan Schlien over 4 years ago

Any progress on this?

I'm experiencing the same problem. My application traverses a snapshot and suffers from EIO errors. Shouldn't an open file within the snapshot prevent unmounting (EBUSY)? Why does zfsctl_snapdir_lookup() do VFS_RELE()? I understand the NFS-related comment on snapdirs, but that does not imply the kernel cannot hold a reference on the snapshot.
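
For what it's worth, a plain non-forced unmount of the snapshot does fail with EBUSY while a file inside it is open; it is the forced unmount on the release path that surfaces as EIO for the reader. A minimal sketch to see the difference (run as root while read-file holds the file open, using the snapshot path from the reproduction above; the forced-unmount interpretation is my assumption, not verified against the kernel code):

/*
 * Sketch: while read-file still has /zp01/src/.zfs/snapshot/snap02/file
 * open, a normal (non-forced) unmount of the snapshot should fail with
 * EBUSY. The release path in the stack traces above apparently unmounts
 * the snapshot forcibly, which is why the open file returns EIO instead
 * of blocking the unmount. Requires root; illustrative only.
 */
#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
    if (umount2("/zp01/src/.zfs/snapshot/snap02", 0) == -1)
        perror("umount2");      /* expect EBUSY while the file is open */
    else
        printf("snapshot unmounted\n");
    return (0);
}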

As a workaround, is it possible to manually hold and release a reference, as VFS_HOLD/VFS_RELE do, from user mode for the duration of a custom application's snapshot operations?

Being able to remount the snapshot right away is no solution; it would be quite a mess to deal with intermittent EIOs in all applications just because zfs decides to force-unmount snapshots once in a while.
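
To make concrete what that per-application mess would look like, here is a rough sketch of the kind of reopen-and-retry wrapper around pread() every snapshot-reading program would need (modeled on the attached read-file.c; the retry count and delay are arbitrary):

/*
 * Sketch of the reopen-and-retry workaround: if a read from a file under
 * .zfs/snapshot fails with EIO because the snapshot was unmounted out from
 * under us, close the stale descriptor, reopen the same path (which
 * remounts the snapshot), and retry. Retry count and delay are arbitrary.
 */
#include <stdio.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>

static ssize_t
pread_retry(int *fdp, const char *path, void *buf, size_t len, off_t off)
{
    int tries;

    for (tries = 0; tries < 3; tries++) {
        ssize_t ret = pread(*fdp, buf, len, off);
        if (ret != -1 || errno != EIO)
            return (ret);

        /* EIO: snapshot went away; reopen the path and try again */
        (void) close(*fdp);
        (void) sleep(1);
        *fdp = open(path, O_RDONLY);
        if (*fdp == -1)
            return (-1);
    }
    errno = EIO;
    return (-1);
}

int main(int argc, char *argv[])
{
    char buf[16];
    int fd;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <file under .zfs/snapshot>\n", argv[0]);
        return (1);
    }
    fd = open(argv[1], O_RDONLY);
    if (fd == -1) {
        perror("open");
        return (1);
    }
    if (pread_retry(&fd, argv[1], buf, sizeof (buf), 0) == -1)
        perror("pread_retry");
    (void) close(fd);
    return (0);
}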

#2

Updated by Matthew Ahrens about 4 years ago

In this use case, the unmounting is not really necessary. The issue here is:

  • "zfs send -I" (and "zfs send -R") put holds (a la "zfs hold") on the snapshots that they send
  • they also release the holds. When the hold is released, we unmount the snapshot because we might need to destroy the snapshot (if it was marked for defer destroy, and this is the last hold).

As you can see in the stack traces, the unmount happens in dsl_dataset_user_release_impl(). We do the unmount even when it isn't necessary (e.g. when the snapshot is not marked for defer destroy) because we don't have locking to ensure that it doesn't become marked for defer destroy after we check and determine that we don't need to unmount it.
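
If I'm reading the stack traces right, the temporary holds taken by "zfs send" are tied to the open /dev/zfs descriptor and released when it is closed (zfsdev_close() -> zfs_onexit_destroy() -> dsl_dataset_user_release_tmp()). To see that mechanism in isolation, here is a rough user-space sketch that does the same thing by hand via libzfs_core; the tag name is made up and this is a sketch, not the exact path zfs send takes:

/*
 * Rough sketch: place a temporary hold on zp01/src@snap02 through
 * libzfs_core, tied to a cleanup fd (an open /dev/zfs descriptor). When
 * the fd is closed the hold is released, and with the current code the
 * release path unmounts the snapshot even though nothing is marked for
 * defer destroy. Requires privileges.
 * Build with: gcc -o tmp-hold tmp-hold.c -lzfs_core -lnvpair
 */
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <libnvpair.h>
#include <libzfs_core.h>

int main(void)
{
    nvlist_t *holds, *errlist = NULL;
    int cleanup_fd, err;

    if (libzfs_core_init() != 0) {
        fprintf(stderr, "libzfs_core_init failed\n");
        return (1);
    }

    /* temporary holds are tied to an open /dev/zfs descriptor */
    cleanup_fd = open("/dev/zfs", O_RDWR);
    if (cleanup_fd == -1) {
        perror("open /dev/zfs");
        return (1);
    }

    holds = fnvlist_alloc();
    fnvlist_add_string(holds, "zp01/src@snap02", "demo-tmp-hold");

    err = lzc_hold(holds, cleanup_fd, &errlist);
    printf("lzc_hold: %d\n", err);

    /* "zfs holds zp01/src@snap02" shows the tag while we sleep here */
    (void) sleep(30);

    /*
     * Closing the cleanup fd releases the temporary hold; today the
     * release path unconditionally unmounts the snapshot.
     */
    (void) close(cleanup_fd);

    fnvlist_free(holds);
    libzfs_core_fini();
    return (0);
}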

It might be possible to add some sort of locking between checking if we should unmount, and doing the defer destroy. But there's no obvious way to do this today. (It would be great if we could move the unmount into syncing context where we hold the dp_config_rwlock, but that doesn't work today.)

Another "fix" would be to accept the racy-ness and do the unmount only when we think we need to destroy the snapshot, and if we were wrong, then the release will fail with EBUSY.
