Bug #3894

zfs should not allow snapshot of inconsistent dataset

Added by Keith Wesolowski about 6 years ago. Updated about 6 years ago.

Status: Closed
Priority: High
Category: zfs - Zettabyte File System
Start date: 2013-07-17
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:

Description

After a seemingly complete zfs snapshot is received from the leader, the mounted snapshot shows "phantom" file entries:

-rw-------  1 postgres postgres 16777216 Jul  1 16:05 000000010000000900000075
-rw-------  1 postgres postgres 16777216 Jul  1 16:05 000000010000000900000076
??????????  ? ?        ?               ?            ? 000000010000000900000077
??????????  ? ?        ?               ?            ? 000000010000000900000078
??????????  ? ?        ?               ?            ? 000000010000000900000079
??????????  ? ?        ?               ?            ? 00000001000000090000007A
??????????  ? ?        ?               ?            ? 00000001000000090000007B
??????????  ? ?        ?               ?            ? 00000001000000090000007C
??????????  ? ?        ?               ?            ? 00000001000000090000007D
-rw-------  1 postgres postgres 16777216 Jul  1 20:05 00000001000000090000007E
-rw-------  1 postgres postgres 16777216 Jul  1 21:05 00000001000000090000007F
-rw-------  1 postgres postgres 16777216 Jul  1 21:05 000000010000000900000080
-rw-------  1 postgres postgres 16777216 Jul  1 21:27 000000010000000900000081
-rw-------  1 postgres postgres 16777216 Jun 17 20:00 000000010000000900000082
-rw-------  1 postgres postgres 16777216 Jun 17 20:00 000000010000000900000083
-rw-------  1 postgres postgres 16777216 Jun 17 20:00 000000010000000900000084
-rw-------  1 postgres postgres 16777216 Jun 17 20:00 000000010000000900000085
-rw-------  1 postgres postgres 16777216 Jun 17 20:00 000000010000000900000086
-rw-------  1 postgres postgres 16777216 Jun 17 20:00 000000010000000900000087

The filesystem has corrupt dirents, pointing to nonexistent object IDs. The destination snapshot from which the filesystem was constructed is corrupt in exactly the same manner.

zdb gets EBUSY when it attempts to open the source snapshot. This strongly suggests that DS_FLAG_INCONSISTENT is set on it. This flag is set only while we are receiving into a dataset or destroying it. In this case, the snapshot was taken on the source at a time when the source dataset was itself inconsistent; upon receiving that snapshot, the destination filesystem becomes similarly corrupt.
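The suspected sequence can be modeled in a few lines of C. This is only a sketch of the hypothesis above, not the ZFS code: the sketch_* names, the struct, and the flag's bit value are all made up for illustration, and it assumes a snapshot simply carries over whatever flags the dataset had when the snapshot was taken.

```c
#include <stdint.h>

/* Hypothetical stand-in for DS_FLAG_INCONSISTENT; the bit value is an assumption. */
#define SKETCH_FLAG_INCONSISTENT (1ULL << 0)

/* Hypothetical, heavily reduced dataset: only the flags word matters here. */
typedef struct sketch_ds {
	uint64_t flags;
} sketch_ds_t;

/* A receive marks the target inconsistent for as long as it is in progress. */
static void
sketch_recv_begin(sketch_ds_t *ds)
{
	ds->flags |= SKETCH_FLAG_INCONSISTENT;
}

/* Completing the receive clears the mark on the dataset itself. */
static void
sketch_recv_end(sketch_ds_t *ds)
{
	ds->flags &= ~SKETCH_FLAG_INCONSISTENT;
}

/*
 * Snapshot with no consistency check: the snapshot is a frozen copy of the
 * dataset's state, flags included (assumed for this sketch).
 */
static sketch_ds_t
sketch_snapshot_unchecked(const sketch_ds_t *ds)
{
	sketch_ds_t snap = *ds;
	return (snap);
}

static int
sketch_is_inconsistent(const sketch_ds_t *ds)
{
	return ((ds->flags & SKETCH_FLAG_INCONSISTENT) != 0);
}
```

In this model, a snapshot taken between sketch_recv_begin() and sketch_recv_end() stays marked inconsistent even after the dataset's own flag has been cleared, which matches the EBUSY seen from zdb on the source snapshot.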

This is easily reproduced as follows:

1. zfs create pool/testfs
2a. zfs recv pool/testfs < somestream &
2b. zfs snapshot pool/testfs@snap
3. zfs send pool/testfs@snap > whatever

The problem is easiest to see if the stream being received has nontrivial contents, since that widens the window during which the snapshot will be noticeably corrupt. You can then zfs recv 'whatever' into a new dataset on the same or a different system, and that filesystem will be corrupt.

History

#1

Updated by Keith Wesolowski about 6 years ago

The obvious solution here is simply to prohibit taking a snapshot of an inconsistent dataset and return EBUSY in this case. Following discussion with Chris Siden and Matt Ahrens, it turns out this would also inhibit recursive snapshots if any dataset in the tree is inconsistent. To allow recursive snapshots anyway (which may not be what the user wants but at least allows some progress), we should construct in userland the list of datasets to snapshot such that it contains only the consistent ones.
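A minimal sketch of the two pieces described above, under stated assumptions: the sketch_* names, the struct layout, and the flag's bit value are hypothetical and are not taken from the actual change. The first function stands in for the check that refuses the snapshot with EBUSY; the second stands in for the userland pass that builds the list of datasets to snapshot from the consistent ones only.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for DS_FLAG_INCONSISTENT; the bit value is an assumption. */
#define SKETCH_FLAG_INCONSISTENT (1ULL << 0)

/* Hypothetical, reduced dataset record. */
typedef struct sketch_ds {
	const char *name;
	uint64_t flags;
} sketch_ds_t;

/* Refuse to snapshot an inconsistent dataset: 0 on success, EBUSY otherwise. */
static int
sketch_snapshot_check(const sketch_ds_t *ds)
{
	if (ds->flags & SKETCH_FLAG_INCONSISTENT)
		return (EBUSY);
	return (0);
}

/*
 * Recursive case: walk the tree's datasets and keep only those that pass the
 * check, so one inconsistent child does not fail the whole request.
 * 'out' must have room for 'n' pointers; returns how many were kept.
 */
static size_t
sketch_collect_snapshottable(const sketch_ds_t *tree, size_t n,
    const sketch_ds_t **out)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		if (sketch_snapshot_check(&tree[i]) == 0)
			out[kept++] = &tree[i];
	}
	return (kept);
}
```

Given a tree in which only one child is mid-receive, the filter keeps every other dataset and skips that child, so the recursive snapshot makes progress while a direct snapshot of the skipped dataset still fails with EBUSY.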

#2

Updated by Robert Mustacchi about 6 years ago

  • % Done changed from 80 to 100

Resolved in ca48f36f20f6098ceb19d5b084b6b3d4b8eca9fa.

#3

Updated by Christopher Siden about 6 years ago

  • Status changed from In Progress to Closed

commit ca48f36
Author: Keith M Wesolowski <wesolows@foobazco.org>
Date:   Sat Jul 27 10:51:50 2013

    3894 zfs should not allow snapshot of inconsistent dataset
    Reviewed by: Matthew Ahrens <mahrens@delphix.com>
    Approved by: Gordon Ross <gwr@nexenta.com>
