Bug #1862

incremental zfs receive fails for sparse file > 8PB

Added by Arne Jansen almost 8 years ago. Updated about 7 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: kernel
Start date: 2011-12-07
Due date:
% Done: 0%
Estimated time:
Difficulty: Bite-size
Tags: needs-triage

Description

This description is based on my incomplete knowledge of ZFS send/receive internals. When updates to sparse files are sent, the stream contains "free" items that are processed by restore_free. restore_free computes a worst-case estimate of how much RAM the operation needs. When not enough RAM is available, the recv fails with ENOMEM.
Updates to very large sparse files produce very large "free" items, which in turn lead to very large allocation attempts (several GB).
I suggest changing restore_free to something along the lines of

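/*
 * Process a "free" record in chunks of at most 2^45 bytes (32 TiB)
 * instead of passing the whole range down in a single call.
 */
int
restore_free(struct restorearg *ra, objset_t *os,
    struct drr_free *drrf)
{
        int err = 0;
        uint64_t offset = drrf->drr_offset;
        uint64_t length = drrf->drr_length;

        /* Reject ranges whose end offset would wrap around. */
        if (length != -1ULL && offset + length < offset)
                return (EINVAL);

        /* The object must be valid in the target objset. */
        if (dmu_object_info(os, drrf->drr_object, NULL) != 0)
                return (EINVAL);

        while (length) {
                /* 35184372088832 == 2^45 bytes == 32 TiB */
                uint64_t l = length > 35184372088832ull ?
                    35184372088832ull : length;
                err = dmu_free_long_range(os, drrf->drr_object, offset, l);
                if (err)
                        break;
                offset += l;
                length -= l;
        }
        return (err);
}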

i.e., by processing the item in smaller chunks. It works in my brief tests, but it has not been tested properly.
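The chunking idea can be tried outside the kernel with a small user-space sketch. Everything here is an assumption made for illustration: `free_in_chunks`, `free_range_fn`, `struct tally` and `tally_cb` are hypothetical names, and the callback merely stands in for dmu_free_long_range so that the loop can be exercised without ZFS. Only the 2^45-byte chunk size is taken from the suggestion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Chunk size from the suggested patch: 2^45 bytes (32 TiB). */
#define MAX_FREE_CHUNK (1ULL << 45)

/* Hypothetical callback type standing in for dmu_free_long_range(). */
typedef int (*free_range_fn)(uint64_t offset, uint64_t length, void *arg);

/*
 * Split [offset, offset + length) into pieces of at most MAX_FREE_CHUNK
 * bytes and hand each piece to free_fn. Stops at the first non-zero
 * return value and passes it back to the caller.
 */
static int
free_in_chunks(uint64_t offset, uint64_t length, free_range_fn free_fn,
    void *arg)
{
        int err = 0;

        assert(free_fn != NULL);
        while (length != 0) {
                uint64_t l = length > MAX_FREE_CHUNK ?
                    MAX_FREE_CHUNK : length;
                err = free_fn(offset, l, arg);
                if (err != 0)
                        break;
                offset += l;
                length -= l;
        }
        return (err);
}

/* Example callback: tallies calls and bytes instead of freeing anything. */
struct tally {
        uint64_t calls;
        uint64_t bytes;
        uint64_t largest;
};

static int
tally_cb(uint64_t offset, uint64_t length, void *arg)
{
        struct tally *t = arg;

        (void) offset;
        t->calls++;
        t->bytes += length;
        if (length > t->largest)
                t->largest = length;
        return (0);
}
```

Calling `free_in_chunks(offset, length, tally_cb, &t)` with a zero-initialised `struct tally t` shows how many calls a given range is split into and confirms that no single call exceeds the chunk size. Whether bounding each call this way is enough to keep the kernel's memory estimate small is something only a test against the real code can show.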


History

#1

Updated by Simon Klinkert over 7 years ago

I reproduced it with the following commands:

zfs create tank/test1
dd if=/dev/zero of=/tank/test1/sparsefile bs=1 count=1 seek=30000000000000000
zfs snapshot tank/test1@snap1
dd if=/dev/zero of=/tank/test1/sparsefile bs=1 count=1 seek=30000000000000001
zfs snapshot tank/test1@snap2
zfs send tank/test1@snap1 > /tmp/s.snap1
zfs send -i tank/test1@snap1  tank/test1@snap2 > /tmp/s.snap2
zfs create tank/test2
zfs receive -d tank/test2 < /tmp/s.snap1
zfs receive -d tank/test2 < /tmp/s.snap2
<internal error / zfs core dump>

Consider the output of

zstreamdump -v < /tmp/s.snap2

You need a line with something like this to reproduce the problem:

FREE object = 8 offset = 30000000000065536 length = 546460752303357952

I wrote a little patch to fix this problem. It's based on the suggested restore_free() source code (see initial post). I tested it on half a dozen OpenIndiana 151a machines and I think the problem is fixed.

You may also want to check the return value of restore_free():

dtrace -n 'restore_free:return { trace(arg1); }'

(should be zero)

#2

Updated by Eric Schrock about 7 years ago

  • Status changed from New to Resolved

changeset: 13789:f0c17d471b7a
tag: tip
user: Arne Jansen <>
date: Thu Aug 30 03:32:10 2012 -0700

description:
1862 incremental zfs receive fails for sparse file > 8PB
Reviewed by: Matt Ahrens <>
Reviewed by: Simon Klinkert <>
Approved by: Eric Schrock <>
