Bug #6729
incremental replication stream of a fs tree with lots of snapshots trips assert in zfs recv
Description
The assertion in question is:
Assertion failed: ilen <= SPA_MAXBLOCKSIZE, file ../common/libzfs_sendrecv.c, line 1955, function recv_read
ilen comes from the dmu_replay_record_t's drr_payloadlen member:
> $c
libc.so.1`_lwp_kill+7(1, 6, 80467a8, fee91cdd, fef63000, 8046800)
libc.so.1`raise+0x22(6, 0, 80467e8, fee6edb9)
libc.so.1`abort+0xf3(8046800, 8046800, 6c, fed732cc, 65737341, 6f697472)
libc.so.1`_assert(fedabf16, fedabefa, 7a3, feda93ad)
libzfs.so.1`recv_read+0x3d(80a4548, 0, 80a8008, 1278128, 0, 8047380)
libzfs.so.1`recv_read_nvlist+0x4a(80a4548, 0, 1278128, 804721c, 0, 8047380)
libzfs.so.1`zfs_receive_package+0x100(80a4548, 0, 8047da0, 8047bdc, 80478d8, 8047380)
libzfs.so.1`zfs_receive_impl+0x527(80a4548, 8047da0, 0, 8047bdc, 0, 0)
libzfs.so.1`zfs_receive+0xc3(80a4548, 8047da0, 80a1f88, 8047bdc, 0, 0)
zfs_do_receive+0x3dd(3, 8047cac, 80768a0, 801, 0, 3)
main+0x22c(fee10180, fef6d6a8, 8047c9c, 80557f7, 4, 8047ca8)
_start+0x83(4, 8047d94, 8047d98, 8047d9d, 8047da0, 0)
> 80478d8::print dmu_replay_record_t
{
    drr_type = 0 (DRR_BEGIN)
    drr_payloadlen = 0x1278128
    [ ... ]
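To make the failing comparison concrete, here is a minimal sketch that checks the observed drr_payloadlen against the limit. It assumes SPA_MAXBLOCKSIZE is 16 MiB (1 << 24), which matches the "4 * SPA_MAXBLOCKSIZE (64 MB)" figure quoted later in this ticket. The helper payload_excess() is hypothetical and is not part of libzfs.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a 16 MiB maximum block size, i.e. a shift of 24. */
#define SPA_MAXBLOCKSHIFT	24
#define SPA_MAXBLOCKSIZE	(1ULL << SPA_MAXBLOCKSHIFT)

/* drr_payloadlen as printed by mdb in the trace above. */
#define OBSERVED_PAYLOADLEN	0x1278128ULL

/*
 * Hypothetical helper (not a libzfs function): number of bytes by which
 * a payload exceeds SPA_MAXBLOCKSIZE, or 0 if it fits.
 */
uint64_t
payload_excess(uint64_t payloadlen)
{
	if (payloadlen > SPA_MAXBLOCKSIZE)
		return (payloadlen - SPA_MAXBLOCKSIZE);
	return (0);
}
```

Under that assumption, payload_excess(OBSERVED_PAYLOADLEN) is 0x278128: the nvlist payload is roughly 18.5 MiB, about 2.5 MiB over the 16 MiB limit that the assertion enforces.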
zstreamdump reveals that all snapshot names of the source datasets are
present in the stream, which is what I suppose is making the record larger than
recv anticipates. Here's one way to repro: create a bunch of filesystems and
recursively snapshot them lots of times using long snapshot names:
# zfs create rpool/foo
# for i in $(seq 1 160); do zfs create rpool/foo/$i; done
# for i in $(seq 1 1000); do zfs snapshot -r rpool/foo@abcdefghijklmnopqrstuvwxyz$i; done
Then generate an incremental replication stream between any two snapshots just
created, and attempt to receive it (doesn't matter where, and receive -n is
fine):
# zfs send -Ri abcdefghijklmnopqrstuvwxyz999 rpool/foo@abcdefghijklmnopqrstuvwxyz1000 > /var/tmp/foo.stream
# zfs recv -n rpool/foo < /var/tmp/foo.stream
Assertion failed: ilen <= SPA_MAXBLOCKSIZE, file ../common/libzfs_sendrecv.c, line 1955, function recv_read
zsh: IOT instruction (core dumped)  zfs recv -n rpool/foo < /var/tmp/foo.stream
Updated by Chip Schweiss over 2 years ago
Anyone looking at this?
Now that we are taking advantage of Channel Programs, we are hitting this bug more frequently.
Here's a core dump from my latest encounter:
Updated by Jason King 5 months ago
It appears OpenZFS recently fixed this in commit 7a6c12fd6a756af5a2f664c0a6a292d22fbb2487:
Don't assert on nvlists larger than SPA_MAXBLOCKSIZE
Originally we asserted that all reads are less than SPA_MAXBLOCKSIZE.
However, nvlists are not ZFS records, and are not limited to
SPA_MAXBLOCKSIZE.
Add a new environment variable, ZFS_SENDRECV_MAX_NVLIST, to allow the
user to specify the maximum size of the nvlist that can be sent or
received.
Default value: 4 * SPA_MAXBLOCKSIZE (64 MB)
Modify libzfs send routines to return a useful error if the send stream
will generate an nvlist that is beyond the maximum size.
Modify libzfs recv routines to add an explicit error message if the
nvlist is too large, rather than abort()ing.
Change the assert() to trigger only on data records.
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #9616
Looking at the change, it should be fairly straightforward to port this over.
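As a rough illustration of the behavior the commit message describes, the sketch below reads a limit from the ZFS_SENDRECV_MAX_NVLIST environment variable, falls back to 4 * SPA_MAXBLOCKSIZE, and reports an oversized nvlist as an error instead of asserting. This is not the code from the commit. The function names parse_nvlist_limit(), max_nvlist_size() and nvlist_len_ok() are hypothetical, and falling back to the default on an unparsable value is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 16 MiB, consistent with "4 * SPA_MAXBLOCKSIZE (64 MB)". */
#define SPA_MAXBLOCKSIZE	(1ULL << 24)
#define DEFAULT_MAX_NVLIST	(4 * SPA_MAXBLOCKSIZE)

/*
 * Hypothetical: turn the value of ZFS_SENDRECV_MAX_NVLIST into a byte
 * limit. Unset, empty, zero, or unparsable values yield the default
 * (a choice made for this sketch only).
 */
uint64_t
parse_nvlist_limit(const char *value)
{
	char *end;
	unsigned long long v;

	if (value == NULL || *value == '\0')
		return (DEFAULT_MAX_NVLIST);

	errno = 0;
	v = strtoull(value, &end, 0);
	if (errno != 0 || *end != '\0' || v == 0)
		return (DEFAULT_MAX_NVLIST);
	return (v);
}

/* Hypothetical: look up the limit in the process environment. */
uint64_t
max_nvlist_size(void)
{
	return (parse_nvlist_limit(getenv("ZFS_SENDRECV_MAX_NVLIST")));
}

/*
 * Hypothetical: return 1 if an nvlist of len bytes is acceptable and 0
 * otherwise, so the caller can print an error rather than abort().
 */
int
nvlist_len_ok(uint64_t len, uint64_t limit)
{
	return (len <= limit ? 1 : 0);
}
```

With the default limit, the 0x1278128-byte payload from this ticket passes nvlist_len_ok(), while a deliberately small limit such as 1048576 makes the same payload fail the check and take the error path.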
Updated by Jason King 4 months ago
A followup change should also be included:
commit 908d43d0a9f736af62c0f4b179950bb1262dfd7d
Author: Allan Jude <allanjude@freebsd.org>
Date: Fri Sep 18 13:23:29 2020 -0400
libzfs: Don't leak buf if nvlist is too large
Resolves FreeBSD Coverity defect:
CID 1432398: Resource leaks (RESOURCE_LEAK)
libzfs: don't leak hdl if there is an error reading env var
Resolves FreeBSD Coverity defect:
CID 1432395: Resource leaks (RESOURCE_LEAK)
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allanjude@freebsd.org>
Closes #10882
Updated by Jason King 4 months ago
I tested this by running the zfs test suite. The only failures were tests with known failures.
Updated by Jason King about 2 months ago
Additionally, I was able to recreate the assertion failure using the steps documented in the ticket. When booted into a boot environment (BE) with the change applied, the zfs recv command no longer aborts and instead succeeds. On a BE without the change:
root@pi:~# zpool create testpool c3t0d0
root@pi:~# zfs create testpool/foo
root@pi:~# for f in $(seq 1 160); do zfs create testpool/foo/$f; done
root@pi:~# for f in $(seq 1 1000); do zfs snapshot -r testpool/foo@abcdefghijklmnopqrstuvwxyz${f}; done
root@pi:~# zfs send -Ri abcdefghijklmnopqrstuvwxyz999 testpool/foo@abcdefghijklmnopqrstuvwxyz1000 > /ws/foo.zfs
root@pi:~# zfs recv -n testpool/foo < /ws/foo.zfs
Assertion failed: ilen <= SPA_MAXBLOCKSIZE, file ../common/libzfs_sendrecv.c, line 2201, function recv_read
Abort (core dumped)
On a BE with the change:
root@pi:/ws# zfs recv -n testpool/foo < foo.zfs
root@pi:/ws# echo $?
0
Updated by Electric Monk about 2 months ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit c5286370b84c690a18e8100a5237a1000d7e29c6
commit c5286370b84c690a18e8100a5237a1000d7e29c6
Author: Allan Jude <allan@klarasystems.com>
Date: 2021-02-19T15:11:37.000Z
6729 incremental replication stream of a fs tree with lots of snapshots trips assert in zfs recv
Portions contributed by: Jason King <jason.king@joyent.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andy Fiddaman <andy@omnios.org>
Reviewed by: Toomas Soome <tsoome@me.com>
Approved by: Gordon Ross <gordon.w.ross@gmail.com>