Bug #5905

wrong dn_maxblkid on dnode received to large_block dataset

Added by Toomas Soome about 5 years ago. Updated about 5 years ago.

Status:
New
Priority:
High
Assignee:
-
Category:
-
Start date:
2015-05-04
Due date:
% Done:

0%

Estimated time:
Difficulty:
Medium
Tags:
needs-triage
Gerrit CR:

Description

I cloned a dataset with recordsize=128k to another pool, where the new recordsize is 1M; I think I received it into a 1M dataset.

While checking the file:
-rw-r--r-- 1 root root 39053312 apr 16 13:57 /platform/i86pc/amd64/boot_archive

With the bootloader's standalone ZFS reader, I found that dn_maxblkid is wrong in the received file (values are in hex):

grub-core/fs/zfs/zfs.c:3870: read blksz 20000
grub-core/fs/zfs/zfs.c:3894: blkid: 128 dn_maxblkid: 127
grub-core/fs/zfs/zfs.c:3894: blkid: 129 dn_maxblkid: 127

while the original has the correct value:
grub-core/fs/zfs/zfs.c:3870: read blksz 20000
grub-core/fs/zfs/zfs.c:3894: blkid: 128 dn_maxblkid: 129
grub-core/fs/zfs/zfs.c:3894: blkid: 129 dn_maxblkid: 129

So I regenerated this boot_archive, and the new one also has the correct dn_maxblkid: 129.
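As a cross-check on the values above, here is a minimal sketch of how the expected last block id follows from the file size and block size. It assumes a plain file made of equal-sized blocks, with dn_maxblkid being the 0-based index of the last block; the function name is made up for illustration and is not taken from any ZFS or GRUB source.

```python
def expected_maxblkid(file_size, blksz):
    """Return the 0-based index of the last block needed for file_size bytes.

    Assumption: every block of the file has the same size, blksz.
    """
    if file_size <= 0 or blksz <= 0:
        raise ValueError("file_size and blksz must be positive")
    return (file_size - 1) // blksz


size = 39053312  # size of boot_archive from the listing above

# 128K blocks (blksz 0x20000, as in the debug log)
print(hex(expected_maxblkid(size, 0x20000)))   # 0x129

# 1M blocks (blksz 0x100000), assuming a file of the same size
print(hex(expected_maxblkid(size, 0x100000)))  # 0x25
```

For the 39053312-byte file with 128K blocks this gives 0x129, which agrees with the original and the regenerated file, so the received file's 0x127 is two blocks short.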

Note: neither grub-0.97 in illumos nor grub2 checks blkid against dn_maxblkid; the FreeBSD loader, however, does make sure blkid is not greater than dn_maxblkid.
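To illustrate why this matters for a reader that enforces the limit, here is a rough sketch that lists the block ids such a reader would refuse. This is not the FreeBSD loader's code; the function name and the "reject anything above dn_maxblkid" rule are assumptions based only on the note above.

```python
def rejected_blkids(file_size, blksz, dn_maxblkid):
    """List block ids the file needs but a strict reader would refuse.

    Assumption: a strict reader rejects any blkid > dn_maxblkid, and the
    file needs blocks 0 .. (file_size - 1) // blksz.
    """
    last_needed = (file_size - 1) // blksz
    return list(range(dn_maxblkid + 1, last_needed + 1))


# Received file from this report: 39053312 bytes, blksz 0x20000, dn_maxblkid 0x127
bad = rejected_blkids(39053312, 0x20000, 0x127)
print([hex(b) for b in bad])  # ['0x128', '0x129']
```

These are the same two block ids that show up with dn_maxblkid 127 in the debug output, so a strict reader would stop short of the end of this file.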

History

#1

Updated by Matthew Ahrens about 5 years ago

I'm having a little trouble understanding the steps you took to get into this situation. Can you specify which zfs commands you ran, and their arguments?

#2

Updated by Toomas Soome about 5 years ago

Matthew Ahrens wrote:

I'm having a little trouble understanding the steps you took to get into this situation. Can you specify which zfs commands you ran, and their arguments?

Apparently large_blocks may not even be related. The source DS:
raid/ROOT/test recordsize 128K default

  1. zfs snapshot raid/ROOT/test@today
  2. zfs send raid/ROOT/test@today | zfs receive -e rpool/ROOT
    now the received DS has the same recordsize:
    rpool/ROOT/test recordsize 128K default
Running grub2 fstest to compare files:
  1. ./grub-fstest -r loop0 /dev/rdsk/c3t0d0s0 cmp /ROOT/test/@/platform/i86pc/amd64/boot_archive /platform/i86pc/amd64/boot_archive && echo OK
    OK
    Does compare actually work? Check against a different file:
  2. ./grub-fstest -r loop0 /dev/rdsk/c3t0d0s0 cmp /ROOT/test/@/platform/i86pc/amd64/boot_archive /platform/i86pc/boot_archive && echo OK
    ./grub-fstest: error: compare fail at offset 32848.

and from debug log:

grub-core/fs/zfs/zfs.c:3870: read blksz 20000
grub-core/fs/zfs/zfs.c:3894: blkid: 128 dn_maxblkid: 127
grub-core/fs/zfs/zfs.c:3894: blkid: 129 dn_maxblkid: 127

So even this trivial send | receive reproduced it.
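When there is a lot of debug output, the mismatch can be picked out mechanically. Below is a small sketch that scans lines in the format shown above and reports every blkid larger than dn_maxblkid. It assumes both values are printed in hex and that the line layout is exactly as in this log; the helper name is hypothetical.

```python
import re

# Matches "blkid: <hex> dn_maxblkid: <hex>" as printed in the debug log above.
LINE_RE = re.compile(r"blkid:\s*([0-9a-fA-F]+)\s+dn_maxblkid:\s*([0-9a-fA-F]+)")


def out_of_range(log_text):
    """Return (blkid, dn_maxblkid) pairs where blkid exceeds dn_maxblkid."""
    hits = []
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if m is None:
            continue  # e.g. the "read blksz" line
        blkid = int(m.group(1), 16)
        maxblkid = int(m.group(2), 16)
        if blkid > maxblkid:
            hits.append((blkid, maxblkid))
    return hits


sample = """\
grub-core/fs/zfs/zfs.c:3870: read blksz 20000
grub-core/fs/zfs/zfs.c:3894: blkid: 128 dn_maxblkid: 127
grub-core/fs/zfs/zfs.c:3894: blkid: 129 dn_maxblkid: 127
"""

for blkid, maxblkid in out_of_range(sample):
    print(f"blkid {blkid:#x} > dn_maxblkid {maxblkid:#x}")
```

Run against the log from the original file (dn_maxblkid 129), the same helper returns an empty list.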

Does the same happen with the kernel itself?
grub-core/fs/zfs/zfs.c:3870: read blksz 20000
grub-core/fs/zfs/zfs.c:3894: blkid: d dn_maxblkid: e
grub-core/fs/zfs/zfs.c:3894: blkid: e dn_maxblkid: e

Nope, it does not, but then again it may be related to the size of the file. OK, let's rewrite the file:
root@test:/home/tsoome# mount -F zfs rpool/ROOT/test /mnt
root@test:/home/tsoome# cp /platform/i86pc/amd64/boot_archive /mnt/platform/i86pc/amd64/boot_archive
root@test:/home/tsoome# umount /mnt
Nope, it did not help:
grub-core/fs/zfs/zfs.c:3894: blkid: 129 dn_maxblkid: 127

OK, remove it and create a new one:
root@test:/home/tsoome# mount -F zfs rpool/ROOT/test /mnt
root@test:/home/tsoome# rm /mnt/platform/i86pc/amd64/boot_archive
root@test:/home/tsoome# cp /platform/i86pc/amd64/boot_archive /mnt/platform/i86pc/amd64/boot_archive
root@test:/home/tsoome# umount /mnt
Hm, still the same:
grub-core/fs/zfs/zfs.c:3894: blkid: 129 dn_maxblkid: 127

OK, but I still have the snapshot on it; destroy it:
root@test:/home/tsoome# zfs destroy rpool/ROOT/test@today

And no help. OK, but I did get it fixed on the first dataset, rpool/ROOT/oi:
grub-core/fs/zfs/zfs.c:3870: read blksz 100000
grub-core/fs/zfs/zfs.c:3894: blkid: 24 dn_maxblkid: 25
grub-core/fs/zfs/zfs.c:3894: blkid: 25 dn_maxblkid: 25

So... is it a recordsize issue?
root@test:/home/tsoome# zfs set recordsize=1M rpool/ROOT/test
root@test:/home/tsoome# mount -F zfs rpool/ROOT/test /mnt
root@test:/home/tsoome# cp /platform/i86pc/amd64/boot_archive /mnt/platform/i86pc/amd64/boot_archive

And no help. So I removed the target file and ....
grub-core/fs/zfs/zfs.c:3870: read blksz 100000
grub-core/fs/zfs/zfs.c:3894: blkid: 24 dn_maxblkid: 24
grub-core/fs/zfs/zfs.c:3894: blkid: 25 dn_maxblkid: 24

So it's not even related to send itself. It does not make sense at all now :D

#3

Updated by Toomas Soome about 5 years ago

I just did another test as well:

root@test:/home/tsoome# zpool create tank c3t2d0
root@test:/home/tsoome# beadm create -p tank illumos
Created successfully

and ./grub-fstest -d zfs -r loop0 /dev/rdsk/c3t2d0s0 cmp /ROOT/illumos/@/platform/i86pc/amd64/boot_archive /platform/i86pc/amd64/boot_archive

grub-core/fs/zfs/zfs.c:3870: read blksz 20000
grub-core/fs/zfs/zfs.c:3894: blkid: 128 dn_maxblkid: 127
grub-core/fs/zfs/zfs.c:3894: blkid: 129 dn_maxblkid: 127

So: a default pool, beadm create sending the data to the new pool, and the same result.

tank feature@async_destroy enabled local
tank feature@empty_bpobj active local
tank feature@lz4_compress active local
tank feature@multi_vdev_crash_dump enabled local
tank feature@spacemap_histogram active local
tank feature@enabled_txg active local
tank feature@hole_birth active local
tank feature@extensible_dataset enabled local
tank feature@embedded_data active local
tank feature@bookmarks enabled local
tank feature@filesystem_limits enabled local
tank feature@large_blocks enabled local

#4

Updated by Toomas Soome about 5 years ago

It seems the problem may be related to larger files (like those ~36MB boot archives); I haven't seen this issue with smaller ones. It is also not related to zfs send: I got the same result from "cp boot_archive /" (bn 0x12a versus dn_maxblkid 0x128 in this particular case).
