Bug #5905
wrong dn_maxblkid on dnode received to large_block dataset
Status: Open
Description
I cloned a dataset with recordsize=128k to another pool where the recordsize is 1M, i.e. I received into a 1M dataset.
While checking the file:
-rw-r--r-- 1 root root 39053312 Apr 16 13:57 /platform/i86pc/amd64/boot_archive
with the bootloader's standalone zfs reader, I found that dn_maxblkid is wrong in the received file (values are in hex):
grub-core/fs/zfs/zfs.c:3870: read blksz 20000
grub-core/fs/zfs/zfs.c:3894: blkid: 128 dn_maxblkid: 127
grub-core/fs/zfs/zfs.c:3894: blkid: 129 dn_maxblkid: 127
while the original has the correct value:
grub-core/fs/zfs/zfs.c:3870: read blksz 20000
grub-core/fs/zfs/zfs.c:3894: blkid: 128 dn_maxblkid: 129
grub-core/fs/zfs/zfs.c:3894: blkid: 129 dn_maxblkid: 129
So I regenerated this boot_archive, and the new one also has the correct dn_maxblkid, 129.
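As a cross-check on those numbers: for a 39053312-byte file stored in 128K (0x20000) blocks, the highest block id should be ceil(size / blksz) - 1 = 0x129, which matches what the regenerated boot_archive reports. A minimal sketch of that arithmetic (the helper name is mine, not from the ZFS code):

```python
def expected_maxblkid(size, blksz):
    """Highest block id a file of `size` bytes should have when stored
    in blocks of `blksz` bytes: ceil(size / blksz) - 1."""
    return (size + blksz - 1) // blksz - 1

# boot_archive from the report: 39053312 bytes, 128K (0x20000) blocks
print(hex(expected_maxblkid(39053312, 0x20000)))  # prints 0x129
```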
Note that neither grub-0.97 in illumos nor grub2 checks blkid against dn_maxblkid; the FreeBSD loader, however, does make sure blkid is not greater than dn_maxblkid.
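The FreeBSD-style check mentioned above amounts to a one-line guard before every block read; a minimal sketch (the function name is mine, not the actual loader code):

```python
def blkid_in_range(blkid, dn_maxblkid):
    """Reject reads past the dnode's advertised last block, as the
    FreeBSD loader does; grub's zfs reader skips this check."""
    return blkid <= dn_maxblkid

# Values from the broken received file in the debug log above:
print(blkid_in_range(0x128, 0x127))  # False -> such a read is refused
print(blkid_in_range(0x127, 0x127))  # True
```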
Updated by Matthew Ahrens about 8 years ago
I'm having a little trouble understanding the steps you took to get into this situation. Can you specify which zfs commands you ran, and their arguments?
Updated by Toomas Soome about 8 years ago
Matthew Ahrens wrote:
I'm having a little trouble understanding the steps you took to get into this situation. Can you specify which zfs commands you ran, and their arguments?
Apparently large_blocks may not even be related; the source DS:
raid/ROOT/test recordsize 128K default
- zfs snapshot raid/ROOT/test@today
- zfs send raid/ROOT/test@today | zfs receive -e rpool/ROOT
Now the received DS has the same recordsize:
rpool/ROOT/test recordsize 128K default
- ./grub-fstest -r loop0 /dev/rdsk/c3t0d0s0 cmp /ROOT/test/@/platform/i86pc/amd64/boot_archive /platform/i86pc/amd64/boot_archive && echo OK
OK
Does compare work at all?
- ./grub-fstest -r loop0 /dev/rdsk/c3t0d0s0 cmp /ROOT/test/@/platform/i86pc/amd64/boot_archive /platform/i86pc/boot_archive && echo OK
./grub-fstest: error: compare fail at offset 32848.
And from the debug log:
grub-core/fs/zfs/zfs.c:3870: read blksz 20000
grub-core/fs/zfs/zfs.c:3894: blkid: 128 dn_maxblkid: 127
grub-core/fs/zfs/zfs.c:3894: blkid: 129 dn_maxblkid: 127
So even this trivial send | receive reproduced it.
Does the same happen with the kernel itself?
grub-core/fs/zfs/zfs.c:3870: read blksz 20000
grub-core/fs/zfs/zfs.c:3894: blkid: d dn_maxblkid: e
grub-core/fs/zfs/zfs.c:3894: blkid: e dn_maxblkid: e
Nope, it does not; but then again, it may be related to the size of the file. OK, let's rewrite the file:
root@test:/home/tsoome# mount -F zfs rpool/ROOT/test /mnt
root@test:/home/tsoome# cp /platform/i86pc/amd64/boot_archive /mnt/platform/i86pc/amd64/boot_archive
root@test:/home/tsoome# umount /mnt
Nope, it did not help:
grub-core/fs/zfs/zfs.c:3894: blkid: 129 dn_maxblkid: 127
OK, remove the file and create it anew:
root@test:/home/tsoome# mount -F zfs rpool/ROOT/test /mnt
root@test:/home/tsoome# rm /mnt/platform/i86pc/amd64/boot_archive
root@test:/home/tsoome# cp /platform/i86pc/amd64/boot_archive /mnt/platform/i86pc/amd64/boot_archive
root@test:/home/tsoome# umount /mnt
Hm, still the same:
grub-core/fs/zfs/zfs.c:3894: blkid: 129 dn_maxblkid: 127
OK, but I still have a snapshot on it; destroy it:
root@test:/home/tsoome# zfs destroy rpool/ROOT/test@today
And no help. OK, but I did get it fixed on the first dataset, rpool/ROOT/oi:
grub-core/fs/zfs/zfs.c:3870: read blksz 100000
grub-core/fs/zfs/zfs.c:3894: blkid: 24 dn_maxblkid: 25
grub-core/fs/zfs/zfs.c:3894: blkid: 25 dn_maxblkid: 25
So... is it a recordsize issue?
root@test:/home/tsoome# zfs set recordsize=1M rpool/ROOT/test
root@test:/home/tsoome# mount -F zfs rpool/ROOT/test /mnt
root@test:/home/tsoome# cp /platform/i86pc/amd64/boot_archive /mnt/platform/i86pc/amd64/boot_archive
And no help. So I removed the target file and ....
grub-core/fs/zfs/zfs.c:3870: read blksz 100000
grub-core/fs/zfs/zfs.c:3894: blkid: 24 dn_maxblkid: 24
grub-core/fs/zfs/zfs.c:3894: blkid: 25 dn_maxblkid: 24
So it's not even related to send itself. It does not make sense at all now. :D
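For reference, the same arithmetic with recordsize=1M says the 39053312-byte boot_archive needs 38 blocks, so dn_maxblkid should be 0x25; the 0x24 in the log above is one block short. A quick sanity check, using the sizes reported earlier:

```python
size, blksz = 39053312, 0x100000   # boot_archive bytes, 1M blocks
nblocks = -(-size // blksz)        # ceiling division
print(nblocks, hex(nblocks - 1))   # prints: 38 0x25
```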
Updated by Toomas Soome about 8 years ago
I just did another test as well:
root@test:/home/tsoome# zpool create tank c3t2d0
root@test:/home/tsoome# beadm create -p tank illumos
Created successfully
and:
./grub-fstest -d zfs -r loop0 /dev/rdsk/c3t2d0s0 cmp /ROOT/illumos/@/platform/i86pc/amd64/boot_archive /platform/i86pc/amd64/boot_archive
grub-core/fs/zfs/zfs.c:3870: read blksz 20000
grub-core/fs/zfs/zfs.c:3894: blkid: 128 dn_maxblkid: 127
grub-core/fs/zfs/zfs.c:3894: blkid: 129 dn_maxblkid: 127
So: a default pool, beadm create sending data to the new pool, and the same result. The new pool's features:
tank feature@async_destroy enabled local
tank feature@empty_bpobj active local
tank feature@lz4_compress active local
tank feature@multi_vdev_crash_dump enabled local
tank feature@spacemap_histogram active local
tank feature@enabled_txg active local
tank feature@hole_birth active local
tank feature@extensible_dataset enabled local
tank feature@embedded_data active local
tank feature@bookmarks enabled local
tank feature@filesystem_limits enabled local
tank feature@large_blocks enabled local
Updated by Toomas Soome about 8 years ago
It seems the problem may be related to larger files (like those ~36MB boot archives etc.); I haven't seen this issue with smaller ones. It is also not related to zfs send: I got the same result from "cp boot_archive /" (blkid 0x12a versus dn_maxblkid 0x128 in this particular case).