Bug #7367
blkdev: support block size larger than 512
Status: Closed (100% done)
Description
The current blkdev driver assumes a block size of 512 bytes, even though the correct shift value is available as (bd_t*)->d_blkshift.
We tried a patch that calculates the block number and block count from the d_blkshift value; it worked for a zpool full of NVMe SSDs formatted with a 4K block size.
However, if a 4K device is used as a zpool cache drive, it does not work, because zfs does not respect the drive's ashift value and sends I/O requests of size n*512 to the device.
Updated by Youzhong Yang over 6 years ago
We've tested the patch by using the following scripts:
# cat send-recv
#!/bin/bash
action=$1
if [ "$action" == "send" ]; then
    snapshot=$2
    host=$3
    port=$4
    loop=$5
    while true; do
        zfs send -R $snapshot | mbuffer -O ${host}:${port} -m 4g -s 2048k
        if [ "$loop" != "1" ]; then
            break
        fi
        echo sleeping 120 seconds ....
        sleep 120
    done
elif [ "$action" == "recv" ]; then
    port=$2
    zfs=$3
    loop=$4
    while true; do
        if [ "$loop" == "1" ]; then
            zfs destroy -r -R $zfs
        fi
        mbuffer -I ${port} -m 4g -s 2048k | zfs recv -Fv $zfs
        if [ "$loop" != "1" ]; then
            break
        fi
        sleep 5
    done
else
    echo Usage:
    echo $0 send snapshot host port loop
    echo $0 recv port zfs loop
    echo ""
fi
which does a zfs send/recv of a few hundred GiB of data. After the zfs recv, we ran a zpool scrub, which reported no errors.
# cat run-dds
#!/bin/bash
for i in `seq 1 1000`; do
    dd if=/dev/zero of=file00 bs=1M count=102400 oflag=sync &
    dd if=/dev/zero of=file01 bs=1M count=102400 oflag=sync &
    dd if=/dev/zero of=file02 bs=1M count=102400 oflag=sync &
    dd if=/dev/zero of=file03 bs=1M count=102400 oflag=sync &
    wait
    rm file00 file01 file02 file03
done
which we use for NVMe benchmarking.
The patch has been tested for both 512B block size and 4K block size.
Updated by Youzhong Yang over 6 years ago
Additional info: two systems were used for the testing.
Baseboard: Supermicro X10DRU-i+ (System SYS-1028U-TN10RT+)
Memory: 768G
CPU: 2 x Intel(R) Xeon(R) CPU E5-2697A v4 @ 2.60GHz
NVMe SSDs: 4 x Intel DC P3700 2TB (SSDPE2MD020T4)
Baseboard: Supermicro X10DRU-i+ (System SYS-2028U-TN24R4T+)
Memory: 768G
CPU: 2 x Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz
NVMe SSDs: 2 x Intel DC P3700 800GB (SSDPE2MD800G4)
NVMe SSDs: 22 x Intel DC P3600 400GB (SSDPE2ME400G4)
Updated by Electric Monk over 6 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit d2c5b266b717b923ea0e28b925ddb8e66dd98b42
commit d2c5b266b717b923ea0e28b925ddb8e66dd98b42
Author: Youzhong Yang <yyang@mathworks.com>
Date: 2017-01-25T22:09:30.000Z

7367 blkdev: support block size larger than 512
Reviewed by: Garrett D'Amore <garrett@damore.org>
Reviewed by: Hans Rosenfeld <hans.rosenfeld@nexenta.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Updated by Youzhong Yang over 6 years ago
Commit d2c5b266 was backed out.
New webrev here for review:
http://cr.illumos.org/~webrev/yyang/7367-1/
A test program is attached.
Updated by Electric Monk over 6 years ago
git commit c0591a0ce5e26f7f32f7f6e8ae0ca4193cd2e50e
commit c0591a0ce5e26f7f32f7f6e8ae0ca4193cd2e50e
Author: Youzhong Yang <yyang@mathworks.com>
Date: 2017-02-27T13:17:14.000Z

7367 blkdev: support block size larger than 512
Reviewed by: Hans Rosenfeld <hans.rosenfeld@nexenta.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Garrett D'Amore <garrett@damore.org>
Approved by: Dan McDonald <danmcd@omniti.com>
Updated by Electric Monk over 6 years ago
git commit 04904ca2a4492f1b3e2ec393f82d81a9a1c9611e
commit 04904ca2a4492f1b3e2ec393f82d81a9a1c9611e
Author: Dan McDonald <danmcd@omniti.com>
Date: 2017-01-26T17:02:38.000Z

backout: 7367 blkdev: support block size larger than 512 (Needs more work.)
This reverts commit d2c5b266b717b923ea0e28b925ddb8e66dd98b42.
Updated by Joshua M. Clulow almost 5 years ago
Note that the automated commit notifications are out of order in the issue comments. Despite immediate appearances, the change was integrated, backed out, then finally integrated, as reflected in the Date: headers above.