Feature #13380
Add virtio-9p (aka VirtFS) filesystem sharing to bhyve
Status: Closed
% Done: 100%
Description
FreeBSD bhyve has this feature as of this commit.
commit 8294a283a164da94cf7c7681404c1994754565d9
Author: jceel <jceel@FreeBSD.org>
Date:   Sat Oct 3 19:05:13 2020 +0000

    Add virtio-9p (aka VirtFS) filesystem sharing to bhyve.
Bringing this into illumos bhyve involves first porting the lib9p library (https://github.com/conclusiveeng/lib9p).
Related issues
Updated by Andy Fiddaman over 2 years ago
- Status changed from New to In Progress
- Assignee set to Andy Fiddaman
- % Done changed from 0 to 80
Updated by Andy Fiddaman about 2 years ago
- Blocks Feature #14081: bhyve upstream sync 2021 September added
Updated by Jorge Schrauwen about 2 years ago
Andy asked for a bit of an explanation of how I am currently using virtio-9p on OmniOS.
So for normal zones (lipkg, pkgsrc, ...) I delegate a dataset, rpool/vmdata/<name>, to the zone, which can then create child datasets for all actual data (and configuration).
e.g. my docs zone's dataset (runs httpd + hugo + git repo):
root@jupiter:~# zfs list -r -o name,used,avail,refer,zoned,mountpoint rpool/vmdata/docs
NAME                                     USED  AVAIL  REFER  ZONED  MOUNTPOINT
rpool/vmdata/docs                       1015M  4.01G   192K     on  none
rpool/vmdata/docs/git                    248K   128M   248K     on  /srv/git/home
rpool/vmdata/docs/httpd                 1.52M  4.01G   952K     on  /opt/httpd
rpool/vmdata/docs/repos                  890M  4.01G   208K     on  /srv/git/repo
rpool/vmdata/docs/repos/base_config.git 2.76M  4.01G  2.76M     on  /srv/git/repo/base_config.git
rpool/vmdata/docs/repos/docs.git         887M  4.01G   884M     on  /srv/git/repo/docs.git
rpool/vmdata/docs/www                    123M  4.01G   192K     on  /srv/www
rpool/vmdata/docs/www/docs.acheron.be    123M  4.01G   120M     on  /srv/www/docs.acheron.be
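For reference, the delegation itself lives in the zone configuration; a minimal sketch using the docs zone and dataset from above (illustrative, not copied from my setup):

# Create the per-zone dataset and delegate it, so the zone can manage its own
# child datasets:
zfs create rpool/vmdata/docs
zonecfg -z docs "add dataset; set name=rpool/vmdata/docs; end"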
This makes it relatively easy to back up using zrepl, and also to rebuild the zone if need be: I can just restore the dataset tree somewhere, install a few applications, and I am good to go. It also allows me to do file-based restores from snapshots, which is very handy.
When switching to OmniOS I also switched two of my LX zones over to bhyve VMs running Ubuntu 20.04, where I do something similar.
root@jupiter:~# zfs list -r -o name,used,avail,refer,zoned,mountpoint rpool/vmdata/ares
NAME                                    USED  AVAIL  REFER  ZONED  MOUNTPOINT
rpool/vmdata/ares                      31.8M  76.0G   192K     on  /vol
rpool/vmdata/ares/httpd                1.22M  76.0G   640K     on  /vol/httpd
rpool/vmdata/ares/radarr               29.6M   994M  25.1M     on  /vol/radarr
rpool/vmdata/ares/sonarr                192K  1024M   192K     on  /vol/sonarr
rpool/vmdata/ares/www                   592K  76.0G   272K     on  /vol/www
rpool/vmdata/ares/www/ares.acheron.be   192K  76.0G   192K     on  /vol/www/ares.acheron.be
I pass rpool/vmdata/ares to the VM and then pass four virtio-9p filesystems to it.
To make it easy for myself, I created rpool/vmdata/ares like this:
zfs create -o quota=25G -o mountpoint=/vol -o zoned=on rpool/vmdata/ares
I can then simply create 'volumes' by creating a dataset under rpool/vmdata/ares; it ends up mounted at /vol/<dsname>, which I then pass to the VM. E.g. rpool/vmdata/ares/httpd gets passed as httpd,/vol/httpd.
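To sketch how a sharename/path pair maps onto the underlying pieces (the PCI slot number is illustrative, and how the option is plumbed through the zone configuration is omitted here):

# Each 'volume' is just a child dataset; it inherits mountpoint=/vol from the
# parent, so it shows up at /vol/httpd:
zfs create rpool/vmdata/ares/httpd

# At the bhyve level, "httpd,/vol/httpd" corresponds to a virtio-9p device of
# the form sharename=path, e.g.:
#   bhyve ... -s 5,virtio-9p,httpd=/vol/httpd ... <vmname>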
Inside the VM I have the following in my fstab:
httpd /opt/httpd 9p rw,relatime,dirsync,uname=root,cache=mmap,access=client,trans=virtio,_netdev 0 0
This, similar to my docs zone, holds all my httpd configuration data; the actual vhost data lives on /vol/www, which is mounted the same way but under /srv/www.
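For completeness, a one-shot mount of the www share inside the guest would look roughly like this (a sketch using the same options as the fstab entry above; normally this just lives in /etc/fstab):

# Mount the 'www' virtio-9p share at /srv/www:
mount -t 9p -o rw,dirsync,uname=root,cache=mmap,access=client,trans=virtio www /srv/www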
This gets me a flow similar to zones with regard to data and config backups, which is a huge improvement over adding a second zvol as a disk and then formatting it inside the VM.
Granted, while performance is adequate it is slower than using NFSv4, but overall it is much, much more convenient and I don't have to deal with uid/gid mapping or the networking side of things.
I hope that explains a bit of the how, and mostly the why, of me using virtio-9p over, say, an extra zvol or NFS.
Updated by Andy Fiddaman about 2 years ago
In addition to the testing that Jorge has been doing, I have done some soak/load testing and targeted tests.
All of my testing has been in a Debian 11 guest, using its 9mount package to mount the share.
For the load testing, I checked out the Linux kernel source onto a 9p-backed mountpoint and built the kernel in a loop for 24 hours with no obvious problems (the build completed successfully). It was not fast, but it worked.
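The soak test amounts to a loop along these lines (a sketch of the approach; the kernel tree location, configuration, and job count are assumptions, not the exact commands used):

# Build the kernel repeatedly on the 9p-backed mountpoint (here /a) and stop on
# the first failure:
cd /a/linux
make defconfig
while :; do
    make clean
    make -j"$(nproc)" || break
done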
I then went through and tested each backend call in turn using filesystem commands, checking the files on the backend dataset from the global zone to observe the changes which had been made, where appropriate. For example, comparing the output of the stat command from within and outside:
root@bhyvetest:/a# stat fred
  File: fred
  Size: 0               Blocks: 1          IO Block: 131072 regular empty file
Device: 24h/36d         Inode: 5           Links: 1
Access: (0666/-rw-rw-rw-)  Uid: ( 1000/  debian)   Gid: (    0/    root)
Access: 2021-10-05 10:47:59.668927000 +0000
Modify: 2021-10-05 10:47:59.668927180 +0000
Change: 2021-10-05 10:48:12.615647951 +0000
 Birth: -

theeo# stat fred
  File: fred
  Size: 0               Blocks: 1          IO Block: 131072 regular empty file
Device: 10a00010034h/1142461366324d      Inode: 3          Links: 1
Access: (0666/-rw-rw-rw-)  Uid: ( 1000/ UNKNOWN)   Gid: (    0/    root)
Access: 2021-10-05 10:47:59.668927000 +0000
Modify: 2021-10-05 10:47:59.668927180 +0000
Change: 2021-10-05 10:48:12.615647951 +0000
 Birth: 2021-10-05 10:47:59.667717963 +0000
For operations such as statvfs, I also compared results from within and outside the guest:
root@bhyvetest:/a# df -h .
Filesystem      Size  Used Avail Use% Mounted on
womble          6.4T   73M  6.4T   1% /a

theeo# df -h .
Filesystem           Size   Used  Available Capacity  Mounted on
data/womble        10.57T 72.48M      6.37T     1%    /zones/bhyvetest/root/womble
Updated by Andy Fiddaman about 2 years ago
Some more details on the testing from the RTI discussion:
When I first started working on this I went through all of the VFS calls, checked that they were working, and fixed a number of bugs there. Not all of the implemented calls are supported by the Debian driver that I'm using in the guest, so I also used FreeBSD during that initial work. I additionally tested that, when the device is backed by a dataset, setting 'setuid' and 'devices' to off on that dataset has the right effect on what the guest sees; similarly for a lofs mount with the nodevices option.
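For illustration, those checks were of this general shape (a sketch; data/womble is the dataset from the earlier output, and the lofs paths are hypothetical):

# With the share backed by a dataset, turn off setuid and device support on the
# backing dataset and confirm the guest sees the expected restricted behaviour:
zfs set setuid=off devices=off data/womble

# Equivalent check for a loopback (lofs) mount shared into the zone:
mount -F lofs -o nodevices /export/share /zones/bhyvetest/root/share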
The 9p extended attribute operations (xattrwalk and xattrcreate) are currently not supported by this implementation.
Updated by Electric Monk about 2 years ago
- Status changed from In Progress to Closed
- % Done changed from 80 to 100
git commit aa693e996c2928c92cccd8a3efe91373e85a6967
commit aa693e996c2928c92cccd8a3efe91373e85a6967
Author: Jason King <jason.brian.king@gmail.com>
Date:   2021-10-07T09:11:03.000Z

    13380 Add virtio-9p (aka VirtFS) filesystem sharing to bhyve
    Portions contributed by: Andy Fiddaman <andy@omnios.org>
    Reviewed by: Jason King <jason.brian.king@gmail.com>
    Reviewed by: Jorge Schrauwen <sjorge@blackdot.be>
    Approved by: Robert Mustacchi <rm@fingolfin.org>