Bug #13981

bhyve emulation should set dirty bits

Added by Patrick Mooney almost 2 years ago. Updated over 1 year ago.

Status: Closed
Priority: Normal
Category: bhyve
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:
External Bug:

Description

With #13932 landed and #13896 on the way, bhyve should be able to set dirty bits in the nested page tables when accessing them for device emulation. While the CPU will set those dirty bits for memory writes from guest context (provided the hardware supports it), emulated devices, both in the kernel and userland, need their writes to guest memory to be accounted for as well. This is necessary for any semblance of correctness when determining which pages need to be copied as part of a VM migration.
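
To illustrate the accounting problem described above, here is a minimal sketch of dirty tracking for emulated writes. It is not the bhyve implementation: it assumes a flat guest memory buffer and a separate per-page bitmap instead of dirty bits in nested page tables, and every name in it (`guest_mem`, `emul_write`, `mark_dirty`, `page_is_dirty`, the `GM_*` constants) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes for the sketch; real guests are far larger. */
#define GM_PAGE_SHIFT   12
#define GM_PAGE_SIZE    (1UL << GM_PAGE_SHIFT)
#define GM_PAGES        16

/* Hypothetical stand-in for guest memory plus one dirty bit per page. */
struct guest_mem {
        uint8_t data[GM_PAGES * GM_PAGE_SIZE];
        uint8_t dirty[GM_PAGES / 8];
};

/* Mark every page touched by the range [gpa, gpa + len) as dirty. */
static void
mark_dirty(struct guest_mem *gm, uint64_t gpa, size_t len)
{
        if (len == 0)
                return;

        uint64_t first = gpa >> GM_PAGE_SHIFT;
        uint64_t last = (gpa + len - 1) >> GM_PAGE_SHIFT;

        assert(last < GM_PAGES);
        for (uint64_t p = first; p <= last; p++)
                gm->dirty[p / 8] |= (uint8_t)(1u << (p % 8));
}

/*
 * Write on behalf of an emulated device.  Unlike a write made from guest
 * context, no CPU sets a dirty bit here, so the emulation must do it.
 * Returns 0 on success, -1 if the range falls outside guest memory.
 */
static int
emul_write(struct guest_mem *gm, uint64_t gpa, const void *src, size_t len)
{
        if (len > sizeof (gm->data) || gpa > sizeof (gm->data) - len)
                return (-1);

        memcpy(&gm->data[gpa], src, len);
        mark_dirty(gm, gpa, len);
        return (0);
}

/* Report whether a page has been written since the bitmap was cleared. */
static int
page_is_dirty(const struct guest_mem *gm, uint64_t pfn)
{
        assert(pfn < GM_PAGES);
        return ((gm->dirty[pfn / 8] >> (pfn % 8)) & 1);
}
```

In this sketch, a migration pass would copy only the pages for which `page_is_dirty()` returns 1. A write that straddles a page boundary marks both pages, which is the case an emulated device can hit with an unaligned buffer.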


Related issues

Related to illumos gate - Feature #13932: improve bhyve second level page table support (Closed)
Related to illumos gate - Bug #13896: bhyve VM interfaces should be better fit (Closed, Patrick Mooney)

Actions #1

Updated by Patrick Mooney almost 2 years ago

  • Related to Feature #13932: improve bhyve second level page table support added
Actions #2

Updated by Patrick Mooney almost 2 years ago

  • Related to Bug #13896: bhyve VM interfaces should be better fit added
Actions #3

Updated by Patrick Mooney almost 2 years ago

Cobbled together a test program to explore edge cases in the unmap/invalidate case:

#include <stdint.h>
#include <stdio.h>
#include <fcntl.h>
#include <assert.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/mman.h>

#define VM_MAX_NAMELEN          128
#define VM_MAX_SEG_NAMELEN      128

struct vm_create_req {
        char            name[VM_MAX_NAMELEN];
        uint64_t        flags;
};
struct vm_destroy_req {
        char            name[VM_MAX_NAMELEN];
};

struct vm_memmap {
        uint64_t        gpa;
        int             segid;          /* memory segment */
        int64_t         segoff;         /* offset into memory segment */
        size_t          len;            /* mmap length */
        int             prot;           /* RWX */
        int             flags;
};

struct vm_munmap {
        uint64_t        gpa;
        size_t          len;
};

struct vm_memseg {
        int             segid;
        size_t          len;
        char            name[VM_MAX_SEG_NAMELEN];
};
#define VMMCTL_IOC_BASE         (('V' << 16) | ('M' << 8))
#define VMM_IOC_BASE            (('v' << 16) | ('m' << 8))
#define VMM_LOCK_IOC_BASE       (('v' << 16) | ('l' << 8))
#define VMM_CPU_IOC_BASE        (('v' << 16) | ('p' << 8))

#define VMM_CREATE_VM           (VMMCTL_IOC_BASE | 0x01)
#define VMM_DESTROY_VM          (VMMCTL_IOC_BASE | 0x02)
#define VMM_VM_SUPPORTED        (VMMCTL_IOC_BASE | 0x03)

#define VM_ALLOC_MEMSEG         (VMM_LOCK_IOC_BASE | 0x05)
#define VM_MMAP_MEMSEG          (VMM_LOCK_IOC_BASE | 0x06)
#define VM_PMTMR_LOCATE         (VMM_LOCK_IOC_BASE | 0x07)
#define VM_MUNMAP_MEMSEG        (VMM_LOCK_IOC_BASE | 0x08)

/*
dtrace -n 'segvmm_invalidate:entry { printf("%lu %s %x %x %x\n", (unsigned long)timestamp, probefunc, arg0, arg1, arg2); self->t = 1 } hat_unload:entry /self->t/ { printf("%lu %s %x %x\n", (unsigned long)timestamp, probefunc, arg1, arg2) } segvmm_invalidate:return { self->t = 0 }'
*/

#define TEST_VM_NAME "maptest" 

int
main(int argc, char **argv)
{
        int res;

        int ctlfd = open("/dev/vmmctl", O_EXCL | O_RDWR);
        assert(ctlfd != -1);

        struct vm_destroy_req req_destroy;
        strcpy(req_destroy.name, TEST_VM_NAME);
        (void) ioctl(ctlfd, VMM_DESTROY_VM, &req_destroy);

        struct vm_create_req req_create;
        req_create.flags = 0;
        strcpy(req_create.name, TEST_VM_NAME);
        res = ioctl(ctlfd, VMM_CREATE_VM, &req_create);
        assert(res == 0);
        (void) close(ctlfd);
        int vmfd = open("/dev/vmm/" TEST_VM_NAME, O_EXCL | O_RDWR);
        assert(vmfd != -1);

        struct vm_memseg seg = {
                .segid = 0,
                .len = 0x200000,
        };
        res = ioctl(vmfd, VM_ALLOC_MEMSEG, &seg);
        assert(res == 0);

        struct vm_memmap map = {
                .gpa = 0,
                .segid = 0,
                .segoff = 0,
                .len = 0x200000,
                .prot = PROT_READ | PROT_WRITE,
        };
        res = ioctl(vmfd, VM_MMAP_MEMSEG, &map);
        assert(res == 0);

        uint64_t *data = (uint64_t *)mmap(NULL, 0x100000000, PROT_READ|PROT_WRITE, MAP_SHARED, vmfd, 0);
        /* mmap() reports failure with MAP_FAILED, not NULL */
        assert(data != MAP_FAILED);

        assert(data[0] == 0);
        assert(data[0x1000] == 0);

        struct vm_munmap unmap = {
                .gpa =  0,
                .len = 0x200000,
        };
        res = ioctl(vmfd, VM_MUNMAP_MEMSEG, &unmap);
        assert(res == 0);

        printf("should SIGSEGV now\n");
        fflush(stdout);

        assert(data[0] != 0);
}

Actions #4

Updated by Electric Monk over 1 year ago

  • Gerrit CR set to 1563
Actions #5

Updated by Patrick Mooney over 1 year ago

The primary test notes for this are included in #13896.

Actions #6

Updated by Electric Monk over 1 year ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

git commit 0153d828c132fdb1a17c11b99386a3d1b87994cf

commit  0153d828c132fdb1a17c11b99386a3d1b87994cf
Author: Patrick Mooney <pmooney@pfmooney.com>
Date:   2021-11-19T23:00:59.000Z

    13896 bhyve VM interfaces should be better fit
    13981 bhyve emulation should set dirty bits
    Reviewed by: Dan Cross <cross@oxidecomputer.com>
    Reviewed by: Joshua M. Clulow <josh@sysmgr.org>
    Approved by: Dan McDonald <danmcd@joyent.com>
