Project

General

Profile

Bug #13286

bhyve ins/outs emulation misuses %rax

Added by Patrick Mooney 4 months ago. Updated 4 months ago.

Status:
Closed
Priority:
Normal
Category:
bhyve
Start date:
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
bhyve
Gerrit CR:

Description

I was debugging a guest issue where a data transfer to an emulated device via repz insb appeared to be performing one fewer transfers than expected. In this specific case, it was expected to be copying an ID string "QEMU", but the copy was completing with "EMU\0". From the perspective of userspace, where this device was being emulated, the 4 inb transfers were occurring with the expected results. It was somewhere inside the kernel logic that the handling was improper. For ins/outs operations, the bhyve instruction emulation logic uses the non-str logic to perform the actual transfer:

        if (!in) {
                vm_copyin(vm, vcpuid, copyinfo, &vie->inout.eax, bytes);
        }

        err = vie_emulate_inout_port(vie, vm, vcpuid);

        if (err == 0 && in) {
                vm_copyout(vm, vcpuid, &vie->inout.eax, copyinfo, bytes);
        }

Where vie_emulate_inout_port will attempt to perform the access via in-kernel emulation, falling back to a transfer request (and fulfillment) via userspace if required. It's notable that it doesn't return or set the value of the transfer (eax) but expects it to be properly populated on entry/return. I was drawn to look closer at it after dtracing indicated that inout.eax was not being updated as expected after the userspace fulfillment:
        if (!in) {
                val = vie->inout.eax & mask;
        }

        if (vie->inout_req_state != VR_DONE) {
                err = vm_ioport_access(vm, vcpuid, in, vie->inout.port,
                    vie->inout.bytes, &val);
        } else {
                /*
                 * This port access was handled in userspace and the result was
                 * injected in to be handled now.
                 */
                val = vie->inout_req_val;
                vie->inout_req_state = VR_NONE;
                err = 0;
        }

        if (err == ESRCH) {
                vie->status |= VIES_PENDING_INOUT;
                vie->inout_req_state = VR_PENDING;
                return (err);
        } else if (err != 0) {
                return (err);
        }

        if (in) {
                val &= mask;
                val |= (vie->inout.eax & ~mask);
                err = vm_set_register(vm, vcpuid, VM_REG_GUEST_RAX, val);
                KASSERT(err == 0, ("emulate_ioport: error %d setting guest " 
                    "rax register", err));
        }

Looking specifically at the code-path for a successful in operation, the problem now becomes more clear. Because this is the non-str logic, it's simply setting rax directly with the result, leaving inout.eax alone. While that's correct for a simple non-str in/out, it's a problem for ins:
1. The transferred value is never placed in inout.eax, meaning it will not be copied out during the vie_emulate_inout_str. Instead, it's left in the guest rax, only to be placed in inout.eax by happenstance during the next round of emulation.
2. The ins instructions are not expected to manipulate %rax at all.

If the logic in vie_emulate_inout_port is to be reused in this fashion, it should be made to act on an explicit intermediate value, rather than storing to the guest %rax. This would correct the transfer result for ins, and avoid manipulating %rax unless the instruction calls for it.

#1

Updated by Electric Monk 4 months ago

  • Gerrit CR set to 1044
#2

Updated by Patrick Mooney 4 months ago

This was tested with that experimental bootrom payload, which seems to be one of the few users of ins in such a manner. Prior to the change, it would exhibit the symptoms described in the ticket. After the change, reads of various sizes (repeat counts in rcx) are show progressing as expected, with the results making their way correctly into guest memory.

#3

Updated by Patrick Mooney 4 months ago

In addition to the specific testing (with the case that was previously failing), I performed the usual smoke-test across a wide range of guests. Considering that they all booted and ran fine prior to this fix, it was no surprise to see their behavior remain the same with it in place.

#4

Updated by Electric Monk 4 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 80 to 100

git commit 04573c73a7ab1505c46b2c4db26bfde5176dd6a5

commit  04573c73a7ab1505c46b2c4db26bfde5176dd6a5
Author: Patrick Mooney <pmooney@pfmooney.com>
Date:   2020-11-13T02:45:02.000Z

    13286 bhyve ins/outs emulation misuses %rax
    Reviewed by: Robert Mustacchi <rm@fingolfin.org>
    Reviewed by: Toomas Soome <tsoome@me.com>
    Approved by: Gordon Ross <gordon.w.ross@gmail.com>

Also available in: Atom PDF