`VcpuTasks::pause_all` is not guaranteed to pause all vCPUs successfully

Each vCPU loop has an associated Propolis `Task` that is used to tell the vCPU execution loop to pause or exit in response to some kind of event. For example, a request to reset the VM tells all vCPU threads' tasks to pause. To handle the case where a vCPU is in guest context when a pause event arrives, each vCPU `TaskHdl` is given an associated barrier function (in `VcpuTasks::new`) that will try to evict the vCPU from the guest after the task is marked as paused.

The barrier function currently tries to read a guest register. This will indeed cause an exit if the vCPU is in the guest. But if the vCPU is *about to enter* the guest, this poke will be missed, and the vCPU task won't get a chance to pause until the next time it exits. That may never happen; for example, in #559, a system handled two triple-fault reset events and then wedged after logging the following:

```
16:47:00.464Z INFO propolis-server (vcpu_tasks): vCPU released from hold
    vcpu = 0
16:47:00.464Z INFO propolis-server (vm_state_worker): State worker handled event
    outcome = Continue
16:47:00.464Z INFO propolis-server (vm_state_worker): State worker handling event
    event = Guest(VcpuSuspendTripleFault(3))
16:47:00.464Z INFO propolis-server (vm_state_worker): Resetting due to triple fault on vCPU 3
16:47:00.464Z INFO propolis-server (vm_state_worker): Resetting instance
16:47:00.464Z INFO propolis-server (vcpu_tasks): vCPU released from hold
    vcpu = 2
16:47:00.464Z INFO propolis-server (vcpu_tasks): vCPU released from hold
    vcpu = 3
16:47:00.464Z INFO propolis-server (vcpu_tasks): vCPU released from hold
    vcpu = 1
16:47:00.464Z INFO propolis-server (vcpu_tasks): vCPU paused
    vcpu = 0
```

I suspect the sequence of events was as follows:

- all four vCPUs were resumed after handling a triple fault reset
- vCPU 0 entered the guest but the other vCPU threads did not
- the state driver dequeued another triple fault reset event (see https://github.com/oxidecomputer/propolis/issues/559#issuecomment-1785855762)
- the state driver asked to pause all four vCPU tasks
- the vCPU barrier function successfully kicked vCPU 0 out of the guest and caused it to pause...
- ...but the barrier function had no effect on vCPUs 1-3, because they had not entered the guest yet
- vCPUs 1-3 then entered the guest and never exited, because they had just been reset and were waiting for INIT interrupts from vCPU 0, which was paused
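
The suspected interleaving can be replayed with a toy model. This is only a sketch of the hypothesis, not Propolis code: `Phase`, `ToyVcpu`, `poke`, `step`, and `pause_all` are all hypothetical names, and the model assumes a vCPU that is about to enter the guest has already checked its task state and will not check it again until its next exit.

```rust
/// Hypothetical phases of a vCPU thread in this toy model.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Phase {
    /// Task state already checked; the next step enters the guest.
    AboutToEnter,
    /// Running guest code.
    InGuest,
    /// Exited the guest, saw the pause request, and paused.
    Paused,
}

struct ToyVcpu {
    phase: Phase,
}

impl ToyVcpu {
    fn new(phase: Phase) -> Self {
        Self { phase }
    }

    /// Models the current barrier: it only has an effect on a vCPU
    /// that is in the guest at the moment of the poke.
    fn poke(&mut self) {
        if self.phase == Phase::InGuest {
            self.phase = Phase::Paused;
        }
    }

    /// Advances the vCPU thread by one step.
    fn step(&mut self) {
        if self.phase == Phase::AboutToEnter {
            self.phase = Phase::InGuest;
        }
    }
}

/// Models a pause request that pokes every vCPU exactly once.
fn pause_all(vcpus: &mut [ToyVcpu]) {
    for vcpu in vcpus.iter_mut() {
        vcpu.poke();
    }
}
```

Starting with vCPU 0 in `InGuest` and vCPUs 1-3 in `AboutToEnter`, calling `pause_all` and then stepping each vCPU leaves vCPU 0 `Paused` and the other three in `InGuest`, which matches the log above.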

To fix this, we (probably) need a more reliable way to tell the kernel VMM that the next attempt to enter the guest should exit immediately, so that the state driver can be sure that when it asks to pause vCPUs, they will evaluate their task states at least once more, irrespective of what they're doing when the pause request arrives.
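
One way to picture the desired semantics is a "sticky" exit request that persists until the next entry attempt consumes it. The sketch below models this in userspace with an atomic flag. It is not the bhyve or Propolis API: `VcpuGate`, `request_exit`, `try_enter`, and `Entry` are hypothetical names, and a real implementation would need the kernel VMM to perform the check as part of guest entry.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical stand-in for per-vCPU state shared between the
/// thread requesting a pause and the vCPU thread.
struct VcpuGate {
    exit_requested: AtomicBool,
}

/// Outcome of one attempt to enter the guest in this model.
#[derive(Debug, PartialEq)]
enum Entry {
    /// No request was pending; the vCPU would run guest code.
    EnteredGuest,
    /// A pending request forced an immediate return to userspace.
    ExitedImmediately,
}

impl VcpuGate {
    fn new() -> Self {
        Self { exit_requested: AtomicBool::new(false) }
    }

    /// Called by the pauser. The request stays set until an entry
    /// attempt consumes it, so it cannot be missed by a vCPU that
    /// has not entered the guest yet.
    fn request_exit(&self) {
        self.exit_requested.store(true, Ordering::SeqCst);
    }

    /// Called by the vCPU thread on every attempt to enter the guest.
    /// The swap atomically reads and clears the pending request.
    fn try_enter(&self) -> Entry {
        if self.exit_requested.swap(false, Ordering::SeqCst) {
            Entry::ExitedImmediately
        } else {
            Entry::EnteredGuest
        }
    }
}
```

With this shape, a `request_exit` issued before the vCPU reaches `try_enter` still causes the next entry to return `ExitedImmediately`, giving the vCPU loop a chance to re-evaluate its task state. The request is consumed once, so the following entry proceeds normally.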

Note that this is orthogonal to two other related problems stemming from #559:

- bhyve should provide enough information in VM_SUSPEND exits to allow Propolis to queue only a single event on triple-fault (see the above-linked comment)
- Propolis should do a better job of deduplicating/discarding events that came from old "generations" of a VM (i.e. when a system is reset, vCPU events from before the reset should be discarded)
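
The second point could look roughly like the following sketch, in which each event carries the VM generation it was raised in and the queue drops events from earlier generations. All names here (`EventKind`, `TaggedEvent`, `EventQueue`, `pop_current`, `reset`) are hypothetical and chosen for illustration; Propolis's actual event types are not shown in this issue.

```rust
use std::collections::VecDeque;

/// Hypothetical event kind; only the triple-fault case is modeled.
#[derive(Debug, Clone, Copy, PartialEq)]
enum EventKind {
    TripleFault { vcpu: u32 },
}

/// An event tagged with the VM generation in which it was raised.
#[derive(Debug, Clone, Copy, PartialEq)]
struct TaggedEvent {
    generation: u64,
    kind: EventKind,
}

/// Hypothetical queue that discards events from old generations.
struct EventQueue {
    current_generation: u64,
    pending: VecDeque<TaggedEvent>,
}

impl EventQueue {
    fn new() -> Self {
        Self { current_generation: 0, pending: VecDeque::new() }
    }

    /// Generation that newly raised events should be tagged with.
    fn generation(&self) -> u64 {
        self.current_generation
    }

    fn push(&mut self, event: TaggedEvent) {
        self.pending.push_back(event);
    }

    /// Returns the next event from the current generation, silently
    /// dropping any stale events queued ahead of it.
    fn pop_current(&mut self) -> Option<TaggedEvent> {
        while let Some(event) = self.pending.pop_front() {
            if event.generation == self.current_generation {
                return Some(event);
            }
        }
        None
    }

    /// Called when the VM is reset; events still tagged with the
    /// previous generation become stale.
    fn reset(&mut self) {
        self.current_generation += 1;
    }
}
```

In this model, if two triple-fault events are queued in the same generation and handling the first one resets the VM, the second is discarded by `pop_current` instead of triggering another reset.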
