Some operations performed by Xenopsd are blocking, for example:

- suspend/resume/migrate (which wait for helper subprocesses)
- device hotplug and unplug (which wait for Xenstore watches or udev scripts)

We want to be able to expose the progress of such operations to clients and to allow an operation which appears stuck to be cancelled.
A task has a state, which may be Pending, Completed or Failed:
```ocaml
type async_result = unit

type completion_t = {
  duration : float;
  result : async_result option
}

type state =
  | Pending of float
  | Completed of completion_t
  | Failed of Rpc.t
```
When a task is Failed, we associate it with a marshalled exception (a value of type `Rpc.t`). This exception must be one from the set defined in the `Xenops_interface`. To see how they are marshalled, see `Xenops_server`.
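To make the shape of such a marshalled value concrete, here is a small hand-rolled sketch (not Xenopsd's actual encoding; the exception names are merely modelled on the interface's error set):

```ocaml
(* Illustrative only: a hand-rolled encoding of Xenopsd-style errors into
   an Rpc.t value (using the ocaml-rpc library's Rpc module). The real
   marshalling lives in Xenops_server / Xenops_interface. *)

exception Cancelled of string                 (* task id *)
exception Does_not_exist of string * string   (* class, id *)

(* Encode an exception as a variant-like Rpc.t: a tag plus its arguments. *)
let rpc_of_exn = function
  | Cancelled task_id ->
      Rpc.Enum [ Rpc.String "Cancelled"; Rpc.String task_id ]
  | Does_not_exist (cls, id) ->
      Rpc.Enum [ Rpc.String "Does_not_exist"; Rpc.String cls; Rpc.String id ]
  | e ->
      Rpc.Enum [ Rpc.String "Internal_error"; Rpc.String (Printexc.to_string e) ]

let () =
  match rpc_of_exn (Cancelled "42") with
  | Rpc.Enum (Rpc.String tag :: _) -> print_endline ("marshalled tag: " ^ tag)
  | _ -> ()
```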
From the point of view of a client, a Task has the following immutable type (which can be queried with `Task.stat`):
```ocaml
type t = {
  id: id;
  dbg: string;
  ctime: float;
  state: state;
  subtasks: (string * state) list;
  debug_info: (string * string) list;
}
```
where:

- `id` is a unique identifier generated by Xenopsd, which clients use to refer to the Task;
- `dbg` is a debug string supplied by the client, used to associate log lines with the operation that spawned them;
- `ctime` is the Task's creation time;
- `state` is the current state, as defined above;
- `subtasks` lists named sub-operations together with their states, mainly for debugging;
- `debug_info` contains additional key/value debug data, for example the cancel-point counter described below.
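For illustration, a client might poll this record until the Task leaves the Pending state. The `stat` parameter below is a stand-in for the real `TASK.stat` call, the types are the ones shown above (with `id` assumed to be a string), and the float carried by `Pending` is read here as a progress fraction:

```ocaml
(* Hypothetical client-side poll loop; not part of Xenopsd itself. *)
let rec wait_for_task ~(stat : id -> t) (task : id) =
  match (stat task).state with
  | Pending progress ->
      Printf.printf "still running: %3.0f%%\n%!" (progress *. 100.);
      Unix.sleepf 0.5;
      wait_for_task ~stat task
  | Completed { duration; _ } ->
      Printf.printf "completed in %.1f seconds\n" duration
  | Failed _error ->
      print_endline "failed: inspect the marshalled error"
```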
Internally, Xenopsd uses a mutable record type to track Task state. This is broadly similar to the interface type above, except that the state is mutable (so the Task can be updated as it runs) and the record carries extra machinery for cancellation: a `cancelling` boolean, a list of cancel callbacks and the cancel-point counter, all described below.
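A rough sketch of what such an internal record could look like (the field names are illustrative, not Xenopsd's exact definition; `state` is the type defined earlier):

```ocaml
(* Illustrative internal task record: mutable state plus cancellation
   machinery. Xenopsd's real definition lives in Xenops_task. *)
type task_handle = {
  id : string;
  mutable state : state;                            (* updated as the task runs *)
  mutable cancelling : bool;                        (* set by TASK.cancel *)
  mutable cancel_callbacks : (unit -> unit) list;   (* unblock blocked waits *)
  mutable cancel_points_seen : int;                 (* counter of cancel points passed *)
}
```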
The Tasks are intended to represent activities associated with in-memory queues and threads. Therefore the active Tasks are kept in memory in a map, and will be lost over a process restart. This is desirable since we will also lose the queued items and the threads, so there is no need to resync on start.
Note that every operation must ensure that the state of the system is recoverable on restart by not leaving it in an invalid state. It is not necessary to either guarantee to complete or roll-back a Task. Tasks are not expected to be transactional.
All Tasks returned by API functions are created as part of the enqueue functions: `queue_operation_*`. Even operations which are performed internally are normally wrapped in Tasks by the function `immediate_operation`.
A queued operation will be processed by one of the queue worker threads. The worker executes the function stored in `task.Xenops_task.run`, taking care to catch exceptions and to update `task.Xenops_task.state`.
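A minimal sketch of that worker step, reusing the illustrative `task_handle` record above (the real logic is in Xenops_server):

```ocaml
(* Run one task body, catching exceptions and recording the outcome.
   [rpc_of_exn] is whatever marshals exceptions to Rpc.t. *)
let run_one (rpc_of_exn : exn -> Rpc.t) (task : task_handle) (body : unit -> unit) =
  let start = Unix.gettimeofday () in
  try
    body ();   (* i.e. the function stored in task.Xenops_task.run *)
    task.state <- Completed { duration = Unix.gettimeofday () -. start; result = None }
  with e ->
    task.state <- Failed (rpc_of_exn e)
```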
Task implementations must update their progress as they work. For the common case of a compound operation like `VM_start`, which is decomposed into multiple “micro-ops” (e.g. `VM_create`, `VM_build`), there is a useful helper function `perform_atomics` which divides the progress ‘bar’ into sections, where each “micro-op” can have a different size (weight). A progress callback function is passed into each Xenopsd backend function so that progress can be updated with fine granularity. For example, note the arguments to `B.VM.save`.
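To make the weighting concrete, here is a toy version of the idea rather than the real `perform_atomics`: each “micro-op” owns a slice of the overall progress bar proportional to its weight, and its own progress callback is rescaled into that slice.

```ocaml
(* Toy weighted-progress calculation. Each element of [ops] is a
   (weight, op) pair, where [op] receives a per-op progress callback. *)
let perform_weighted ~(set_progress : float -> unit)
    (ops : (float * ((float -> unit) -> unit)) list) =
  let total = List.fold_left (fun acc (w, _) -> acc +. w) 0. ops in
  let done_so_far = ref 0. in
  List.iter
    (fun (weight, op) ->
      (* progress within this op (0..1) is rescaled onto the whole bar *)
      op (fun p -> set_progress ((!done_so_far +. (p *. weight)) /. total));
      done_so_far := !done_so_far +. weight;
      set_progress (!done_so_far /. total))
    ops

let () =
  perform_weighted
    ~set_progress:(fun p -> Printf.printf "progress: %.2f\n" p)
    [ (1., (fun cb -> cb 0.5; cb 1.0))     (* e.g. a cheap VM_create *)
    ; (9., (fun cb -> cb 0.25; cb 1.0)) ]  (* e.g. VM_build: nine times the weight *)
```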
Clients are expected to destroy Tasks they are responsible for creating. Xenopsd cannot do this on their behalf because it does not know if they have successfully queried the Task status/result.
When Xenopsd is a client of itself, it will take care to destroy the Task properly; see for example `immediate_operation`.
The goal of cancellation is to unstick a blocked operation and to return the system to some valid state, not any valid state in particular. Xenopsd does not treat operations as transactions; when an operation is cancelled it may complete fully, abort entirely, or stop part-way through in some other valid state which the client must re-query to discover.
Xenopsd will never leave the system in an invalid state after cancellation.
Every Xenopsd operation should unblock and return the system to a valid state within a reasonable amount of time after a cancel request. This should be as quick as possible, but up to 30s may be acceptable. Bear in mind that a human is probably impatiently watching a UI that says “please wait” and has no notion of progress itself. Keep it quick!
Cancellation is triggered by `TASK.cancel`, which calls `cancel`. This sets the task’s `cancelling` boolean and runs any cancel callbacks that have been registered. Implementations respond to cancellation by periodically checking the `cancelling` boolean while they are running, and by registering cancel callbacks which can unblock them while they are waiting on an external event.
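A minimal sketch of both sides of this contract, reusing the illustrative `task_handle` record from earlier (the real logic lives in Xenops_task):

```ocaml
exception Cancelled of string   (* stands in for the interface's Cancelled error *)

(* Requesting cancellation: set the flag and run any registered callbacks. *)
let cancel (task : task_handle) =
  task.cancelling <- true;
  List.iter (fun unblock -> unblock ()) task.cancel_callbacks

(* A running implementation polls the flag at convenient points ("cancel
   points", see below), counting how many it has passed. *)
let check_cancelling (task : task_handle) =
  task.cancel_points_seen <- task.cancel_points_seen + 1;
  if task.cancelling then raise (Cancelled task.id)

(* A blocked implementation registers a callback that can unblock it, e.g.
   delete a Xenstore "cancel path" or signal a subprocess. *)
let with_cancel (task : task_handle) (unblock : unit -> unit) (f : unit -> 'a) =
  task.cancel_callbacks <- unblock :: task.cancel_callbacks;
  Fun.protect
    ~finally:(fun () ->
      task.cancel_callbacks <-
        List.filter (fun cb -> cb != unblock) task.cancel_callbacks)
    f
```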
Xenopsd’s libxc backend can block in 2 different ways, and therefore has 2 different types of cancel callback:
Xenstore watches are used for device hotplug and unplug. Xenopsd has to wait for the backend or for a udev script to do something. If that blocks then we need a way to cancel the watch. The easiest way to cancel a watch is to watch an additional path (a “cancel path”) and delete it; see `cancellable_watch`. The “cancel paths” are placed within the VM’s Xenstore directory so that cleanup code which does `xenstore-rm` will automatically “cancel” all outstanding watches. Note that we trigger a cancel by deleting the path rather than creating one: creating entries would race with the cleanup’s delete and leave orphaned Xenstore entries.
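The shape of the idea, with an assumed `watch_for_paths` helper standing in for the real Xenstore client (this is not the `cancellable_watch` implementation itself); the Task’s cancel callback is assumed to delete `cancel_path`:

```ocaml
exception Cancelled of string

(* [watch_for_paths paths] is assumed to block until one of [paths] changes
   (including being deleted) and to return the path that fired. *)
let wait_for_backend ~task_id ~(watch_for_paths : string list -> string)
    ~(backend_path : string) ~(cancel_path : string) =
  (* Watching both paths means that deleting the cancel path (directly, or
     via xenstore-rm of the VM directory) unblocks us too. *)
  let fired = watch_for_paths [ backend_path; cancel_path ] in
  if fired = cancel_path then raise (Cancelled task_id)
```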
Subprocesses are used for suspend/resume/migrate. Xenopsd hands file descriptors to libxenguest by running a subprocess and passing the fds to it. Xenopsd therefore gets the process id and can send it a signal to cancel it. See `Cancellable_subprocess.run`.
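A sketch of the same pattern for subprocesses; `on_cancel` and `cancelled` are assumed hooks into the current Task’s cancellation state (cf. the earlier sketch), not real Xenopsd functions:

```ocaml
exception Cancelled of string

(* Run a helper process; if the Task is cancelled while we wait, send the
   child a signal. The real version is Cancellable_subprocess.run. *)
let run_subprocess ~task_id ~(on_cancel : (unit -> unit) -> unit)
    ~(cancelled : unit -> bool) prog args =
  let pid = Unix.create_process prog args Unix.stdin Unix.stdout Unix.stderr in
  on_cancel (fun () -> try Unix.kill pid Sys.sigterm with Unix.Unix_error _ -> ());
  let _, status = Unix.waitpid [] pid in
  if cancelled () then raise (Cancelled task_id);
  status
```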
Cancellation is difficult to test, as it is completely asynchronous. Therefore
Xenopsd has some built-in cancellation testing infrastructure known as “cancel points”.
A “cancel point” is a point in the code where a `Cancelled` exception could be thrown, either by checking the `cancelling` boolean or as a side-effect of a cancel callback. The `check_cancelling` function increments a counter every time it passes one of these points, and this value is returned to clients in the `Task.debug_info`.
A test harness runs a series of operations. Each operation is first run all the way through to completion to discover the total number of cancel points. The operation is then re-run with a request to cancel at a particular point. The test then waits for the system to stabilise and verifies that it appears to be in a valid state.
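One way such a harness could be driven is sketched below; `run_op`, `count_cancel_points` and `system_is_valid` are hypothetical hooks, not the real test code:

```ocaml
(* First count the cancel points by running to completion, then re-run the
   operation requesting a cancel at each point in turn and check that the
   system still looks valid afterwards. *)
let test_cancel_points ~(run_op : cancel_at:int option -> unit)
    ~(count_cancel_points : unit -> int) ~(system_is_valid : unit -> bool) =
  run_op ~cancel_at:None;                 (* dry run: no cancellation *)
  let total = count_cancel_points () in
  for point = 1 to total do
    run_op ~cancel_at:(Some point);       (* request a cancel at the n-th point *)
    assert (system_is_valid ())           (* e.g. power state and devices are sane *)
  done
```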
The client which creates a Task must destroy it once the Task has finished and the result has been processed. What if a client like xapi is restarted while a Task is running?
We assume that, if xapi is talking to a xenopsd, then xapi completely owns it. Therefore xapi should destroy any completed tasks that it doesn’t recognise.
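For illustration, such a clean-up could look like this; `list_tasks`, `destroy` and `known` are hypothetical stand-ins for the relevant RPCs and for xapi’s own record of task ids, and `t`/`id`/`state` are the client-visible types shown earlier:

```ocaml
(* Hypothetical clean-up on xapi restart: destroy any finished xenopsd Task
   that xapi does not recognise, leaving running tasks alone. *)
let gc_unrecognised_tasks ~(list_tasks : unit -> t list)
    ~(destroy : id -> unit) ~(known : id -> bool) =
  List.iter
    (fun (task : t) ->
      match task.state with
      | Completed _ | Failed _ when not (known task.id) -> destroy task.id
      | _ -> ())
    (list_tasks ())
```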
If a user wishes to manage VMs with xenopsd in parallel with xapi, the user should run a separate xenopsd.