|v2||Added details about the VDI's binary format and size, and the SR capability name.|
|v3||Tar was not needed after all!|
|v4||Add details about discovering the VDI using a new vdi_type.|
|v5||Add details about the http handlers and interaction with xapi's database|
|v6||Add details about the framing of the data within the VDI|
|v7||Redesign semantics of the rrd_updates handler|
|v8||Redesign semantics of the rrd_updates handler (again)|
|v9||Magic number change in framing format of VDI|
|v10||Add details of new APIs added to xapi and xcp-rrdd|
|v11||Remove unneeded API calls|
Xapi has RRDs to track VM- and host-level metrics. There is a desire to have SR-level RRDs as a new category, because SR stats are not specific to a certain VM or host. Examples are size and free space on the SR. While recording SR metrics is relatively straightforward within the current RRD system, the main question is where to archive them, which is what this design aims to address.
All SR types, including the existing ones, should be able to have RRDs defined for them. Some RRDs, such as a “free space” one, may make sense for multiple (if not all) SR types. However, the way to measure something like free space will be SR specific. Furthermore, it should be possible for each type of SR to have its own specialised RRDs.
It follows that each SR will need its own `xcp-rrdd` plugin, which runs on the SR master and defines and collects the stats. For the new thin-lvhd SR this could be `xenvmd` itself. The plugin registers itself with `xcp-rrdd`, so that the latter records the live stats from the plugin into RRDs.
SR-level RRDs will be archived in the SR itself, in a VDI, rather than in the local filesystem of the SR master. This way, we don’t need to worry about master failover.
The VDI will be 4MB in size. This is a little more space than we would need for the RRDs we have in mind at the moment, but will give us enough headroom for the foreseeable future. It will not have a filesystem on it for simplicity and performance. There will only be one RRD archive file for each SR (possibly containing data for multiple metrics), which is gzipped by `xcp-rrdd`, and can be copied onto the VDI.
There will be a simple framing format for the data on the VDI: a header of three 32-bit fields, followed by the payload. The header layout (offsets in bytes) is as follows:
|Offset||Type||Name||Comment|
|0||32 bit network-order int||magic||Magic number = 0x7ada7ada|
|4||32 bit network-order int||version||1|
|8||32 bit network-order int||length||length of payload|
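The framing above can be illustrated with a minimal OCaml sketch. It assumes that the payload (presumably the gzipped RRD archive) starts immediately after the 12-byte header; the function names `frame` and `unframe` and the error strings are hypothetical and not part of xapi.

```ocaml
(* Sketch of the stats-VDI framing: three 32-bit big-endian header fields
   (magic, version, payload length) followed by the payload bytes.
   Assumption: the payload begins directly at offset 12. *)

let magic = 0x7ada7adal
let version = 1l
let header_size = 12

(* Prepend the 12-byte header to an already-compressed payload. *)
let frame (payload : string) : Bytes.t =
  let len = String.length payload in
  let buf = Bytes.create (header_size + len) in
  Bytes.set_int32_be buf 0 magic;
  Bytes.set_int32_be buf 4 version;
  Bytes.set_int32_be buf 8 (Int32.of_int len);
  Bytes.blit_string payload 0 buf header_size len;
  buf

(* Validate the header and extract the payload. [buf] may be longer than
   the frame, as it would be when the whole VDI has been read. *)
let unframe (buf : Bytes.t) : (string, string) result =
  if Bytes.length buf < header_size then Error "short header"
  else if Bytes.get_int32_be buf 0 <> magic then Error "bad magic"
  else if Bytes.get_int32_be buf 4 <> version then Error "unsupported version"
  else
    let len = Int32.to_int (Bytes.get_int32_be buf 8) in
    if len < 0 || len > Bytes.length buf - header_size then
      Error "bad payload length"
    else Ok (Bytes.sub_string buf header_size len)
```

Checking the magic number and the length before reading the payload lets a reader distinguish a freshly created, still empty VDI from one that holds an archive.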
Xapi will be in charge of the lifecycle of this VDI, not the plugin or `xcp-rrdd`, which will make it a little easier to manage. Only xapi will attach/detach and read from/write to this VDI. We will keep `xcp-rrdd` as simple as possible, and have it archive to its standard path in the local file system. Xapi will then copy the RRDs in and out of the VDI.
A new value `"rrd"` in the `vdi_type` enum of the datamodel will be defined, and the `VDI.type` of the VDI will be set to that value. The storage backend will write the VDI type to the LVM metadata of the VDI, so that xapi can discover the VDI containing the SR-level RRDs when attaching an SR to a new pool. This means that SR-level RRDs are currently restricted to LVM SRs.
Because we will not write plugins for all SRs at once, and therefore do not need xapi to set up the VDI for all SRs, we will add an SR “capability” that lets a backend tell xapi whether it has the ability to record stats and will need storage for them. The capability name will be:
The SR-stats VDI will be attached/detached on `PBD.plug`/`unplug` on the SR master.
On `PBD.plug` on the SR master, if the SR has the stats capability, xapi:
- Copies the RRDs from the stats VDI to the local filesystem (asking `xcp-rrdd` where to put them).
- Tells `xcp-rrdd` about the RRDs so that it will load the RRDs and add newly recorded data to them (needs a function like `push_rrd_local` for VM-level RRDs).
On `PBD.unplug` on the SR master, if the SR has the stats capability, xapi:
- Asks `xcp-rrdd` to archive the RRDs for the SR, which it will do to the local filesystem.
Xapi’s periodic scheduler regularly triggers `xcp-rrdd` to archive the host and VM RRDs. It will need to do this for the SR ones as well. Furthermore, xapi will need to attach the stats VDI and copy the RRD archives into it (as on `PBD.unplug`).
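The ordering of these steps can be sketched as follows. Every function here is a hypothetical stand-in that only logs its name; the sketch assumes that xapi attaches the stats VDI just for the duration of each copy, and that the `push_sr_rrd` and `archive_sr_rrd` calls listed in the API section are what xapi invokes on `xcp-rrdd`.

```ocaml
(* All operations are hypothetical stand-ins for xapi internals and
   xcp-rrdd calls; they only log, so the ordering can be inspected. *)
let trace : string list ref = ref []
let log step = trace := step :: !trace

let attach_stats_vdi sr = log ("attach " ^ sr)
let detach_stats_vdi sr = log ("detach " ^ sr)
let copy_vdi_to_local sr = log ("vdi->local " ^ sr)
let copy_local_to_vdi sr = log ("local->vdi " ^ sr)
let rrdd_push_sr_rrd sr = log ("push_sr_rrd " ^ sr)
let rrdd_archive_sr_rrd sr = log ("archive_sr_rrd " ^ sr)

(* Run [f] with the stats VDI attached, detaching even if [f] raises. *)
let with_stats_vdi sr f =
  attach_stats_vdi sr;
  Fun.protect ~finally:(fun () -> detach_stats_vdi sr) f

(* On plug: bring the archive to the local filesystem, then let
   xcp-rrdd load it. *)
let on_pbd_plug ~has_stats_capability sr =
  if has_stats_capability then begin
    with_stats_vdi sr (fun () -> copy_vdi_to_local sr);
    rrdd_push_sr_rrd sr
  end

(* On unplug: have xcp-rrdd archive locally, then copy into the VDI. *)
let on_pbd_unplug ~has_stats_capability sr =
  if has_stats_capability then begin
    rrdd_archive_sr_rrd sr;
    with_stats_vdi sr (fun () -> copy_local_to_vdi sr)
  end

(* Periodic archiving follows the same sequence as unplug. *)
let periodic_archive = on_pbd_unplug
```

Wrapping the copy in `with_stats_vdi` is one way to make sure the VDI does not stay attached if a copy fails part-way.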
There will be a new handler for downloading an SR RRD:
http://<server>/sr_rrd?session_id=<SESSION HANDLE>&uuid=<SR UUID>
RRD updates for the host, VMs and SRs are handled by a single handler at `/rrd_updates`. Exactly what is returned will be determined by the parameters passed to this handler.
Whether the host RRD updates are returned is governed by the presence of `host=true` in the parameters. `host=<anything else>` or the absence of the `host` key will mean the host RRD is not returned.
Whether the VM RRD updates are returned is governed by the `vm_uuid` key in the parameters. `vm_uuid=all` will return RRD updates for all VM RRDs. `vm_uuid=xxx` will return the RRD updates for the VM with uuid `xxx`. If `vm_uuid` is `none` (or any other string which is not a valid VM UUID) then the handler will return no VM RRD updates. If the `vm_uuid` key is absent, RRD updates for all VMs will be returned.
Whether the SR RRD updates are returned is governed by the `sr_uuid` key in the parameters. `sr_uuid=all` will return RRD updates for all SR RRDs. `sr_uuid=xxx` will return the RRD updates for the SR with uuid `xxx`. If `sr_uuid` is `none` (or any other string which is not a valid SR UUID) then the handler will return no SR RRD updates. If the `sr_uuid` key is absent, no SR RRD updates will be returned.
It will be possible to mix and match these parameters. For example, passing `host=true&vm_uuid=all&sr_uuid=none` returns RRD updates for the host and all VMs, while passing `host=false&vm_uuid=none&sr_uuid=all` returns RRD updates for all SRs but nothing else.
While behaviour is defined if any of the keys `host`, `vm_uuid` or `sr_uuid` is missing, this is for backwards compatibility and it is recommended that clients specify each parameter explicitly.
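The parameter rules for `/rrd_updates` can be summarised in a small OCaml sketch. The `selection` type, the function names and the purely syntactic `is_uuid` check are assumptions made for illustration; the real handler may well decide validity by looking the UUID up instead.

```ocaml
(* Which VM or SR RRD updates a request selects. *)
type selection = All | One of string | Nothing

(* Hypothetical UUID check: 36 characters, dashes at positions 8, 13, 18
   and 23, hex digits elsewhere. *)
let is_uuid s =
  let ok i c =
    match i with
    | 8 | 13 | 18 | 23 -> c = '-'
    | _ -> (
        match c with
        | '0' .. '9' | 'a' .. 'f' | 'A' .. 'F' -> true
        | _ -> false)
  in
  let rec check i = i = 36 || (ok i s.[i] && check (i + 1)) in
  String.length s = 36 && check 0

(* [default] is what an absent key means: All for vm_uuid (backwards
   compatibility), Nothing for sr_uuid. *)
let parse_uuid_param ~default = function
  | None -> default
  | Some "all" -> All
  | Some s when is_uuid s -> One s
  | Some _ -> Nothing (* "none" or any other non-UUID string *)

(* [params] is the decoded query string as an association list.
   Returns (include host updates, VM selection, SR selection). *)
let interpret params =
  let host = List.assoc_opt "host" params = Some "true" in
  let vms = parse_uuid_param ~default:All (List.assoc_opt "vm_uuid" params) in
  let srs =
    parse_uuid_param ~default:Nothing (List.assoc_opt "sr_uuid" params)
  in
  (host, vms, srs)
```

The asymmetric defaults are what keep older clients working: a request that names none of the three keys still receives all VM updates and nothing else.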
If the SR is presenting a data source called ‘physical_utilisation’, xapi will record this periodically in its database. In order to do this, xapi will fork a thread that, every n minutes (2 suggested, but open to suggestions here), will query the attached SRs, then query RRDD for the latest data source for these, and update the database.
The utilisation of VDIs will not be updated in this way until scalability worries for RRDs are addressed.
Xapi will cache whether it is SR master for every attached SR and only attempt to update if it is the SR master.
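One pass of such a polling thread might look like the sketch below. The three hooks are hypothetical: `is_sr_master` stands for xapi's cached SR-master flag, `latest_value` for the query to RRDD, and `update_db` for the database write. Storing the value as a 64-bit integer is also an assumption.

```ocaml
(* One polling pass over the attached SRs. All three labelled arguments
   are hypothetical hooks supplied by the caller. *)
let poll_once ~is_sr_master ~latest_value ~update_db attached_srs =
  List.iter
    (fun sr ->
      (* Only the SR master updates the database. *)
      if is_sr_master sr then
        match latest_value ~sr ~ds:"physical_utilisation" with
        | Some v -> update_db ~sr (Int64.of_float v)
        | None -> () (* this SR does not present the data source *))
    attached_srs
```

The forked thread would call `poll_once` and then sleep for the chosen interval (2 minutes is the value suggested above) before repeating.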
Get the filesystem location where SR RRDs are archived:
val sr_rrds_path : uid:string -> string
Archive the SR RRDs to the filesystem:
val archive_sr_rrd : sr_uuid:string -> unit
Load the SR RRDs from the filesystem:
val push_sr_rrd : sr_uuid:string -> unit