Design document | |
---|---|
Revision | v7 |
Status | released (7.0) |
Revision history | |
v1 | Initial version |
v2 | Add details about VM migration and import |
v3 | Included and excluded use cases |
v4 | Rolling Pool Upgrade use cases |
v5 | Lots of changes to simplify the design |
v6 | Use case refresh based on simplified design |
v7 | RPU refresh based on simplified design |
The old XS 5.6-style Heterogeneous Pool feature that is based around hardware-level CPUID masking will be replaced by a safer and more flexible software-based levelling mechanism.
A VM can only be migrated safely from one host to another if both hosts offer the set of CPU features which the VM expects. If this is not the case, CPU features may appear or disappear as the VM is migrated, causing it to crash. The purpose of feature levelling is to hide features which the hosts do not have in common from the VM, so that it does not see any change in CPU capabilities when it is migrated.
Most pools start off with homogeneous hardware, but over time it may become impossible to source new hosts with the same specifications as the ones already in the pool. The main use of feature levelling is to allow such newer, more capable hosts to be added to an existing pool while preserving the ability to migrate existing VMs to any host in the pool.
The CPU levelling feature aims to both:
To make migrations safe:
Note: Due to the limitations of the old Heterogeneous Pools feature, we are not able to guarantee the safety of VMs that are migrated to a Levelling-v2 host from an older host, during a rolling pool upgrade. This is because such VMs may be using CPU features that were not captured in the old feature sets, of which we are therefore unaware. However, migrations between the same two hosts, but before the upgrade, may have already been unsafe. The promise is that we will not make migrations more unsafe during a rolling pool upgrade.
To make VMs mobile:
A user wants to add a new host to an existing XenServer pool. The new host has all the features of the existing hosts, plus extra features which the existing hosts do not. The new host will be allowed to join the pool, but its extra features will be hidden from VMs that are started on the host or migrated to it. The join does not require any host reboots.
A user wants to add a new host to an existing XenServer pool. The new host does not have all the features of the existing ones. XenCenter warns the user that adding the host to the pool is possible, but it would lower the pool’s CPU feature level. The user accepts this and continues the join. The join does not require any host reboots. VMs that are started anywhere on the pool, from now on, will only see the features of the new host (the lowest common denominator), such that they are migratable to any host in the pool, including the new one. VMs that were running before the pool join will not be migratable to the new host, because these VMs may be using features that the new host does not have. However, after a reboot, such VMs will be fully mobile.
A user wants to add a new host to an existing XenServer pool. The new host does not have all the features of the existing ones, and at the same time, it has certain features that the pool does not have (the feature sets overlap). This is essentially a combination of the two use cases above, where the pool’s CPU feature level will be downgraded to the intersection of the feature sets of the pool and the new host. The join does not require any host reboots.
A user wants to upgrade or repair the hardware of a host in an existing XenServer pool. After upgrade the host has all the features it used to have, plus extra features which other hosts in the pool do not have. The extra features are masked out and the host resumes its place in the pool when it is booted up again.
A user wants to upgrade or repair the hardware of a host in an existing XenServer pool. After the upgrade the host has fewer features than it used to have. When the host is booted up again, the pool’s CPU feature level will be automatically lowered, and the user will be alerted to this fact (through the usual alerting mechanism).
A user wants to remove a host from an existing XenServer pool. The host will be removed as normal after any VMs on it have been migrated away. The feature set offered by the pool will be automatically re-levelled upwards in case the host which was removed was the least capable in the pool, and additional features common to the remaining hosts will be unmasked.
A VM which was running on the pool before the upgrade is expected to continue to run afterwards. However, when the VM is migrated to an upgraded host, some of the CPU features it had been using might disappear, either because they are not offered by the host or because the new feature-levelling mechanism hides them. To have the best chance for such a VM to successfully migrate (see the note under “Principles for Migration”), it will be given a temporary VM-level feature set providing all of the destination’s CPU features that were unknown to XenServer before the upgrade. When the VM is rebooted it will inherit the pool-level feature set.
A VM which is started during the upgrade will be given the current pool-level feature set. The pool-level feature set may drop after the VM is started, as more hosts are upgraded and re-join the pool; however, the VM is guaranteed to be able to migrate to any host which has already been upgraded. If the VM is started on the master, there is a risk that it may only be able to run on that host.
To allow the VMs with grandfathered-in flags to be migrated around in the pool, the intra-pool VM migration pre-checks will compare the VM’s feature flags to the target host’s flags, not the pool flags. This will maximise the chance that a VM can be migrated somewhere in a heterogeneous pool, particularly in the case where only a few hosts in the pool do not have features which the VMs require.
To allow cross-pool migration, including to a pool of a higher XenServer version, we will still check the VM’s requirements against the pool-level features of the target pool. This is to avoid the possibility that we migrate a VM to an ‘island’ in the other pool, from which it cannot be migrated any further.
- `host.cpu_info` is a field of type `(string -> string) map` that contains information about the CPUs in a host. It contains the following keys: `cpu_count`, `socket_count`, `vendor`, `speed`, `modelname`, `family`, `model`, `stepping`, `flags`, `features`, `features_after_reboot`, `physical_features` and `maskable`.
  - The keys `features_after_reboot`, `physical_features` and `maskable` are specific to the old hardware-based masking and will be removed.
  - The `features` key will continue to hold the current CPU features that the host is able to use. In practice, these features will be available to Xen itself and dom0; guests may only see a subset. The current format is a string of four 32-bit words represented as four groups of 8 hexadecimal digits, separated by dashes. This will change to an arbitrary number of 32-bit words. Each bit at a particular position (starting from the left) still refers to a distinct CPU feature (`1`: feature is present; `0`: feature is absent), and feature strings may be compared between hosts. The old format simply becomes a special (4-word) case of the new format, and bits in the same position may be compared between old and new feature strings.
  - `features_pv` will be added, representing the subset of `features` that the host is able to offer to a PV guest.
  - `features_hvm` will be added, representing the subset of `features` that the host is able to offer to an HVM guest.
- `pool.cpu_info` of type `(string -> string) map` (read only) will be added. It will contain:
  - `vendor`: the common CPU vendor across all hosts in the pool.
  - `features_pv`: the intersection of `features_pv` across all hosts in the pool, representing the feature set that a PV guest will see when started on the pool.
  - `features_hvm`: the intersection of `features_hvm` across all hosts in the pool, representing the feature set that an HVM guest will see when started on the pool.
  - `cpu_count`: the total number of CPU cores in the pool.
  - `socket_count`: the total number of CPU sockets in the pool.
- The `pool.other_config:cpuid_feature_mask` override key will no longer have any effect on pool join or VM migration.
- `VM.last_boot_CPU_flags` will be updated to the new format (see `host.cpu_info:features`). It will still contain the feature set that the VM was started with as well as the vendor (under the `features` and `vendor` keys respectively).
- `pool.join` currently requires that the CPU vendor and feature set (according to `host.cpu_info:vendor` and `host.cpu_info:features`) of the joining host are equal to those of the pool master. This requirement will be loosened to mandate only equality in CPU vendor:
  - `host.cpu_info:vendor` equals `pool.cpu_info:vendor`.
  - `POOL_HOSTS_NOT_HOMOGENEOUS` with `reason` argument `"CPUs differ"` will remain the error that is raised if the pool join fails due to incompatible CPU vendors.
  - The `pool.other_config:cpuid_feature_mask` override key will no longer have any effect.
- `host.set_cpu_features` and `host.reset_cpu_features` will be removed: it is no longer possible to use the old method of CPU feature masking (CPU feature sets are controlled automatically by xapi). Calls will fail with `MESSAGE_REMOVED`.
- For migration within a pool, the destination host must satisfy `host.cpu_info:vendor` = `VM.last_boot_CPU_flags:vendor` and `host.cpu_info:features_{pv,hvm}` ⊇ `VM.last_boot_CPU_flags:features`. A `VM_INCOMPATIBLE_WITH_THIS_HOST` error will be returned otherwise (as happens today).
- For cross-pool migration, the target pool must satisfy `pool.cpu_info:vendor` = `VM.last_boot_CPU_flags:vendor` and `pool.cpu_info:features_{pv,hvm}` ⊇ `VM.last_boot_CPU_flags:features`.
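
The vendor and superset conditions above can be illustrated with a minimal sketch. It is not xapi's actual code: the `cpu` record and the names `features_of_string`, `is_subset` and `migration_allowed` are hypothetical, and it assumes that a feature string shorter than the one it is compared with is treated as if padded with zero words.

```ocaml
(* A feature set is a list of 32-bit words, held in int64 for portability. *)
type features = int64 list

(* Parse "xxxxxxxx-xxxxxxxx-..." into words; raises Failure on bad hex. *)
let features_of_string (s : string) : features =
  String.split_on_char '-' s
  |> List.map (fun w -> Int64.of_string ("0x" ^ w))

(* Assumption: pad the shorter set with zero words, so that bits at the same
   position can be compared between old (4-word) and longer new strings. *)
let rec zip_padded (a : features) (b : features) : (int64 * int64) list =
  match a, b with
  | [], [] -> []
  | x :: xs, [] -> (x, 0L) :: zip_padded xs []
  | [], y :: ys -> (0L, y) :: zip_padded [] ys
  | x :: xs, y :: ys -> (x, y) :: zip_padded xs ys

(* [is_subset vm host] is true when every bit set in [vm] is also set in [host]. *)
let is_subset (vm : features) (host : features) : bool =
  List.for_all
    (fun (v, h) -> Int64.equal (Int64.logand v (Int64.lognot h)) 0L)
    (zip_padded vm host)

(* Hypothetical record standing in for the vendor/features pair of a
   cpu_info or last_boot_CPU_flags map. *)
type cpu = { vendor : string; features : features }

(* Same vendor, and the target offers at least the features the VM has seen. *)
let migration_allowed ~(vm : cpu) ~(target : cpu) : bool =
  String.equal vm.vendor target.vendor && is_subset vm.features target.features
```

For a migration within a pool, `target` would be built from the destination host's `features_pv` or `features_hvm`; for a cross-pool migration it would be built from the target pool's `pool.cpu_info` instead.
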
The following changes to the `xe` CLI will be made:

- `xe host-cpu-info` (as well as `xe host-param-list` and friends) will return the fields of `host.cpu_info` as described above.
- `xe host-set-cpu-features` and `xe host-reset-cpu-features` will be removed.
- `xe host-get-cpu-features` will still return the value of `host.cpu_info:features` for a given host.

The old `xc_get_boot_cpufeatures` hypercall will be removed, and replaced by two new functions, which are available to xenopsd through the Xenctrl module:

```ocaml
external get_levelling_caps : handle -> int64 = "stub_xc_get_levelling_caps"
type featureset_index = Featureset_host | Featureset_pv | Featureset_hvm
external get_featureset : handle -> featureset_index -> int64 array = "stub_xc_get_featureset"
```

In particular, the `get_featureset` function will be used by xapi/xenopsd to ask Xen which are the widest sets of CPU features that it can offer to a VM (PV or HVM). I don’t think there is a use for `get_levelling_caps` yet.
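
As a rough illustration of how the `int64 array` result might be turned into the dash-separated feature strings, here is a minimal sketch. `fake_get_featureset` is a made-up stand-in (with made-up values) so that the example runs without Xen, `string_of_featureset` and `cpu_info_features` are hypothetical names, and the sketch assumes that each array element carries one 32-bit word in its low bits.

```ocaml
(* Copied from the declaration above so that the sketch is self-contained. *)
type featureset_index = Featureset_host | Featureset_pv | Featureset_hvm

(* Render a feature set as dash-separated groups of 8 hex digits.
   Assumption: each element holds one 32-bit word in its low bits. *)
let string_of_featureset (words : int64 array) : string =
  Array.to_list words
  |> List.map (fun w -> Printf.sprintf "%08Lx" (Int64.logand w 0xFFFFFFFFL))
  |> String.concat "-"

(* Stand-in for the real binding; the values are invented for illustration. *)
let fake_get_featureset = function
  | Featureset_host -> [| 0x17cbfbffL; 0xf7fa3223L |]
  | Featureset_pv -> [| 0x17c9cbf5L; 0xf6f83203L |]
  | Featureset_hvm -> [| 0x17cbfbffL; 0xf7fa3223L |]

(* Build the three feature keys of a cpu_info-style map. *)
let cpu_info_features (get : featureset_index -> int64 array) :
    (string * string) list =
  [ ("features", string_of_featureset (get Featureset_host));
    ("features_pv", string_of_featureset (get Featureset_pv));
    ("features_hvm", string_of_featureset (get Featureset_hvm)) ]
```

In real code the `get` argument would be the Xenctrl function partially applied to a handle.
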
- `Host.cpu_info` is the type that contains all the fields that need to go into the `host.cpu_info` field in the xapi DB. The type already exists but is unused. Add the function `HOST.get_cpu_info` to obtain an instance of the type. Some code from xapi and the cpuid.ml from xen-api-libs can be reused.
- The feature set is passed as a `featureset` platform key (`Vm.t.platformdata`), which xenopsd will write to xenstore along with the other platform keys (no code change needed in xenopsd). Xenguest will pick this up when a domain is created, and will apply the CPUID policy to the domain. This has the effect of masking out features that the host may have, but which have a `0` in the feature set bitmap.
- […] `xc/domain.ml`.
- Update the `Create_misc.create_host_cpu` function to use the new xenopsd call.
- If the host’s features have dropped below the pool level, lower the pool level by updating `pool.cpu_info.features_{pv,hvm}`. Newly started VMs will inherit the new level; already running VMs will not be affected, but will not be able to migrate to this host.
- An alert will be raised to notify the user of the downgrade: `pool_cpu_features_downgraded`.
- On VM start, inherit the feature set from the pool (`pool.cpu_info.features_{pv,hvm}`) and set `VM.last_boot_CPU_flags` (`cpuid_helpers.ml`).
- The domain will be started with this feature set enabled, by writing the feature set string to `platformdata` (see above).
- Currently, the migration checks compare `VM.last_boot_CPU_flags` of the VM to migrate with `host.cpu_info` of the receiving host. Migration is only allowed if the CPU vendors are the same, and `host.cpu_info:features` ⊇ `VM.last_boot_CPU_flags:features`. The check can be overridden by setting the `force` argument to `true`.
- For migrations within a pool, these checks will use the appropriate `features_pv` or `features_hvm` field.
- For cross-pool migrations, these checks will use `pool.cpu_info` (`features_pv` or `features_hvm` depending on how the VM was booted) rather than `host.cpu_info`.
- `VM.last_boot_CPU_flags` will be maintained, and the new domain will be started with the same CPU feature set enabled, by writing the feature set string to `platformdata` (see above).
- When a VM arrives from a host running an older XenServer version, `VM.last_boot_CPU_flags` will be extended with the extra bits in `host.cpu_info:features_{pv,hvm}`, i.e. the widest feature set that can possibly be granted to the VM (just in case the VM was using any of these features before the migration).
- […] `xc_get_featureset` hypercall). However, the CPU features that are switched off by the new implementation are features that a VM would not have been able to actually use. We therefore need a don’t-care feature set (similar to the old `pool.other_config:cpuid_feature_mask` key) with bits that we may ignore in migration checks, and switch off after the migration. This will be a xapi config file option.

The `VM.last_boot_CPU_flags` field must be upgraded to the new format (only really needed for VMs that were suspended while exported; `preserve_power_state=true`), as described above.
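
The handling of an old, shorter feature set and of the don’t-care bits could look roughly like the sketch below. It is one possible reading of the text, not the actual xapi implementation: `extend` and `clear_bits` are hypothetical names, and "the extra bits" is interpreted as the words of the host's feature set that lie beyond the length of the VM's existing set.

```ocaml
(* A feature set is a list of 32-bit words, held in int64 for portability. *)
type features = int64 list

(* Extend an old, shorter feature set with the words of [host] that lie
   beyond its length; the words the VM already has are kept unchanged. *)
let rec extend (vm : features) (host : features) : features =
  match vm, host with
  | [], extra -> extra
  | v :: vs, _ :: hs -> v :: extend vs hs
  | vs, [] -> vs

(* Clear the bits named in [dont_care]; words beyond the end of the mask
   are left alone. *)
let rec clear_bits (fs : features) (dont_care : features) : features =
  match fs, dont_care with
  | f :: fs', d :: ds -> Int64.logand f (Int64.lognot d) :: clear_bits fs' ds
  | rest, [] -> rest
  | [], _ -> []
```

Under this reading, a migration check would be run on `clear_bits vm dont_care` rather than on the VM's raw feature set, and the cleared set would also be what the VM keeps after the migration.
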

Update pool join checks according to the rules above (see `pool.join`), i.e. remove the CPU features constraints.

- The pool level (`pool.cpu_info`) will be initialised when the pool master upgrades, and automatically adjusted if needed (downwards) when slaves are upgraded, by each upgraded host’s startup sequence (as above under “Xapi startup”).
- The `VM.last_boot_CPU_flags` fields of running and suspended VMs will be “upgraded” to the new format on demand, when a VM is migrated to or resumed on an upgraded host, as described above.
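
The pool-level adjustment can be pictured as a bitwise intersection. The following is a minimal sketch under the assumption that all upgraded hosts report feature sets with the same number of words; `intersect`, `relevel` and `pool_level` are hypothetical names, not xapi functions.

```ocaml
(* A feature set is a list of 32-bit words, held in int64 for portability. *)
type features = int64 list

(* Bitwise intersection of two feature sets.
   Assumption: equal lengths; List.map2 raises Invalid_argument otherwise. *)
let intersect (a : features) (b : features) : features =
  List.map2 Int64.logand a b

(* Adjust the pool level for a host that has just started up. Returns the
   new level and whether it dropped, i.e. whether an alert is warranted. *)
let relevel ~(pool : features) ~(host : features) : features * bool =
  let level = intersect pool host in
  (level, level <> pool)

(* Level of a whole pool: the intersection over all member hosts. *)
let pool_level (hosts : features list) : features option =
  match hosts with
  | [] -> None
  | h :: rest -> Some (List.fold_left intersect h rest)
```

Recomputing `pool_level` over the remaining hosts after one is removed gives the upward re-levelling described in the host-removal use case.
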