xc_vcpu_setaffinity()
Introduction
In the Xen hypervisor, each vCPU has:
- A soft affinity: the list of pCPUs where a vCPU prefers to run.
  - It can be used to make vCPUs prefer to run on a set of pCPUs, for example the pCPUs of a NUMA node, but if those are already busy, the Credit scheduler can still ignore the soft affinity.
  - A typical use case is a NUMA machine, where the soft affinity of a domain's vCPUs should be set equal to the pCPUs of the NUMA node where the domain's memory shall be placed.
  - See the description of the NUMA feature for more details.
- A hard affinity, also known as pinning: the list of pCPUs where a vCPU is allowed to run.
  - Hard affinity is currently not used for NUMA placement, but it can be configured manually for a given domain, either using `xe VCPUs-params:mask=` or the API.
  - For example, the vCPU pinning of a template can be configured with:
    `xe template-param-set uuid=<template_uuid> vCPUs-params:mask=1,2,3`
  - There are also host-level `guest_VCPUs_params`, which are used by `host-cpu-tune` to exclusively pin Dom0 and guests (so that their pCPUs never overlap).
    Note: this is currently not supported by the NUMA code: NUMA placement could pick a node that has reduced capacity or is unavailable due to the host mask that `host-cpu-tune` has set.
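To make the difference concrete, the sketch below shows how both kinds of affinity are expressed through the libxenctrl call described in the next section. It is only a hedged illustration: the flag and helper names (`xc_cpumap_alloc()`, `XEN_VCPUAFFINITY_SOFT`/`HARD`) are taken from current Xen headers and should be checked against the Xen version in use, and error handling is kept minimal.

```c
#include <stdlib.h>
#include <xenctrl.h>

/* Hedged sketch: give vCPU 'vcpu' of domain 'domid' a soft affinity for the
 * pCPUs listed in 'pcpus' (e.g. the pCPUs of one NUMA node). Passing
 * XEN_VCPUAFFINITY_HARD instead would set the pinning (hard affinity). */
static int set_soft_affinity(xc_interface *xch, uint32_t domid, int vcpu,
                             const int *pcpus, int npcpus)
{
    xc_cpumap_t hard = xc_cpumap_alloc(xch);  /* returned maps are zeroed */
    xc_cpumap_t soft = xc_cpumap_alloc(xch);
    int rc = -1;

    if ( hard && soft )
    {
        for ( int i = 0; i < npcpus; i++ )
            soft[pcpus[i] / 8] |= 1 << (pcpus[i] % 8);

        /* Both maps are passed to keep the sketch simple; only the soft
         * affinity is applied because of the XEN_VCPUAFFINITY_SOFT flag. */
        rc = xc_vcpu_setaffinity(xch, domid, vcpu, hard, soft,
                                 XEN_VCPUAFFINITY_SOFT);
    }
    free(hard);
    free(soft);
    return rc;
}
```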
Purpose
The libxenctrl library call xc_vcpu_setaffinity()
controls the pCPU affinity of the given vCPU.
xenguest uses it when building domains if xenopsd added vCPU affinity
information to the XenStore platform data path
platform/vcpu/#domid/affinity of the domain.
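For context, the sketch below shows roughly how such platform data can be read back out of XenStore with libxenstore, similar to what xenguest conceptually does. It is a hedged illustration only: the key layout under `platform/vcpu` follows the description above, and the value is printed uninterpreted because the authoritative parsing is xenguest's `configure_vcpus()` code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <xenstore.h>

/* Hedged sketch: list any affinity hints xenopsd wrote for a domain under
 * its platform/vcpu/ XenStore subtree. */
int main(int argc, char **argv)
{
    unsigned int domid = argc > 1 ? (unsigned int)atoi(argv[1]) : 0;
    char dirpath[64], keypath[128];
    unsigned int n = 0, len;
    struct xs_handle *xs = xs_open(0);

    if ( !xs )
        return 1;

    snprintf(dirpath, sizeof(dirpath), "/local/domain/%u/platform/vcpu", domid);
    char **entries = xs_directory(xs, XBT_NULL, dirpath, &n);

    for ( unsigned int i = 0; entries && i < n; i++ )
    {
        snprintf(keypath, sizeof(keypath), "%s/%s/affinity",
                 dirpath, entries[i]);
        char *val = xs_read(xs, XBT_NULL, keypath, &len);
        if ( val )
        {
            printf("%s = %s\n", keypath, val);   /* value left uninterpreted */
            free(val);
        }
    }
    free(entries);
    xs_close(xs);
    return 0;
}
```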
Updating the NUMA node affinity of a domain
Besides setting the vCPU affinity, xc_vcpu_setaffinity() can also update
the NUMA node affinity of the Xen domain:
When Xen creates a domain, it enables the domain’s d->auto_node_affinity
feature flag.
When it is enabled, setting the vCPU affinity also updates the NUMA node affinity which is used for memory allocations for the domain:
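The simplified flowchart below shows this. As an additional rough illustration in code, the following hedged, self-contained sketch models the same behaviour; it is not the actual Xen implementation, and the pCPU-to-node mapping is made up for the example:

```c
#include <stdio.h>

/* Illustrative only: affinities are plain bitmasks and every 8 pCPUs are
 * assumed to form one NUMA node. */
#define PCPUS_PER_NODE 8

struct vcpu_aff { unsigned long hard, soft; };

/* Which (made-up) NUMA nodes the pCPUs in 'cpus' belong to. */
static unsigned long nodes_of(unsigned long cpus)
{
    unsigned long nodes = 0;

    for ( int cpu = 0; cpu < (int)(8 * sizeof(unsigned long)); cpu++ )
        if ( cpus & (1UL << cpu) )
            nodes |= 1UL << (cpu / PCPUS_PER_NODE);
    return nodes;
}

/* Mirrors the idea of domain_update_node_aff(): while auto_node_affinity is
 * enabled, derive the domain's node_affinity from its vCPU affinities. */
static unsigned long auto_node_affinity_mask(const struct vcpu_aff *v,
                                             int nr_vcpus)
{
    unsigned long hard = 0, soft = 0;

    for ( int i = 0; i < nr_vcpus; i++ )
    {
        hard |= v[i].hard;
        soft |= v[i].soft;
    }
    /* Prefer the soft affinities, but only where the vCPUs may actually run. */
    if ( soft & hard )
        hard &= soft;
    return nodes_of(hard);
}

int main(void)
{
    /* Two vCPUs, soft affinity on pCPUs 8-15 (node 1), allowed to run anywhere. */
    struct vcpu_aff v[2] = {
        { .hard = ~0UL, .soft = 0xff00UL },
        { .hard = ~0UL, .soft = 0xff00UL },
    };

    printf("node_affinity mask: %#lx\n", auto_node_affinity_mask(v, 2));
    return 0;
}
```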
Simplified flowchart
```mermaid
flowchart TD
subgraph libxenctrl
    xc_vcpu_setaffinity("<tt>xc_vcpu_setaffinity()")--hypercall-->xen
end
subgraph xen[Xen Hypervisor]
direction LR
vcpu_set_affinity("<tt>vcpu_set_affinity()</tt><br>set the vCPU affinity")
    -->check_auto_node{"Is the domain's<br><tt>auto_node_affinity</tt><br>enabled?"}
        --"yes<br>(default)"-->
            auto_node_affinity("Set the<br>domain's<br><tt>node_affinity</tt>
            mask as well<br>(used for further<br>NUMA memory<br>allocation)")
click xc_vcpu_setaffinity
"https://github.com/xen-project/xen/blob/7cf16387/tools/libs/ctrl/xc_domain.c#L199-L250" _blank
click vcpu_set_affinity
"https://github.com/xen-project/xen/blob/7cf16387/xen/common/sched/core.c#L1353-L1393" _blank
click domain_update_node_aff
"https://github.com/xen-project/xen/blob/7cf16387/xen/common/sched/core.c#L1809-L1876" _blank
click check_auto_node
"https://github.com/xen-project/xen/blob/7cf16387/xen/common/sched/core.c#L1840-L1870" _blank
click auto_node_affinity
"https://github.com/xen-project/xen/blob/7cf16387/xen/common/sched/core.c#L1867-L1869" _blank
end
```
Current use by xenopsd and xenguest
When Host.numa_affinity_policy is set to
best_effort,
xenopsd attempts NUMA node placement
when building new VMs and instructs
xenguest
to set the vCPU affinity of the domain.
With the domain’s auto_node_affinity flag enabled by default in Xen,
this automatically also sets the d->node_affinity mask of the domain.
This then causes the Xen memory allocator to prefer the NUMA nodes in the
d->node_affinity NUMA node mask when allocating memory.
For completeness: this applies unless Xen's allocation function
alloc_heap_pages() receives a specific NUMA node in its memflags
argument when it is called.
See xc_domain_node_setaffinity() for more
information about another way to set the node_affinity NUMA node mask
of Xen domains and more depth on how it is used in Xen.
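As a hedged illustration of that alternative, the sketch below sets an explicit NUMA node affinity for a domain via `xc_domain_node_setaffinity()`. The helper names are taken from current libxenctrl headers and should be checked against the Xen version in use:

```c
#include <stdlib.h>
#include <xenctrl.h>

/* Hedged sketch: restrict a domain's memory allocations to one NUMA node. */
static int restrict_memory_to_node(xc_interface *xch, uint32_t domid, int node)
{
    xc_nodemap_t nodemap = xc_nodemap_alloc(xch);   /* zeroed nodemap */
    int rc;

    if ( !nodemap )
        return -1;

    nodemap[node / 8] |= 1 << (node % 8);           /* select one node */
    rc = xc_domain_node_setaffinity(xch, domid, nodemap);
    free(nodemap);
    return rc;
}
```

Note that, as the flowchart in the next section shows, setting an explicit node affinity (anything other than "all" nodes) clears the domain's auto_node_affinity flag, so later vCPU affinity changes no longer update node_affinity automatically.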
Flowchart of its current use for NUMA affinity
In the flowchart, two code paths are set in bold:
- Show the path when `Host.numa_affinity_policy` is the default (off) in `xenopsd`.
- Show the default path of `xc_vcpu_setaffinity(XEN_VCPUAFFINITY_SOFT)` in Xen when the domain's `auto_node_affinity` flag is enabled (the default), to show how the vCPU affinity update also updates the domain's `node_affinity` in this default case.
xenguest uses the XenStore to read the static domain configuration that it needs to build the domain.
```mermaid
flowchart TD
subgraph VM.create["xenopsd VM.create"]
    %% Is xe vCPU-params:mask= set? If yes, write to Xenstore:
    is_xe_vCPUparams_mask_set?{"
            Is
            <tt>xe vCPU-params:mask=</tt>
            set? Example: <tt>1,2,3</tt>
            (Is used to enable vCPU<br>hard-affinity)
        "} --"yes"--> set_hard_affinity("Write hard-affinity to XenStore:
                        <tt>platform/vcpu/#domid/affinity</tt>
                        (xenguest will read this and other configuration data
                         from Xenstore)")
end
subgraph VM.build["xenopsd VM.build"]
    %% Labels of the decision nodes
    is_Host.numa_affinity_policy_set?{
        Is<p><tt>Host.numa_affinity_policy</tt><p>set?}
    has_hard_affinity?{
        Is hard-affinity configured in <p><tt>platform/vcpu/#domid/affinity</tt>?}
    %% Connections from VM.create:
    set_hard_affinity --> is_Host.numa_affinity_policy_set?
    is_xe_vCPUparams_mask_set? == "no"==> is_Host.numa_affinity_policy_set?
    %% The Subgraph itself:
    %% Check Host.numa_affinity_policy
    is_Host.numa_affinity_policy_set?
        %% If Host.numa_affinity_policy is "best_effort":
        -- Host.numa_affinity_policy is<p><tt>best_effort -->
            %% If has_hard_affinity is set, skip numa_placement:
            has_hard_affinity?
                --"yes"-->exec_xenguest
            %% If has_hard_affinity is not set, run numa_placement:
            has_hard_affinity?
                --"no"-->numa_placement-->exec_xenguest
        %% If Host.numa_affinity_policy is off (default, for now),
        %% skip NUMA placement:
        is_Host.numa_affinity_policy_set?
            =="default: disabled"==>
            exec_xenguest
end
%% xenguest subgraph
subgraph xenguest
    exec_xenguest
        ==> stub_xc_hvm_build("<tt>stub_xc_hvm_build()")
            ==> configure_vcpus("<tt>configure_vcpus()")
                %% Decision
                ==> set_hard_affinity?{"
                        Is <tt>platform/<br>vcpu/#domid/affinity</tt>
                        set?"}
end
%% do_domctl Hypercalls
numa_placement
    --Set the NUMA placement using soft-affinity-->
    XEN_VCPUAFFINITY_SOFT("<tt>xc_vcpu_setaffinity(SOFT)")
        ==> do_domctl
set_hard_affinity?
    --yes-->
    XEN_VCPUAFFINITY_HARD("<tt>xc_vcpu_setaffinity(HARD)")
        --> do_domctl
xc_domain_node_setaffinity("<tt>xc_domain_node_setaffinity()</tt>
                            and
                            <tt>xc_domain_node_getaffinity()")
                                <--> do_domctl
%% Xen subgraph
subgraph xen[Xen Hypervisor]
    subgraph domain_update_node_affinity["domain_update_node_affinity()"]
        domain_update_node_aff("<tt>domain_update_node_aff()")
        ==> check_auto_node{"Is domain's<br><tt>auto_node_affinity</tt><br>enabled?"}
          =="yes (default)"==>set_node_affinity_from_vcpu_affinities("
            Calculate the domain's <tt>node_affinity</tt> mask from vCPU affinity
            (used for further NUMA memory allocation for the domain)")
    end
    do_domctl{"do_domctl()<br>op->cmd=?"}
        ==XEN_DOMCTL_setvcpuaffinity==>
            vcpu_set_affinity("<tt>vcpu_set_affinity()</tt><br>set the vCPU affinity")
                ==>domain_update_node_aff
    do_domctl
        --XEN_DOMCTL_setnodeaffinity (not used currently)
            -->is_new_affinity_all_nodes?
    subgraph  domain_set_node_affinity["domain_set_node_affinity()"]
        is_new_affinity_all_nodes?{new_affinity<br>is #34;all#34;?}
            --is #34;all#34;
                --> enable_auto_node_affinity("<tt>auto_node_affinity=1")
                    --> domain_update_node_aff
        is_new_affinity_all_nodes?
            --not #34;all#34;
                --> disable_auto_node_affinity("<tt>auto_node_affinity=0")
                    --> domain_update_node_aff
    end
%% setting and getting the struct domain's node_affinity:
disable_auto_node_affinity
    --node_affinity=new_affinity-->
        domain_node_affinity
set_node_affinity_from_vcpu_affinities
    ==> domain_node_affinity@{ shape: bow-rect,label: "domain: node_affinity" }
        --XEN_DOMCTL_getnodeaffinity--> do_domctl
end
click is_Host.numa_affinity_policy_set?
"https://github.com/xapi-project/xen-api/blob/90ef043c1f3a3bc20f1c5d3ccaaf6affadc07983/ocaml/xenopsd/xc/domain.ml#L951-L962"
click numa_placement
"https://github.com/xapi-project/xen-api/blob/90ef043c/ocaml/xenopsd/xc/domain.ml#L862-L897"
click stub_xc_hvm_build
"https://github.com/xenserver/xen.pg/blob/65c0438b/patches/xenguest.patch#L2329-L2436" _blank
click get_flags
"https://github.com/xenserver/xen.pg/blob/65c0438b/patches/xenguest.patch#L1164-L1288" _blank
click do_domctl
"https://github.com/xen-project/xen/blob/7cf163879/xen/common/domctl.c#L282-L894" _blank
click domain_set_node_affinity
"https://github.com/xen-project/xen/blob/7cf163879/xen/common/domain.c#L943-L970" _blank
click configure_vcpus
"https://github.com/xenserver/xen.pg/blob/65c0438b/patches/xenguest.patch#L1297-L1348" _blank
click set_hard_affinity?
"https://github.com/xenserver/xen.pg/blob/65c0438b/patches/xenguest.patch#L1305-L1326" _blank
click xc_vcpu_setaffinity
"https://github.com/xen-project/xen/blob/7cf16387/tools/libs/ctrl/xc_domain.c#L199-L250" _blank
click vcpu_set_affinity
"https://github.com/xen-project/xen/blob/7cf16387/xen/common/sched/core.c#L1353-L1393" _blank
click domain_update_node_aff
"https://github.com/xen-project/xen/blob/7cf16387/xen/common/sched/core.c#L1809-L1876" _blank
click check_auto_node
"https://github.com/xen-project/xen/blob/7cf16387/xen/common/sched/core.c#L1840-L1870" _blank
click set_node_affinity_from_vcpu_affinities
"https://github.com/xen-project/xen/blob/7cf16387/xen/common/sched/core.c#L1867-L1869" _blank