Abstract:
A technique is described for managing processor (CPU) resources in a host having virtual machines (VMs) executed thereon. A target size of a VM is determined based on its demand and CPU entitlement. If the VM's current size differs from the target size, the technique dynamically resizes the VM by increasing or decreasing the number of virtual CPUs available to it. To “deactivate” a virtual CPU, a high-priority balloon thread is launched and pinned to the virtual CPU targeted for deactivation, and the underlying hypervisor deschedules execution of that virtual CPU accordingly. To “activate” a virtual CPU and increase the number of virtual CPUs available to the VM, the corresponding balloon thread is killed.
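As a rough illustration of the sizing and balloon-thread mechanics described above, the following Python sketch simulates the resize loop. All names (Vm, target_size, resize) and the per-vCPU capacity model are assumptions for illustration; a real implementation would pin an actual high-priority thread so that the hypervisor deschedules the occupied vCPU.

    import math

    class Vm:
        def __init__(self, num_vcpus):
            self.num_vcpus = num_vcpus
            self.balloons = {}  # vCPU id -> stand-in for a pinned balloon thread

        def active_vcpus(self):
            return self.num_vcpus - len(self.balloons)

    def target_size(demand, entitlement, per_vcpu_capacity):
        # Size the VM to the smaller of its demand and its CPU entitlement,
        # rounded up to whole vCPUs (assumed formula).
        return max(1, math.ceil(min(demand, entitlement) / per_vcpu_capacity))

    def resize(vm, demand, entitlement, per_vcpu_capacity=1.0):
        target = target_size(demand, entitlement, per_vcpu_capacity)
        while vm.active_vcpus() > target:
            # "Deactivate": occupy the highest-numbered active vCPU with a
            # balloon so the hypervisor can deschedule it.
            victim = max(i for i in range(vm.num_vcpus) if i not in vm.balloons)
            vm.balloons[victim] = object()
        while vm.active_vcpus() < target and vm.balloons:
            # "Activate": kill a balloon thread to hand the vCPU back.
            vm.balloons.popitem()
        return vm.active_vcpus()

    vm = Vm(num_vcpus=8)
    print(resize(vm, demand=3.2, entitlement=6.0))  # -> 4 active vCPUs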
Abstract:
Systems and techniques are described for power-aware scheduling. One of the techniques includes monitoring execution of a plurality of groups of software threads executing on a physical machine, where the machine's physical hardware platform includes a plurality of processor packages that support a plurality of package power states, among them an independent package power state. The technique obtains a respective independent power state measure for each processor package, indicating the percentage of time that package spends in the independent package power state, and adjusts the allocation of the groups of software threads across the processor packages based in part on those measures.
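One plausible reading of the adjustment step is sketched below in Python: thread groups are consolidated onto packages that already spend little time in the independent (deep-idle) package power state, freeing high-residency packages to stay powered down. The 0.7 threshold, the data shapes, and the rebalance function itself are assumptions, not the claimed method.

    def rebalance(allocation, residency, threshold=0.7):
        """allocation: package id -> list of thread-group ids.
           residency:  package id -> fraction of time in the independent state."""
        new_alloc = {p: list(gs) for p, gs in allocation.items()}
        # Packages that mostly sit in the deep-idle state become sources...
        sources = [p for p, r in residency.items() if r >= threshold]
        # ...and the busiest (lowest-residency) remaining package absorbs them.
        sinks = sorted((p for p in residency if p not in sources),
                       key=lambda p: residency[p])
        if not sinks:
            return new_alloc
        for src in sources:
            while new_alloc[src]:
                new_alloc[sinks[0]].append(new_alloc[src].pop())
        return new_alloc

    alloc = {0: ["web"], 1: ["batch"], 2: []}
    res = {0: 0.05, 1: 0.80, 2: 0.95}
    print(rebalance(alloc, res))  # "batch" moves to package 0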
Abstract:
Techniques for optimizing virtual machine (VM) scheduling on a non-uniform cache access (NUCA) system are provided. In one set of embodiments, a hypervisor of the NUCA system can partition the virtual CPUs of each VM running on the system into logical constructs referred to as last level cache (LLC) groups, where each LLC group is sized to match (or at least not exceed) the LLC domain size of the system. The hypervisor can then place and load-balance the virtual CPUs of each VM on the system’s cores in a manner that attempts to keep virtual CPUs belonging to the same LLC group within the same LLC domain, subject to various factors such as compute load, cache contention, and so on.
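A minimal Python sketch of the grouping and placement idea follows; the greedy least-loaded policy and all names are illustrative, and a production scheduler would also weigh compute load and cache contention, as the abstract notes.

    def make_llc_groups(vcpus, llc_domain_size):
        # Partition a VM's vCPUs into groups no larger than the LLC domain.
        return [vcpus[i:i + llc_domain_size]
                for i in range(0, len(vcpus), llc_domain_size)]

    def place_groups(groups, domain_load, domain_capacity):
        """domain_load: LLC domain id -> number of vCPUs already placed."""
        placement = {}
        for group in groups:
            # Greedy: pick the least-loaded domain that can still hold the
            # whole group, keeping group members under one shared LLC.
            candidates = [d for d, load in domain_load.items()
                          if domain_capacity - load >= len(group)]
            target = min(candidates, key=lambda d: domain_load[d])
            for vcpu in group:
                placement[vcpu] = target
            domain_load[target] += len(group)
        return placement

    groups = make_llc_groups(["v0", "v1", "v2", "v3", "v4"], llc_domain_size=4)
    print(groups)  # [['v0', 'v1', 'v2', 'v3'], ['v4']]
    print(place_groups(groups, {0: 0, 1: 2}, domain_capacity=8))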
Abstract:
Techniques for concurrently supporting virtual non-uniform memory access (virtual NUMA) and CPU/memory hot-add in a virtual machine (VM) are provided. In one set of embodiments, a hypervisor of a host system can compute a node size for a virtual NUMA topology of the VM, where the node size indicates a maximum number of virtual central processing units (vCPUs) and a maximum amount of memory to be included in each virtual NUMA node. The hypervisor can further build and expose the virtual NUMA topology to the VM. Then, at a time of receiving a request to hot-add a new vCPU or memory region to the VM, the hypervisor can check whether all existing nodes in the virtual NUMA topology have reached the maximum number of vCPUs or maximum amount of memory, per the computed node size. If so, the hypervisor can create a new node with the new vCPU or memory region and add the new node to the virtual NUMA topology.
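The hot-add check lends itself to a short sketch. The Python below (the names and the single-starting-node setup are illustrative) adds a vCPU to the first non-full node and only creates a new virtual NUMA node once every existing node has reached the per-node maximum; memory hot-add would follow the same pattern.

    class VNumaNode:
        def __init__(self, vcpus=0):
            self.vcpus = vcpus

    class VNumaTopology:
        def __init__(self, max_vcpus_per_node):
            self.max_vcpus = max_vcpus_per_node
            self.nodes = [VNumaNode()]

        def hot_add_vcpu(self):
            # Fill existing nodes first; create a new node only once every
            # node has reached the per-node vCPU maximum.
            for node in self.nodes:
                if node.vcpus < self.max_vcpus:
                    node.vcpus += 1
                    return node
            new_node = VNumaNode(vcpus=1)
            self.nodes.append(new_node)
            return new_node

    topo = VNumaTopology(max_vcpus_per_node=2)
    for _ in range(5):
        topo.hot_add_vcpu()
    print([n.vcpus for n in topo.nodes])  # [2, 2, 1]: a third node was created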
Abstract:
Disclosed are various embodiments that utilize conflict cost for workload placements in datacenter environments. In some examples, a protected memory level is identified for a computing environment that includes a number of processor resources. Incompatible processor workloads are prohibited from concurrently executing on parallel processor resources, which share memory at the protected memory level. A number of conflict costs are determined for a processor workload, each based on a measure of compatibility between the processor workload and a parallel processor resource that shares a particular memory with the respective processor resource. The processor workload is then assigned to execute on the processor resource associated with the minimum conflict cost.
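The scoring step might look like the following Python sketch; the compatibility table, the sibling topology, and every function name here are invented for illustration.

    def conflict_cost(workload, cpu, siblings, running, compatibility):
        # Sum the incompatibility between the workload and whatever runs on
        # processors sharing memory at the protected level with `cpu`.
        cost = 0.0
        for sib in siblings[cpu]:
            other = running.get(sib)
            if other is not None:
                cost += 1.0 - compatibility[(workload, other)]
        return cost

    def place(workload, cpus, siblings, running, compatibility):
        free = [c for c in cpus if c not in running]
        # Assign to the free processor with the minimum conflict cost.
        return min(free, key=lambda c: conflict_cost(workload, c, siblings,
                                                     running, compatibility))

    # Two hyperthread pairs, each pair sharing memory at the protected level.
    siblings = {0: [1], 1: [0], 2: [3], 3: [2]}
    running = {1: "crypto", 3: "batch"}
    compat = {("web", "crypto"): 0.2, ("web", "batch"): 0.9}
    print(place("web", [0, 1, 2, 3], siblings, running, compat))  # -> 2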
Abstract:
Disclosed are various embodiments for distributing the load of a plurality of virtual machines across a plurality of hosts. A potential new host for a virtual machine executing on a current host is identified. A gain rate associated with migration of the virtual machine from the current host to the potential new host is calculated, as is a gain duration associated with that migration. A migration cost for migrating the virtual machine from the current host to the potential new host is then determined based on the gain rate and the gain duration. It is then determined whether the migration cost is below a predefined threshold cost. Migration of the virtual machine from the current host to the potential new host is initiated in response to a determination that the migration cost is below the predefined threshold.
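A hedged sketch of the decision in Python; the specific cost model (a one-time overhead offset by gain_rate * gain_duration) is an assumption, since the abstract only says the cost is based on those two quantities.

    def migration_cost(overhead, gain_rate, gain_duration):
        # Assumed model: one-time cost of the move, offset by the benefit
        # that accrues at gain_rate for gain_duration after the move.
        return overhead - gain_rate * gain_duration

    def should_migrate(overhead, gain_rate, gain_duration, threshold):
        return migration_cost(overhead, gain_rate, gain_duration) < threshold

    # Moving costs 100 units; the new host yields 0.5 units/s for 600 s.
    print(should_migrate(overhead=100, gain_rate=0.5,
                         gain_duration=600, threshold=0))  # True: net gain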
Abstract:
Disclosed are various embodiments for distributing the load of a plurality of virtual machines across a plurality of hosts. A first plurality of efficiency ratings for a current host of a virtual machine are calculated. A second plurality of efficiency ratings for a potential new host of the virtual machine are also calculated. The first plurality of efficiency ratings are compared to the second plurality of efficiency ratings to determine that the potential new host for the virtual machine is an optimal host for the virtual machine. Then migration of the virtual machine from the current host to the optimal host is initiated.
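One way to picture the comparison is the Python sketch below. The rating formula (normalized free capacity per resource after hypothetically placing the VM) and the all-ratings-must-improve rule are assumptions for illustration.

    def efficiency_ratings(capacity, used, vm_demand):
        # One assumed rating per resource: normalized free capacity after
        # hypothetically placing the VM on the host.
        return [(capacity[r] - used[r] - vm_demand[r]) / capacity[r]
                for r in ("cpu", "mem", "net")]

    def is_optimal(current_ratings, candidate_ratings):
        # Assumed policy: the candidate is optimal only if every rating
        # improves on the current host's.
        return all(c > r for c, r in zip(candidate_ratings, current_ratings))

    cap = {"cpu": 32.0, "mem": 128.0, "net": 10.0}
    vm = {"cpu": 4.0, "mem": 16.0, "net": 1.0}
    cur = efficiency_ratings(cap, {"cpu": 28.0, "mem": 100.0, "net": 6.0}, vm)
    cand = efficiency_ratings(cap, {"cpu": 8.0, "mem": 40.0, "net": 2.0}, vm)
    if is_optimal(cur, cand):
        print("initiate migration to the candidate host")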
Abstract:
Systems and methods for performing selection of non-uniform memory access (NUMA) nodes for mapping of virtual central processing unit (vCPU) operations to physical processors are provided. A CPU scheduler evaluates the latency between each candidate processor and the memory associated with the vCPU, as well as the size of the working set in that memory, and selects an optimal processor for execution of the vCPU based on the expected memory access latency and the characteristics of the vCPU and the processors. The systems and methods further provide for monitoring system characteristics and rescheduling vCPUs when other placements would provide improved performance and efficiency.
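The selection criterion might be sketched as a working-set-weighted expected latency, as in the Python below; the score's exact form and the two-node example are assumptions.

    def expected_latency(candidate, working_set, latency):
        """working_set: node id -> MB of the vCPU's memory on that node.
           latency[a][b]: access latency from node a to node b (ns)."""
        total = sum(working_set.values())
        return sum(mb * latency[candidate][node]
                   for node, mb in working_set.items()) / total

    def pick_node(nodes, working_set, latency):
        return min(nodes, key=lambda n: expected_latency(n, working_set, latency))

    latency = [[80, 140], [140, 80]]  # two-node machine: local vs. remote
    working_set = {0: 512, 1: 2048}   # most of the working set is on node 1
    print(pick_node([0, 1], working_set, latency))  # -> 1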
Abstract:
A host computer has one or more physical central processing units (CPUs) that support the execution of a plurality of containers, where the containers each include one or more processes. Each process of a container is assigned to execute exclusively on a corresponding physical CPU when the corresponding container is determined to be latency sensitive. The assignment of a process to execute exclusively on a corresponding physical CPU includes the migration of tasks from the corresponding physical CPU to one or more other physical CPUs of the host system, and the directing of task and interrupt processing to the one or more other physical CPUs. Tasks of the process corresponding to the container are then executed on the corresponding physical CPU.
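On Linux, the two steps (migrating tasks off the exclusive CPU and redirecting interrupt processing) can be approximated with real interfaces, as in the hedged sketch below; os.sched_setaffinity, os.sched_getaffinity, and /proc/irq/*/smp_affinity_list exist on Linux, but the container-to-process mapping is assumed and the script must run as root to take effect.

    import os

    def isolate(pid, exclusive_cpu, all_cpus):
        others = set(all_cpus) - {exclusive_cpu}
        # Migrate every other task off the exclusive CPU.
        for task in os.listdir("/proc"):
            if task.isdigit() and int(task) != pid:
                try:
                    affinity = os.sched_getaffinity(int(task))
                    if exclusive_cpu in affinity:
                        os.sched_setaffinity(int(task),
                                             (affinity - {exclusive_cpu}) or others)
                except OSError:
                    pass  # kernel threads and tasks that already exited
        # Direct interrupt processing to the remaining CPUs.
        for irq in os.listdir("/proc/irq"):
            if irq.isdigit():
                try:
                    with open(f"/proc/irq/{irq}/smp_affinity_list", "w") as f:
                        f.write(",".join(str(c) for c in sorted(others)))
                except OSError:
                    pass  # some IRQs cannot be re-steered
        # Pin the latency-sensitive process exclusively to its CPU.
        os.sched_setaffinity(pid, {exclusive_cpu})

    # Example (requires root; pid 1234 is a placeholder for the container's
    # process): isolate(pid=1234, exclusive_cpu=3, all_cpus=range(os.cpu_count()))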