Abstract:
An example method of scheduling a workload in a virtualized computing system including a host cluster having a virtualization layer directly executing on hardware platforms of hosts is described. The virtualization layer supports execution of virtual machines (VMs) and is integrated with an orchestration control plane. The method includes: receiving, at the orchestration control plane, a workload specification for the workload; selecting, at the orchestration control plane, a plurality of nodes for the workload based on the workload specification, each of the plurality of nodes implemented by a host of the hosts; selecting, by the orchestration control plane in cooperation with a virtualization management server managing the host cluster, a node of the plurality of nodes; and deploying, by the orchestration control plane in cooperation with the virtualization management server, the workload on a host in the host cluster implementing the selected node.
Abstract:
Memory performance in a computer system that implements large page mapping is improved even when memory is scarce by identifying page sharing opportunities within the large pages at the granularity of small pages and breaking up the large pages so that small pages within the large page can be freed up through page sharing. In addition, the number of small page sharing opportunities within the large pages can be used to estimate the total amount of memory that could be reclaimed through page sharing.
Abstract:
A hypervisor exchange, e.g., an upgrade, can include consolidating resident virtual machines into a single host virtual machine, exchanging an old hypervisor with a new (upgraded) hypervisor, and disassociating the virtual resident virtual machines by migrating them to the new hypervisor. The consolidating can involve migrating the resident virtual machines from the old hypervisor to a guest hypervisor on the host virtual machine. The exchange can involve: 1) suspending the host virtual machine before the exchange; and 2) resuming the host virtual machine after the exchange; or migrating the host virtual machine from a partition including the old hypervisor to a partition hosting the new hypervisor. Either way, an exchange (upgrade) is achieve without requiring a bandwidth consuming migration over a network to a standby machine.
Abstract:
A first hypervisor uses a first version of a virtual-memory file system (VMemFS) suspends virtual machines. A second hypervisor uses a instance of the VMemFS, the version of which may be the same or different from the first version. The VMemFS is designed so that an instance of the same or a later version of the VMemFS can read and ingest information in memory written to memory by another instance of the VMemFS. Accordingly, the second hypervisor resumes the virtual machines, effecting an update or other swap of hypervisors with minimal interruption. In other examples, the swapped hypervisors support process containers or simply support virtual memory.
Abstract:
Updates to nonvolatile memory pages are mirrored so that certain features of a computer system, such as live migration of applications, fault tolerance, and high availability, will be available even when nonvolatile memory is local to the computer system. Mirroring may be carried out when a cache flush instruction is executed to flush contents of the cache into nonvolatile memory. In addition, mirroring may be carried out asynchronously with respect to execution of the cache flush instruction by retrieving content that is to be mirrored from the nonvolatile memory using memory addresses of the nonvolatile memory corresponding to target memory addresses of the cache flush instruction.
Abstract:
In a system having non-uniform memory access architecture, with a plurality of nodes, memory access by entities such as virtual CPUs is estimated by invalidating a selected sub-set of memory units, and then detecting and compiling access statistics, for example by counting the page faults that arise when any virtual CPU accesses an invalidated memory unit. The entities, or pairs of entities, may then be migrated or otherwise co-located on the node for which they have greatest memory locality.
Abstract:
In a system having non-uniform memory access architecture, with a plurality of nodes, memory access by entities such as virtual CPUs is estimated by invalidating a selected sub-set of memory units, and then detecting and compiling access statistics, for example by counting the page faults that arise when any virtual CPU accesses an invalidated memory unit. The entities, or pairs of entities, may then be migrated or otherwise co-located on the node for which they have greatest memory locality.
Abstract:
Miss rate curves are constructed in a resource-efficient manner so that they can be constructed and memory management decisions can be made while the workloads are running. The resource-efficient technique includes the steps of selecting a subset of memory pages for the workload, maintaining a least recently used (LRU) data structure for the selected memory pages, detecting accesses to the selected memory pages and updating the LRU data structure in response to the detected accesses, and generating data for constructing a miss-rate curve for the workload using the LRU data structure. After a memory page is accessed, the memory page may be left untraced for a period of time, after which the memory page is retraced.
Abstract:
A memory hierarchy includes a first memory and a second memory that is at a lower position in the memory hierarchy than the first memory. A method of managing the memory hierarchy includes: observing, over a first period of time, accesses to pages of the first memory; in response to determining that no page in a first group of pages was accessed during the first period of time, moving each page in the first group of pages from the first memory to the second memory; and in response to determining that the number of pages in other groups of pages of the first memory, which were accessed during the first period of time, is less than a threshold number of pages, moving each page in the other group of pages, that was not accessed during the first period of time from the first memory to the second memory.
Abstract:
Disclosed are various embodiments for improving resiliency and performance of clustered memory. A computing device can acquire a chunk of byte-addressable memory from a cluster memory host. The computing device can then identify an active set of allocated memory pages and an inactive set of allocated memory pages for a process executing on the computing device. Next, the computing device can store the active set of allocated memory pages for the process in the memory of the computing device. Finally, the computing device can store the inactive set of allocated memory pages for the process in the chunk of byte-addressable memory of the cluster memory host.