Abstract:
Techniques for achieving application high availability via application-transparent battery-backed replication of persistent data are provided. In one set of embodiments, a computer system can detect a failure that causes an application of the computer system to stop running. In response to detecting the failure, the computer system can copy persistent data written by the application and maintained locally at the computer system to one or more remote destinations, where the copying is performed in a manner that is transparent to the application and while the computer system runs on battery power. The application can then be restarted on another computer system using the copied data.
Abstract:
Techniques for implementing high availability for persistent memory are provided. In one embodiment, a first computer system can detect an alternating current (AC) power loss/cycle event and, in response to the event, can save data in a persistent memory of the first computer system to a memory or storage device that is remote from the first computer system and is accessible by a second computer system. The first computer system can then generate a signal for the second computer system subsequently to initiating or completing the save process, thereby allowing the second computer system to restore the saved data from the memory or storage device into its own persistent memory.
Abstract:
Techniques for achieving application high availability via crash-consistent asynchronous replication of persistent data are provided. In one set of embodiments, an application running on a computer system can, during runtime of the application: write persistent data to a local nonvolatile data store of the computer system, write one or more log entries comprising the persistent data to a local log region of the computer system, and asynchronously copy the one or more log entries to one or more remote destinations. Then, upon detecting a failure that prevents the application from continuing execution, the computer system can copy the local log region or a remaining portion thereof to the one or more remote destinations, where the copying is performed while the computer system runs on battery power and where the application is restarted on another computer system using a persistent state derived from the copied log entries.
Abstract:
Systems and techniques are described for power-aware scheduling. One of the techniques includes monitoring execution of a plurality of groups of software threads executing on a physical machine, wherein the physical machine comprises a physical hardware platform that includes a plurality of processor packages having a plurality of package power states, wherein the plurality of package power states includes an independent package power state; obtaining a respective independent power state measure for each of the processor packages, wherein the independent power state measure provides a measure of a percentage of time the processor package spends in the independent package power state; and adjusting an allocation of the plurality of groups of software threads across the plurality of processor packages based in part on the independent power state measures for the packages.
Abstract:
A system and method are disclosed for improving operation of a memory scheduler operating on a host machine supporting virtual machines (VMs) in which guest operating systems and guest applications run. For each virtual machine, the host machine hypervisor categorizes memory pages into memory usage classes and estimates the total number of pages for each memory usage class. The memory scheduler uses this information to perform memory reclamation and allocation operations for each virtual machine. The memory scheduler further selects between ballooning reclamation and swapping reclamation operations based in part on the numbers of pages in each memory usage class for the virtual machine. Calls to the guest operating system provide the memory usage class information. Memory reclamation not only can improve the performance of existing VMs, but can also permit the addition of a VM on the host machine without substantially impacting the performance of the existing and new VMs.
Abstract:
The prioritization of large memory page mapping is a function of the access bits in the L1 page table. In a first phase of operation, the number of set access bits in each of the L1 page tables is counted periodically and a current count value is calculated therefrom. During the first phase, no pages are mapped large even if identified as such. After the first phase, the current count value is used to prioritize among potential large memory pages to determine which pages to map large. The system continues to calculate the current count value even after the first phase ends. When using hardware assist, the access bits in the nested page tables are used and when using software MMU, the access bits in the shadow page tables are used for large page prioritization.
Abstract:
Techniques for implementing high availability for persistent memory are provided. In one embodiment, a first computer system can detect an alternating current (AC) power loss/cycle event and, in response to the event, can save data in a persistent memory of the first computer system to a memory or storage device that is remote from the first computer system and is accessible by a second computer system. The first computer system can then generate a signal for the second computer system subsequently to initiating or completing the save process, thereby allowing the second computer system to restore the saved data from the memory or storage device into its own persistent memory.
Abstract:
Techniques for achieving application high availability via application-transparent battery-backed replication of persistent data are provided. In one set of embodiments, a computer system can detect a failure that causes an application of the computer system to stop running. In response to detecting the failure, the computer system can copy persistent data written by the application and maintained locally at the computer system to one or more remote destinations, where the copying is performed in a manner that is transparent to the application and while the computer system runs on battery power. The application can then be restarted on another computer system using the copied data.
Abstract:
Techniques for implementing high availability for persistent memory are provided. In one embodiment, a first computer system can detect an alternating current (AC) power loss/cycle event and, in response to the event, can save data in a persistent memory of the first computer system to a memory or storage device that is remote from the first computer system and is accessible by a second computer system. The first computer system can then generate a signal for the second computer system subsequently to initiating or completing the save process, thereby allowing the second computer system to restore the saved data from the memory or storage device into its own persistent memory.
Abstract:
Techniques for achieving application high availability via application-transparent battery-backed replication of persistent data are provided. In one set of embodiments, a computer system can detect a failure that causes an application of the computer system to stop running. In response to detecting the failure, the computer system can copy persistent data written by the application and maintained locally at the computer system to one or more remote destinations, where the copying is performed in a manner that is transparent to the application and while the computer system runs on battery power. The application can then be restarted on another computer system using the copied data.