Abstract:
To generate a checkpoint for a virtual machine (VM), first, while the VM is still running, a copy-on-write (COW) disk file is created pointing to a parent disk file that the VM is using. Next, the VM is stopped, the VM's memory is marked COW, the device state of the VM is saved to memory, the VM is switched to use the COW disk file, and the VM begins running again for substantially the remainder of the checkpoint generation. Next, the device state that was stored in memory and the unmodified VM memory pages are saved to a checkpoint file. Also, a copy may be made of the parent disk file for retention as part of the checkpoint, or the original parent disk file may be retained as part of the checkpoint. If a copy of the parent disk file was made, then the COW disk file may be committed to the original parent disk file.
Abstract:
A method and tangible medium embodying code for allocating resource units of an allocatable resource among a plurality of clients in a computer is described. In the method, resource units are initially distributed among the clients by assigning to each of the clients a nominal share of the allocatable resource. For each client, a current allocation of resource units is determined. A metric is evaluated for each client, the metric being a function both of the nominal share and a usage-based factor, the usage-based factor being a function of a measure of resource units that the client is actively using and a measure of resource units that the client is not actively using. A resource unit can be reclaimed from a client when the metric for that client meets a predetermined criterion.
Abstract:
Miss rate curves are constructed in a resource-efficient manner so that they can be constructed and memory management decisions can be made while the workloads are running. The resource-efficient technique includes the steps of selecting a subset of memory pages for the workload, maintaining a least recently used (LRU) data structure for the selected memory pages, detecting accesses to the selected memory pages and updating the LRU data structure in response to the detected accesses, and generating data for constructing a miss-rate curve for the workload using the LRU data structure. After a memory page is accessed, the memory page may be left untraced for a period of time, after which the memory page is retraced.
Abstract:
An anomaly in a shared input/output (IO) resource that is accessed by a plurality hosts or clients is detected when a host that is not bound by any QoS policy presents large workloads to a shared IO resource that is also accessed by hosts or clients that are governed by QoS policy. The anomaly detection triggers a response from the hosts or clients as a way to protect against the effect of the anomaly. The response is an increase in window sizes. The window sizes of the hosts or clients may be increased to the maximum window size or in proportion to their QoS shares.
Abstract:
A method is provided for recovering from an uncorrected memory error located at a memory address as identified by a memory device. A stored hash value for a memory page corresponding to the identified memory address is used to determine the correct data. Because the memory device specifies the location of the corrupted data, and the size of the window where the corruption occurred, the stored hash can be used to verify memory page reconstruction. With the known good part of the data in hand, the hashes of the pages using possible values in place of the corrupted data are calculated. It is expected that there will be a match between the previously stored hash and one of the computed hashes. As long as there is one and only one match, then that value, used in the place of the corrupted data, is the correct value. The corrupt data, once replaced, allows operation of the memory device to continue without needing to interrupt or otherwise affect a system's operation.
Abstract:
A virtual-machine-based system that identifies an application or process in a virtual machine in order to locate resources associated with the identified application. Access to the located resources is then controlled based on a context of the identified application. Those applications without the necessary context will have a different view of the resource.
Abstract:
To generate a checkpoint for a virtual machine (VM), first, while the VM is still running, a copy-on-write (COW) disk file is created pointing to a parent disk file that the VM is using. Next, the VM is stopped, the VM's memory is marked COW, the device state of the VM is saved to memory, the VM is switched to use the COW disk file, and the VM begins running again for substantially the remainder of the checkpoint generation. Next, the device state that was stored in memory and the unmodified VM memory pages are saved to a checkpoint file. Also, a copy may be made of the parent disk file for retention as part of the checkpoint, or the original parent disk file may be retained as part of the checkpoint. If a copy of the parent disk file was made, then the COW disk file may be committed to the original parent disk file.
Abstract:
A shared input/output (IO) resource is managed in a decentralized manner. Each of multiple hosts having IO access to the shared resource, computes an average latency value that is normalized with respect to average IO request sizes, and stores the computed normalized latency value for later use. The normalized latency values thus computed and stored may be used for a variety of different applications, including enforcing a quality of service (QoS) policy that is applied to the hosts, detecting a condition known as an anomaly where a host that is not bound by a QoS policy accesses the shared resource at a rate that impacts the level of service received by the plurality of hosts that are bound by the QoS policy, and migrating workloads between storage arrays to achieve load balancing across the storage arrays.
Abstract:
A method and tangible medium embodying code for allocating resource units of an allocatable resource among a plurality of clients in a computer is described. In the method, resource units are initially distributed among the clients by assigning to each of the clients a nominal share of the allocatable resource. For each client, a current allocation of resource units is determined. A metric is evaluated for each client, the metric being a function both of the nominal share and a usage-based factor, the usage-based factor being a function of a measure of resource units that the client is actively using and a measure of resource units that the client is not actively using. A resource unit can be reclaimed from a client when the metric for that client meets a predetermined criterion.
Abstract:
A virtual-machine-based system that identifies an application or process in a virtual machine in order to locate resources associated with the identified application. Access to the located resources is then controlled based on a context of the identified application. Those applications without the necessary context will have a different view of the resource.