Abstract:
Instances of the same application execute on different respective hosts in a cloud computing environment. Instances of a monitor application are distributed to execute concurrently with each application instance on a host in the cloud environment, which provides user access to the application instances. The monitor application may be generated from a specification, which may define properties of the application/cloud to monitor and rules based on the properties. Each rule may have one or more conditions. Each monitor instance running on a host monitors execution of the corresponding application instance on that host by obtaining from the host information regarding values of properties on the host for that application instance. Each monitor instance may evaluate the local host information, or aggregate information collected from hosts running other instances of the monitor application, to repeatedly determine whether a rule condition has been violated. On violation, a user-specified handler is triggered.
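The rule evaluation described above can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the `Rule` class, the property name `cpu`, the 90.0 threshold, and the choice of a mean as the aggregate are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Properties = Dict[str, float]  # property name -> observed value (assumed shape)


@dataclass
class Rule:
    name: str
    # Hypothetical signature: returns True while the rule holds.
    condition: Callable[[Properties], bool]
    # User-specified handler, called with the rule name and the observed values.
    handler: Callable[[str, Properties], None]


def evaluate_rules(rules: List[Rule], properties: Properties) -> List[str]:
    """Check each rule once; on violation call its handler and record its name."""
    violated = []
    for rule in rules:
        if not rule.condition(properties):
            rule.handler(rule.name, properties)
            violated.append(rule.name)
    return violated


def aggregate_mean(per_host: List[Properties], key: str) -> float:
    """Mean of one property across the hosts that reported it."""
    return sum(p[key] for p in per_host) / len(per_host)


alerts: List[str] = []
rules = [Rule("cpu_below_90", lambda p: p["cpu"] < 90.0,
              lambda name, p: alerts.append(name))]

# Local check on one host, then the same rule over an aggregate of two hosts.
local_violations = evaluate_rules(rules, {"cpu": 95.0})
mean_cpu = aggregate_mean([{"cpu": 95.0}, {"cpu": 45.0}], "cpu")
aggregate_violations = evaluate_rules(rules, {"cpu": mean_cpu})
```

In this sketch a monitor instance would call `evaluate_rules` repeatedly, once per sampling interval, with either local or aggregated property values.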
Abstract:
This patent application relates to an agile network architecture that can be employed in data centers, among others. One implementation provides a virtual layer-2 network connecting machines of a layer-3 infrastructure.
Abstract:
An elastic scaling cloud-hosted batch application system and method that performs automated elastic scaling of the number of compute instances used to process batch applications in a cloud computing environment. The system and method use automated elastic scaling to minimize job completion time and monetary cost of resources. Embodiments of the system and method use a workload-driven approach to estimate the work volume to be performed, based on task arrivals and job execution times. Given the work volume estimate, an adaptive controller dynamically adapts the number of compute instances to minimize the cost and completion time. Embodiments of the system and method also mitigate startup delays by computing the work volume expected in the near future and gradually starting up additional compute instances before they are needed. Embodiments of the system and method also ensure fairness among batch applications and concurrently executing jobs.
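A minimal sketch of the workload-driven estimate and the gradual scale-up might look like the following. The formulas, the function names, and the parameters `horizon_seconds` and `max_step` are assumptions chosen for illustration; the abstract does not specify how the work volume or the controller is actually computed.

```python
import math


def estimate_work_volume(pending_tasks, mean_task_seconds,
                         arrival_rate, horizon_seconds):
    """Compute-seconds of work: queued tasks plus tasks expected within the horizon."""
    expected_arrivals = arrival_rate * horizon_seconds
    return (pending_tasks + expected_arrivals) * mean_task_seconds


def target_instances(work_volume, horizon_seconds, current,
                     max_step=2, min_instances=1):
    """Instances needed to finish the work within the horizon.

    Scale-up is capped at max_step per call so that instances start gradually,
    ahead of need; scale-down is applied immediately (an assumption).
    """
    needed = max(min_instances, math.ceil(work_volume / horizon_seconds))
    if needed > current:
        return min(needed, current + max_step)
    return needed


# 100 queued tasks of about 30 s each, 0.5 arrivals per second, 10-minute horizon.
volume = estimate_work_volume(100, 30.0, 0.5, 600.0)
next_count = target_instances(volume, 600.0, current=3)
idle_count = target_instances(0.0, 600.0, current=5)
```

Calling `target_instances` once per control interval moves the instance count toward the estimate in steps, which is one simple way to approximate starting instances before they are needed.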
Abstract:
A plurality of requests for execution of computing jobs on one or more devices that include a plurality of computing resources may be obtained, the one or more devices being configured to flexibly allocate the plurality of computing resources. Each of the computing jobs includes job completion values representing a worth, to a respective user, that is associated with execution completion times of the respective computing job. The computing resources may be scheduled based on the job completion values associated with each respective computing job.
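One way to picture value-based scheduling is a greedy heuristic on a single resource, sketched below. The heuristic (pick the job with the highest value per unit of run time if it ran next), the `Job` class, and the deadline-style value functions are assumptions for illustration; the abstract does not state which scheduling algorithm is used.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Job:
    name: str
    duration: float
    # Hypothetical job completion value: worth to the user if finished at time t.
    value_at: Callable[[float], float]


def deadline_value(value: float, deadline: float) -> Callable[[float], float]:
    """Full value if completed by the deadline, nothing afterwards."""
    return lambda t: value if t <= deadline else 0.0


def greedy_schedule(jobs: List[Job]) -> Tuple[List[str], float]:
    """Run jobs one at a time, always choosing the best value per unit of run time."""
    remaining = list(jobs)
    now, total, order = 0.0, 0.0, []
    while remaining:
        best = max(remaining,
                   key=lambda j: j.value_at(now + j.duration) / j.duration)
        now += best.duration
        total += best.value_at(now)
        order.append(best.name)
        remaining.remove(best)
    return order, total


jobs = [
    Job("a", 4.0, deadline_value(10.0, 10.0)),
    Job("b", 2.0, deadline_value(8.0, 2.0)),
    Job("c", 3.0, deadline_value(3.0, 20.0)),
]
order, total = greedy_schedule(jobs)
```

Here job `b` runs first because its value is lost after time 2, even though job `a` is worth more in absolute terms.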
Abstract:
A system for managing allocation of resources based on service level agreements between application owners and cloud operators. Under some service level agreements, the cloud operator may be responsible for managing allocation of resources to the software application and may manage the allocation such that the software application executes within an agreed performance level. Operating a cloud computing platform according to such a service level agreement may relieve application owners of the complexities of managing allocation of resources and may provide greater flexibility to cloud operators in managing their cloud computing platforms.
Abstract:
This document describes techniques for dynamically placing computing jobs. These techniques enable reduced financial and/or energy costs to perform computing jobs at data centers.
Abstract:
Techniques are disclosed for allocation of resources in a distributed computing system. For example, a method for allocating a set of one or more components of an application to a set of one or more resource groups includes the following steps performed by a computer system. The set of one or more resource groups is ordered based on respective failure measures and resource capacities associated with the one or more resource groups. An importance value is assigned to each of the one or more components, wherein the importance value is associated with an effect of the component on an output of the application. The one or more components are assigned to the one or more resource groups based on the importance value of each component and the respective failure measures and resource capacities associated with the one or more resource groups, wherein components with higher importance values are assigned to resource groups with lower failure measures and higher resource capacities. The application may be a partial fault tolerant (PFT) application that comprises a set of one or more PFT application components. The set of one or more resource groups may comprise a heterogeneous set of resource groups (or clusters).
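The ordering and assignment steps above can be sketched as follows. The dictionary fields (`failure`, `capacity`, `importance`, `demand`), the tie-breaking order, and the first-fit placement are assumptions for illustration; in particular, a per-component `demand` is not mentioned in the abstract.

```python
def order_groups(groups):
    """Most reliable groups first; ties broken by larger capacity."""
    return sorted(groups, key=lambda g: (g["failure"], -g["capacity"]))


def assign_components(components, groups):
    """Place components in descending importance on the first group with room.

    Returns a mapping of component name to group name, or None if nothing fits.
    """
    ordered = order_groups(groups)
    free = {g["name"]: g["capacity"] for g in ordered}
    placement = {}
    for comp in sorted(components, key=lambda c: -c["importance"]):
        placement[comp["name"]] = None
        for group in ordered:
            if free[group["name"]] >= comp["demand"]:
                free[group["name"]] -= comp["demand"]
                placement[comp["name"]] = group["name"]
                break
    return placement


groups = [
    {"name": "g2", "failure": 0.05, "capacity": 4},
    {"name": "g1", "failure": 0.01, "capacity": 2},
]
components = [
    {"name": "y", "importance": 0.5, "demand": 1},
    {"name": "x", "importance": 0.9, "demand": 2},
    {"name": "z", "importance": 0.2, "demand": 1},
]
placement = assign_components(components, groups)
```

The most important component `x` fills the most reliable group `g1`, so the less important `y` and `z` fall through to `g2`.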
Abstract:
The described implementations relate to energy-aware server management. One implementation involves an adaptive control unit configured to manage energy usage in a server farm by transitioning individual servers between active and inactive states while maintaining response times for the server farm at a predefined level.
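As a rough illustration of such a control unit, the sketch below adjusts the number of active servers by one per control step. The threshold rule, the `slack` factor, and the function name are assumptions; the abstract does not describe the actual control law.

```python
def adjust_active_servers(active, total, response_ms, target_ms, slack=0.7):
    """Return the number of servers to keep active after one control step.

    Activate one more server when the response time exceeds the target;
    deactivate one when it is comfortably below it (less than slack * target).
    """
    if response_ms > target_ms and active < total:
        return active + 1
    if response_ms < slack * target_ms and active > 1:
        return active - 1
    return active


overloaded = adjust_active_servers(4, 10, response_ms=250.0, target_ms=200.0)
underused = adjust_active_servers(4, 10, response_ms=100.0, target_ms=200.0)
steady = adjust_active_servers(4, 10, response_ms=180.0, target_ms=200.0)
```

The gap between the two thresholds keeps the controller from toggling a server on and off when the response time hovers near the target.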
Abstract:
A method for adaptively allocating resources to a plurality of jobs. The method comprises selecting a first policy from a plurality of policies for a first job in the plurality of jobs by using a policy selection mechanism, allocating at least one resource to the first job in accordance with the first policy, and in response to completion of the first job, updating the policy selection mechanism to obtain an updated policy selection mechanism by using at least one processor. Updating the policy selection mechanism comprises evaluating the performance of the first policy with respect to the first job by calculating a value of a metric of utility for the first policy based on conditions associated with execution of the first job and updating the policy selection mechanism based on the calculated value and a delay of execution of the first job.
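The select, allocate, and update loop can be sketched with a simple score table. The `PolicySelector` class, the policy names, the exponential-average update, and the linear delay penalty are all assumptions for illustration, and the sketch always picks the best-scoring policy with no exploration step.

```python
class PolicySelector:
    """Hypothetical policy selection mechanism: one running score per policy."""

    def __init__(self, policies, learning_rate=0.5, delay_penalty=0.25):
        self.scores = {p: 0.0 for p in policies}
        self.learning_rate = learning_rate
        self.delay_penalty = delay_penalty

    def select(self):
        """Pick the policy with the highest score (first one wins ties)."""
        return max(self.scores, key=self.scores.get)

    def update(self, policy, utility, delay):
        """Move the policy's score toward the utility, reduced by the job's delay."""
        reward = utility - self.delay_penalty * delay
        old = self.scores[policy]
        self.scores[policy] = old + self.learning_rate * (reward - old)


selector = PolicySelector(["fifo", "fair_share"])
first = selector.select()
# The first job completes with utility 1.0 but was delayed by 8 time units.
selector.update(first, utility=1.0, delay=8.0)
second = selector.select()
```

After the delayed first job lowers the score of `fifo`, the updated mechanism selects `fair_share` for the next job.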