摘要:
In various embodiments, systems and methods are presented for providing resources by way of a platform as a service in a distributed computing environment to perform a job. The system may be comprised of a number of components, such as a task machine, a task location service machine, and a high-level location service machines that in combination are useable to accomplish functions provided herein. It is contemplated that the system performs methods for providing resources by determining resources of the system, such as virtual machines, and applying auto-scaling rules to the system to scale those resources. Based on the determination of the auto-scaling rules, the resources may be allocated to achieve a desired result.
摘要:
Systems and methods are presented for providing resources by way of a platform as a service in a distributed computing environment to perform a job. Resources of the system, job performing on the system, and schedulers of the jobs performing on the system are decoupled in a manner that allows a job to easily migrate among resources. It is contemplated that the migration of jobs from a first pool of resource to a second pool of resource is performed by the system without human intervention. The migration of a job may utilize different schedulers for the different resources. Further, it is contemplated that a pool of resources may automatically allocate additional or fewer resources in response to a migration of a job.
摘要:
Systems and methods are provided for assigning and associating resources in a cloud computing environment. Virtual machines in the cloud computing environment can be assigned or associated with pools corresponding to users as dedicated, standby, or preemptible machines. The various states provide users with the ability to reserve a desired level of resources while also allowing the operator of the cloud computing environment to increase resource utilization.
摘要:
Systems and methods are provided that enable a general framework for partitioning application-defined jobs in a scalable environment. The general framework decouples partitioning of a job from the other aspects of the job. As a result, the effort required to define the application-defined job is reduced or minimized, as the user is not required to provide a partitioning algorithm. The general framework also facilitates management of masters and servers performing computations within the distributed environment.
摘要:
Systems, methods, and computer storage media for upgrading a domain in a distributed computing environment are provided. Upgrading of the domain includes preparing for the upgrade, upgrading, and finalizing the upgrade. The preparation of the domain includes ensuring predefined quantities of role instances are available in domains other than the upgrade domain. The preparation also includes ensuring that a predefined number of extent replicas are available in domains other than the upgrade domain. The preparation may also include checkpointing partitions within the upgrade domain to facilitate faster loading once transferred to a domain other than the upgrade domain. The finalization may include allowing nodes within the upgrade domain to resume functionality that was suspended during the upgrade.
摘要:
Partition management for a scalable, structured storage system is provided. The storage system provides storage represented by one or more tables, each of which includes rows that represent data entities. A table is partitioned into a number of partitions, each partition including a contiguous range of rows. The partitions are served by table servers and managed by a table master. Load distribution information for the table servers and partitions is tracked, and the table master determines to split and/or merge partitions based on the load distribution information.
摘要:
Systems and methods are provided that enable a general framework for partitioning application-defined computations (e.g., jobs) in a scalable environment. The general framework decouples partitioning of a computation from the other aspects of the computation. As a result, the effort required to define an application-defined job is reduced or minimized, as the user is not required to provide a partitioning algorithm. A user can optionally take advantage of a partitioning framework by providing application-defined interfaces to perform the desired job. Optionally, a user can provide additional information to allow for modification of how partitions are assigned.
摘要:
Systems, methods, and computer storage media for upgrading a domain in a distributed computing environment are provided. Upgrading of the domain includes preparing for the upgrade, upgrading, and finalizing the upgrade. The preparation of the domain includes ensuring predefined quantities of role instances are available in domains other than the upgrade domain. The preparation also includes ensuring that a predefined number of extent replicas are available in domains other than the upgrade domain. The preparation may also include checkpointing partitions within the upgrade domain to facilitate faster loading once transferred to a domain other than the upgrade domain. The finalization may include allowing nodes within the upgrade domain to resume functionality that was suspended during the upgrade.
摘要:
Systems and methods are provided that enable a general framework for partitioning application-defined jobs (e.g., computation) in a scalable environment. The general framework decouples partitioning of a job from the other aspects of the job. As a result, the effort required to define the application-defined computation in a scalable environment is reduced or minimized, as the user is not required to provide a partitioning algorithm. The general framework further allows a user to provide load balancing conditions to allow for modification of how partitions are assigned.
摘要:
Embodiments of the present invention relate to invoking and managing load-balancing operation(s) applied to partitions within a distributed computing environment, where each partition represents a key range of data for a storage account. The partitions affected by the load-balancing operation(s) are source partitions hosted on a primary storage stamp and/or destination partitions hosted on a secondary storage stamp, where the primary and secondary storage stamps are located in geographically distinct areas and are equipped to replicate the storage account's data therebetween. The load-balancing operation(s) include splitting partitions into child partitions upon detecting an increased workload as a result of active replication, merging partitions to form parent partitions upon detecting a reduction in workload as a result of decreased processing-related resource consumption, or offloading partitions based on resource consumption. A service within a partition layer of the storage stamps is responsible for determining when to invoke these load-balancing operation(s).