摘要:
Techniques are described for managing distributed execution of programs. In some situations, the techniques include determining configuration information to be used for executing a particular program in a distributed manner on multiple computing nodes and/or include providing information and associated controls to a user regarding ongoing distributed execution of one or more programs to enable the user to modify the ongoing distributed execution in various manners. Determined configuration information may include, for example, configuration parameters such as a quantity of computing nodes and/or other measures of computing resources to be used for the executing, and may be determined in various manners, including by interactively gathering values for at least some types of configuration information from an associated user (e.g., via a GUI that is displayed to the user) and/or by automatically determining values for at least some types of configuration information (e.g., for use as recommendations to a user).
摘要:
Techniques are described for managing distributed execution of programs. In some situations, the techniques include dynamically modifying the distributed program execution in various manners, such as based on monitored status information. The dynamic modifying of the distributed program execution may include adding and/or removing computing nodes from a cluster that is executing the program, modifying the amount of computing resources that are available for the distributed program execution, terminating or temporarily suspending execution of the program (e.g., if an insufficient quantity of computing nodes of the cluster are available to perform execution), etc.
摘要:
Techniques are described for managing distributed execution of programs. In some situations, the techniques include dynamically monitoring the ongoing distributed execution of a program on a cluster of multiple computing nodes, and may include automatically determining the status of execution of the program on each of the multiple computing nodes and/or automatically determining the aggregate usage of one or more types of computing resources across the cluster of multiple computing nodes by the distributed program execution. The information obtained from the dynamic monitoring may be used in various manners, including to facilitate dynamically modifying the ongoing distributed program execution in various manners, such as to temporarily throttle usage of computing resources by the distributed program execution (e.g., to remove or reduce one or more bottlenecks).
摘要:
Techniques are described for managing distributed execution of programs. In some situations, the techniques include dynamically modifying the distributed program execution in various manners, such as based on monitored status information. The dynamic modifying of the distributed program execution may include adding and/or removing computing nodes from a cluster that is executing the program, modifying the amount of computing resources that are available for the distributed program execution, terminating or temporarily suspending execution of the program (e.g., if an insufficient quantity of computing nodes of the cluster are available to perform execution), etc.
摘要:
Techniques are described for managing distributed execution of programs. In at least some situations, the techniques include decomposing or otherwise separating the execution of a program into multiple distinct execution jobs that may each be executed on a distinct computing node, such as in a parallel manner with each execution job using a distinct subset of input data for the program. In addition, the techniques may include temporarily terminating and later resuming execution of at least some execution jobs, such as by persistently storing an intermediate state of the partial execution of an execution job, and later retrieving and using the stored intermediate state to resume execution of the execution job from the intermediate state. Furthermore, the techniques may be used in conjunction with a distributed program execution service that executes multiple programs on behalf of multiple customers or other users of the service.
摘要:
Techniques are described for managing distributed execution of programs. In at least some situations, the techniques include decomposing or otherwise separating the execution of a program into multiple distinct execution jobs that may each be executed on a distinct computing node, such as in a parallel manner with each execution job using a distinct subset of input data for the program. In addition, the techniques may include temporarily terminating and later resuming execution of at least some execution jobs, such as by persistently storing an intermediate state of the partial execution of an execution job, and later retrieving and using the stored intermediate state to resume execution of the execution job from the intermediate state. Furthermore, the techniques may be used in conjunction with a distributed program execution service that executes multiple programs on behalf of multiple customers or other users of the service.
摘要:
Techniques are described for managing distributed execution of programs, including by dynamically scaling a cluster of multiple computing nodes used to perform ongoing distributed execution of a program, such as to increase and/or decrease the quantity of computing nodes in the cluster at various times and for various reasons. An architecture may be used that facilitates the dynamic scaling of a cluster, including by having at least some of the computing nodes act as core nodes that each participate in a distributed storage system for the distributed program execution, and having one or more other computing nodes that act as auxiliary nodes that do not participate in the distributed storage system. If computing nodes are selected to be removed from the cluster during ongoing distributed execution of a program, one or more nodes of the auxiliary computing node type may be selected for the removal.
摘要:
Techniques are described for managing distributed execution of programs, including by dynamically scaling a cluster of multiple computing nodes performing ongoing distributed execution of a program, such as to increase and/or decrease computing node quantity. An architecture may be used that has core nodes that each participate in a distributed storage system for the distributed program execution, and that has one or more other auxiliary nodes that do not participate in the distributed storage system. Furthermore, as part of performing the dynamic scaling of a cluster, computing nodes that are only temporarily available may be selected and used, such as computing nodes that might be removed from the cluster during the ongoing program execution to be put to other uses and that may also be available for a different fee (e.g., a lower fee) than other computing nodes that are available throughout the ongoing use of the cluster.
摘要:
Techniques are described for managing execution of programs, such as for distributed execution of a program on multiple computing nodes. In some situations, the techniques include selecting a cluster of computing nodes to use for executing a program based at least in part on data to be used during the program execution. For example, the computing node selection for a particular program may be performed so as to attempt to identify and use computing nodes that already locally store some or all of the input data that will be used by those computing nodes as part of the executing of that program on those nodes. Such techniques may provide benefits in a variety of situations, including when the size of input datasets to be used by a program are large, and the transferring of data to and/or from computing nodes may impose large delays and/or monetary costs.
摘要:
Techniques are described for providing clients with access to functionality for creating, configuring and executing defined workflows that manipulate source data in defined manners, such as under the control of a configurable workflow service that is available to multiple remote clients over one or more public networks. A defined workflow for a client may, for example, include multiple interconnected workflow components that are specified by the client and that each are configured to perform one or more types of data manipulation operations on a specified type of input data. The configurable workflow service may further execute the defined workflow at one or more times and in one or more manners, such as in some situations by provisioning multiple computing nodes provided by the configurable workflow service to each implement at least one of the workflow components for the defined workflow.