摘要:
A scheduler receives a job graph which includes a graph of computational vertices that are designed to be executed on multiple distributed computer systems. The scheduler queries a graph manager to determine which computational vertices of the job graph are ready for execution in a local execution environment. The scheduler queries a cluster manager to determine the organizational topology of the distributed computer systems to simulate the determined topology in the local execution environment. The scheduler queries a data manager to determine data storage locations for each of the computational vertices indicated as being ready for execution in the local execution environment. The scheduler also indicates to a vertex spawner that an instance of each computational vertex is to be spawned in the local execution environment based on the organizational topology and indicated data storage locations, and indicates to the local execution environment that the spawned vertices are to be executed.
摘要:
A scheduler receives a job graph which includes a graph of computational vertices that are designed to be executed on multiple distributed computer systems. The scheduler queries a graph manager to determine which computational vertices of the job graph are ready for execution in a local execution environment. The scheduler queries a cluster manager to determine the organizational topology of the distributed computer systems to simulate the determined topology in the local execution environment. The scheduler queries a data manager to determine data storage locations for each of the computational vertices indicated as being ready for execution in the local execution environment. The scheduler also indicates to a vertex spawner that an instance of each computational vertex is to be spawned in the local execution environment based on the organizational topology and indicated data storage locations, and indicates to the local execution environment that the spawned vertices are to be executed.
摘要:
The present invention extends to methods, systems, and computer program products for propagating unhandled exceptions in distributed execution environments, such as clusters. A job (e.g., a query) can include a series of computation steps that are executed on multiple compute nodes each processing parts of a distributed data set. Unhandled exceptions can be caught while computations are running on data partitions of different compute nodes. Unhandled exception objects can be stored in a serialized format in a compute node's local storage (or an alternate central location) along with auxiliary details such as the data partition being processed at the time. Stored serialized exception objects for a job can be harvested and aggregated in a single container object. The single container object can be passed back to the client.
摘要:
Embodiments are directed to implementing custom operators in a query for a parallel query engine and to generating a partitioned representation of a sequence of query operators in a parallel query engine. A computer system receives a portion of partitioned input data at a parallel query engine, where the parallel query engine is configured to process data queries in parallel, and where the queries include a sequence of built-in operators. The computer system incorporates a custom operator into the sequence of built-in operators for a query and accesses the sequence of operators to determine how the partitioned input data is to be processed. The custom operator is accessed in the same manner as the built-in operators. The computer system also processes the sequence of operators including both the built-in operators and at least one custom operator according to the determination indicating how the data is to be processed.
摘要:
The present invention extends to methods, systems, and computer program products for partitioning streaming data. Embodiments of the invention can be used to hash partition a stream of data and thus avoids unnecessary memory usage (e.g., associated with buffering). Hash partitioning can be used to split an input sequence (e.g., a data stream) into multiple partitions that can be processed independently. Other embodiments of the invention can be used to hash repartition a plurality of streams of data. Hash repartitioning converts a set of partitions into another set of partitions with the hash partitioned property. Partitioning and repartitioning can be done in a streaming manner at runtime by exchanging values between worker threads responsible for different partitions.
摘要:
The present invention extends to methods, systems, and computer program products for indicating parallel operations with user-visible events. Event markers can be used to indicate an abstracted outer layer of execution as well as expose internal specifics of parallel processing systems, including systems that provide data parallelism. Event markers can be used to show a variety of execution characteristics including higher-level markers to indicate the beginning and end of an execution program (e.g., a query). Inside the execution program (query) individual fork/join operations can be indicated with sub-levels of markers to expose their operations. Additional decisions made by an execution engine, such as, for example, when elements initially yield, when queries overlap or nest, when the query is cancelled, when the query bails to sequential operation, when premature merging or re-partitioning are needed can also be exposed.
摘要:
Dynamically allocated thread storage in a computing device is disclosed. The dynamically allocated thread storage is configured to work with a process including two or more threads. Each thread includes a statically allocated thread-local slot configured to store a table. Each table is configured to include a table slot corresponding with a dynamically allocated thread-local value. A dynamically allocated thread-local instance corresponds with the table slot.
摘要:
A method includes producing values with a producer thread, and providing a queue data structure including a first array of storage locations for storing the values. The first array has a first tail pointer and a first linking pointer. If a number of values stored in the first array is less than a capacity of the first array, an enqueue operation writes a new value at a storage location pointed to by the first tail pointer and advances the first tail pointer. If the number of values stored in the first array is equal to the capacity of the first array, a second array of storage locations is allocated in the queue. The second array has a second tail pointer. The first array is linked to the second array with the first linking pointer. An enqueue operation writes the new value at a storage location pointed to by the second tail pointer and advances the second tail pointer.
摘要:
Dynamic data partitioning is disclosed for use with a multiple node processing system that consumes items from a data stream of any length and independent of whether the length is undeclared. Dynamic data partitioning takes items from the data stream when a thread is idle and assigns the taken items to an idle thread, and it varies the size of data chunks taken from the stream and assigned to a thread to efficiently distribute work loads among the nodes. In one example, data chunk sizes taken from the beginning of the data stream are relatively smaller than data chunk sizes taken towards the middle or end of the data stream. Dynamic data partitioning employs a growth function where chunks have a size related to single aligned cache lines and efficiently increases the size of the data chunks to occasionally double the amount of data assigned to concurrent threads.
摘要:
A query that identifies an input data source is received. The input data source is partitioned into a plurality of partitions. Each of the partitions includes a set of data elements with an associated set of indices for indicating an ordering of the data elements. A query type for a query operator in the received query is identified. It is determined whether a reordering of data elements will be performed based on the identified query type. The data elements in at least one of the partitions are reordered when it is determined based on the identified query type that reordering will be performed.