Abstract:
Partitioning query execution work of a sequence including a plurality of elements. A method includes a worker core requesting work from a work queue. In response, the worker core receives a task from the work queue. The task is a replicable sequence-processing task comprising two distinct steps: scheduling a copy of the task on the work queue and processing a sequence. The worker core processes the task by creating a replica of the task, placing the replica on the work queue, and beginning to process the sequence. These acts are repeated by one or more additional worker cores, each of which receives a replica that an earlier worker core placed on the work queue.
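A minimal sketch of the two-step replicable task in Java, assuming a shared blocking work queue and a shared atomic cursor into the sequence; the names (ReplicableTaskDemo, replicableTask, workQueue) are illustrative, not taken from the abstract. Each task first places a replica of itself on the work queue so the next idle worker core can pick it up, then begins processing elements:

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class ReplicableTaskDemo {
    static final int[] SEQUENCE = new int[1_000];
    static final AtomicInteger cursor = new AtomicInteger();          // next unprocessed element
    static final BlockingQueue<Runnable> workQueue = new LinkedBlockingQueue<>();

    // The replicable task: step 1, schedule a copy of itself on the work
    // queue; step 2, process elements of the sequence.
    static Runnable replicableTask() {
        return () -> {
            workQueue.offer(replicableTask());                        // step 1: replicate
            int i;
            while ((i = cursor.getAndIncrement()) < SEQUENCE.length) {
                SEQUENCE[i] = i * i;                                  // step 2: process one element
            }
        };
    }

    public static void main(String[] args) throws InterruptedException {
        workQueue.offer(replicableTask());                            // seed with the original task
        int workers = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int w = 0; w < workers; w++) {
            pool.submit(() -> {                                       // each worker core requests work
                Runnable task = workQueue.poll();
                if (task != null) task.run();                         // may be a replica from another core
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

Because every replica re-replicates before processing, work spreads to as many cores as poll the queue while elements remain, without any core knowing in advance how many helpers will join.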
Abstract:
A concurrent grouping operation for execution on a multiple core processor is provided. The grouping operation is provided with a sequence or set of elements. In one phase, each worker receives a partition of the sequence of elements to be grouped. The elements of each partition are arranged into a data structure containing one or more keys, where each key corresponds to a value list of the received elements associated with that key. In another phase, the data structures created by the workers are merged so that the keys and corresponding elements for the entire sequence exist in one data structure. The recursive merging can be completed in a constant time that is not proportional to the length of the sequence.
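A minimal two-phase sketch in Java, assuming string elements grouped by their first character; the partitioning scheme and all names are illustrative. Phase one builds a per-worker map over each partition; phase two merges the per-worker maps, at a cost that depends on the number of distinct keys and workers rather than on the total number of elements:

```java
import java.util.*;
import java.util.concurrent.*;

public class ConcurrentGroupByDemo {
    public static void main(String[] args) throws Exception {
        List<String> elements = List.of("apple", "avocado", "banana", "blueberry", "cherry");
        int workers = 2;
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        List<Future<Map<Character, List<String>>>> partials = new ArrayList<>();

        // Phase 1: each worker groups its own partition into a local map.
        int chunk = (elements.size() + workers - 1) / workers;
        for (int w = 0; w < workers; w++) {
            List<String> partition =
                elements.subList(w * chunk, Math.min((w + 1) * chunk, elements.size()));
            partials.add(pool.submit(() -> {
                Map<Character, List<String>> local = new HashMap<>();
                for (String e : partition)
                    local.computeIfAbsent(e.charAt(0), k -> new ArrayList<>()).add(e);
                return local;
            }));
        }

        // Phase 2: merge the per-worker maps into one data structure.
        Map<Character, List<String>> merged = new HashMap<>();
        for (Future<Map<Character, List<String>>> f : partials)
            f.get().forEach((k, v) -> merged.merge(k, v, (a, b) -> { a.addAll(b); return a; }));

        pool.shutdown();
        System.out.println(merged); // e.g. {a=[apple, avocado], b=[banana, blueberry], c=[cherry]}
    }
}
```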
Abstract:
A method of translating a comprehension into executable code for execution on a SIMD (Single Instruction, Multiple Data) execution unit includes receiving a user-specified comprehension. The comprehension is compiled into a first set of executable code. An intermediate representation is generated based on the first set of executable code. The intermediate representation is translated into a second set of executable code that is configured to be executed by a SIMD execution unit.
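A loose sketch of the staged pipeline in Java, with a simple element-wise map standing in for the comprehension and a list of per-element operations standing in for the intermediate representation; a real backend would translate that representation into vector instructions, where the scalar lane loop below only marks the spot. All names are illustrative:

```java
import java.util.List;
import java.util.function.IntUnaryOperator;

public class ComprehensionToSimdDemo {
    // First compilation: the user-specified comprehension as executable code.
    record Comprehension(IntUnaryOperator body) {}

    // Intermediate representation generated from the first set of code:
    // here, a flat sequence of element-wise operations.
    record Ir(List<IntUnaryOperator> ops) {
        static Ir from(Comprehension c) { return new Ir(List.of(c.body())); }
    }

    // Second translation: apply the IR lane by lane. A SIMD execution unit
    // would process LANES elements with a single instruction; this scalar
    // loop is only a stand-in for the emitted vector code.
    static void execute(Ir ir, int[] data) {
        final int LANES = 4; // stand-in for the hardware vector width
        for (int base = 0; base < data.length; base += LANES)
            for (int lane = base; lane < Math.min(base + LANES, data.length); lane++)
                for (IntUnaryOperator op : ir.ops())
                    data[lane] = op.applyAsInt(data[lane]);
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6};
        Comprehension square = new Comprehension(x -> x * x); // e.g. [x*x for x in data]
        execute(Ir.from(square), data);
        System.out.println(java.util.Arrays.toString(data)); // [1, 4, 9, 16, 25, 36]
    }
}
```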
Abstract:
A method includes compiling an expression into executable code that is configured to create a data structure that represents the expression. The expression includes a plurality of sub-expressions. The code is executed to create the data structure. The data structure is evaluated using a plurality of concurrent threads, thereby processing the expression in a parallel manner.
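A minimal sketch in Java of evaluating such a data structure with concurrent threads, assuming binary arithmetic nodes and fork/join scheduling; the tree here stands in for the data structure the compiled code would create, and all names are illustrative. Independent sub-expressions are evaluated in parallel:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ParallelExprDemo {
    sealed interface Expr permits Num, Add, Mul {}
    record Num(long value) implements Expr {}
    record Add(Expr left, Expr right) implements Expr {}
    record Mul(Expr left, Expr right) implements Expr {}

    // Each sub-expression becomes a task; sibling subtrees run concurrently.
    static final class EvalTask extends RecursiveTask<Long> {
        private final Expr expr;
        EvalTask(Expr expr) { this.expr = expr; }
        @Override protected Long compute() {
            return switch (expr) {
                case Num n -> n.value();
                case Add a -> {
                    EvalTask left = new EvalTask(a.left());
                    left.fork();                               // evaluate left subtree in parallel
                    yield new EvalTask(a.right()).compute() + left.join();
                }
                case Mul m -> {
                    EvalTask left = new EvalTask(m.left());
                    left.fork();
                    yield new EvalTask(m.right()).compute() * left.join();
                }
            };
        }
    }

    public static void main(String[] args) {
        Expr tree = new Add(new Mul(new Num(2), new Num(3)), new Num(4)); // (2*3)+4
        System.out.println(ForkJoinPool.commonPool().invoke(new EvalTask(tree))); // 10
    }
}
```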
Abstract:
Dynamically allocated thread storage in a computing device is disclosed. The dynamically allocated thread storage is configured to work with a process including two or more threads. Each thread includes a statically allocated thread-local slot configured to store a table. Each table is configured to include a table slot corresponding to a dynamically allocated thread-local value, and a dynamically allocated thread-local instance corresponds to that table slot.
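A minimal sketch in Java, assuming one statically allocated thread-local slot per thread (holding a growable table) and dynamically created instances that each own one slot index into that table; all names are illustrative:

```java
import java.util.ArrayList;
import java.util.concurrent.atomic.AtomicInteger;

public class DynamicThreadLocalDemo {
    // The single statically allocated thread-local slot: it stores the table.
    private static final ThreadLocal<ArrayList<Object>> TABLE =
        ThreadLocal.withInitial(ArrayList::new);
    private static final AtomicInteger NEXT_SLOT = new AtomicInteger();

    // A dynamically allocated thread-local instance, identified by its table slot.
    static final class Slot<T> {
        private final int index = NEXT_SLOT.getAndIncrement();
        @SuppressWarnings("unchecked")
        T get() {
            ArrayList<Object> table = TABLE.get();
            return index < table.size() ? (T) table.get(index) : null;
        }
        void set(T value) {
            ArrayList<Object> table = TABLE.get();
            while (table.size() <= index) table.add(null);   // grow this thread's table
            table.set(index, value);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Slot<String> name = new Slot<>();                    // created at run time, not compile time
        Slot<Integer> count = new Slot<>();
        Thread t = new Thread(() -> {
            name.set("worker"); count.set(1);
            System.out.println(name.get() + " " + count.get()); // worker 1
        });
        t.start(); t.join();
        System.out.println(name.get());                      // null: main thread's table is separate
    }
}
```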
Abstract:
A method includes producing values with a producer thread, and providing a queue data structure including a first array of storage locations for storing the values. The first array has a first tail pointer and a first linking pointer. If the number of values stored in the first array is less than the capacity of the first array, an enqueue operation writes a new value at the storage location pointed to by the first tail pointer and advances the first tail pointer. If the number of values stored in the first array is equal to the capacity of the first array, a second array of storage locations, with a second tail pointer, is allocated in the queue, and the first array is linked to the second array with the first linking pointer. The enqueue operation then writes the new value at the storage location pointed to by the second tail pointer and advances the second tail pointer.
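A minimal sketch of the enqueue path in Java, assuming a single producer and omitting dequeues and concurrency control; all names are illustrative:

```java
public class SegmentedQueueDemo {
    static final class Segment {
        final long[] values;
        int tail;            // tail pointer: next free storage location
        Segment next;        // linking pointer: set when this array fills
        Segment(int capacity) { values = new long[capacity]; }
    }

    private Segment head;    // oldest segment (where dequeues would start)
    private Segment current; // newest segment (where enqueues go)
    private final int capacity;

    SegmentedQueueDemo(int capacity) {
        this.capacity = capacity;
        head = current = new Segment(capacity);
    }

    void enqueue(long value) {
        if (current.tail == capacity) {             // current array is full:
            Segment second = new Segment(capacity); // allocate a new array in the queue
            current.next = second;                  // link it via the linking pointer
            current = second;
        }
        current.values[current.tail++] = value;     // write at the tail, then advance it
    }

    public static void main(String[] args) {
        SegmentedQueueDemo q = new SegmentedQueueDemo(2);
        for (long v = 1; v <= 5; v++) q.enqueue(v); // forces two extra segments
        for (Segment s = q.head; s != null; s = s.next)
            System.out.println(java.util.Arrays.toString(
                java.util.Arrays.copyOf(s.values, s.tail))); // [1, 2] / [3, 4] / [5]
    }
}
```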
Abstract:
Dynamic data partitioning is disclosed for use with a multiple node processing system that consumes items from a data stream of any length, even when that length is undeclared. When a thread is idle, dynamic data partitioning takes a chunk of items from the data stream and assigns it to that thread, varying the size of the chunks to distribute the work load efficiently among the nodes. In one example, chunks taken from the beginning of the data stream are smaller than chunks taken toward the middle or end of the stream. Dynamic data partitioning employs a growth function in which chunk sizes are related to single aligned cache lines, and it occasionally doubles the amount of data assigned to concurrent threads.
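A minimal sketch of the chunk-growth policy in Java, assuming a shared atomic cursor over the stream; the cache-line-sized starting chunk and the doubling interval are illustrative parameters, not values from the abstract:

```java
import java.util.concurrent.atomic.AtomicLong;

public class DynamicPartitionerDemo {
    private final AtomicLong cursor = new AtomicLong();       // next unclaimed stream index

    // Each idle thread calls this to claim its next chunk [start, start+size).
    long[] nextChunk(int chunksTakenSoFar) {
        int size = chunkSize(chunksTakenSoFar);
        long start = cursor.getAndAdd(size);
        return new long[] { start, start + size };
    }

    // Start small so early chunks spread work quickly, then double the size
    // every few grabs so later chunks amortize synchronization costs.
    static int chunkSize(int chunksTakenSoFar) {
        int ELEMENTS_PER_CACHE_LINE = 8;                      // e.g. 64-byte line / 8-byte element
        int DOUBLE_EVERY = 3;                                 // illustrative growth interval
        int doublings = Math.min(chunksTakenSoFar / DOUBLE_EVERY, 10);
        return ELEMENTS_PER_CACHE_LINE << doublings;
    }

    public static void main(String[] args) {
        DynamicPartitionerDemo p = new DynamicPartitionerDemo();
        for (int i = 0; i < 8; i++) {
            long[] chunk = p.nextChunk(i);
            System.out.println("chunk " + i + ": [" + chunk[0] + ", " + chunk[1] + ")");
        }
        // sizes: 8, 8, 8, 16, 16, 16, 32, 32 — small at the start, larger later
    }
}
```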
Abstract:
The present invention extends to methods, systems, and computer program products for propagating unhandled exceptions in distributed execution environments, such as clusters. A job (e.g., a query) can include a series of computation steps that are executed on multiple compute nodes each processing parts of a distributed data set. Unhandled exceptions can be caught while computations are running on data partitions of different compute nodes. Unhandled exception objects can be stored in a serialized format in a compute node's local storage (or an alternate central location) along with auxiliary details such as the data partition being processed at the time. Stored serialized exception objects for a job can be harvested and aggregated in a single container object. The single container object can be passed back to the client.
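A minimal sketch in Java of harvesting per-partition exceptions into a single container, with a thread pool standing in for the compute nodes and serialization to local storage elided; all names are illustrative:

```java
import java.util.*;
import java.util.concurrent.*;

public class ExceptionAggregationDemo {
    // The single container object passed back to the client.
    static final class AggregateJobException extends Exception {
        final List<Throwable> inner;
        AggregateJobException(List<Throwable> inner) {
            super(inner.size() + " partition(s) failed");
            this.inner = List.copyOf(inner);
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService cluster = Executors.newFixedThreadPool(4);   // stand-in for compute nodes
        List<Throwable> harvested = Collections.synchronizedList(new ArrayList<>());
        List<Future<?>> steps = new ArrayList<>();

        for (int partition = 0; partition < 4; partition++) {
            final int p = partition;
            steps.add(cluster.submit(() -> {
                try {
                    // The computation step on this data partition.
                    if (p % 2 == 1) throw new IllegalStateException("bad record in partition " + p);
                } catch (RuntimeException e) {
                    // Auxiliary detail (the partition id) travels with the exception.
                    harvested.add(new RuntimeException("partition " + p, e));
                }
            }));
        }
        for (Future<?> f : steps) f.get();                           // wait for all computation steps
        cluster.shutdown();

        if (!harvested.isEmpty()) {
            AggregateJobException agg = new AggregateJobException(harvested);
            System.out.println(agg.getMessage());                    // 2 partition(s) failed
            agg.inner.forEach(t -> System.out.println("  " + t.getMessage()));
        }
    }
}
```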
Abstract:
A query that identifies an input data source is received. The input data source is partitioned into a plurality of partitions. Each of the partitions includes a set of data elements with an associated set of indices for indicating an ordering of the data elements. A query type for a query operator in the received query is identified. It is determined whether a reordering of data elements will be performed based on the identified query type. The data elements in at least one of the partitions are reordered when it is determined based on the identified query type that reordering will be performed.
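A minimal sketch in Java, assuming each partition carries (index, element) pairs and a two-valued query-type classification; the partitioning scheme and all names are illustrative:

```java
import java.util.*;

public class OrderedPartitionDemo {
    record Indexed<T>(int index, T element) {}
    enum QueryType { ORDER_SENSITIVE, ORDER_INSENSITIVE }

    // Round-robin partitioning: the indices record the original ordering.
    static <T> List<List<Indexed<T>>> partition(List<T> source, int n) {
        List<List<Indexed<T>>> parts = new ArrayList<>();
        for (int p = 0; p < n; p++) parts.add(new ArrayList<>());
        for (int i = 0; i < source.size(); i++)
            parts.get(i % n).add(new Indexed<>(i, source.get(i)));
        return parts;
    }

    // Reorder a partition only when the identified query type requires it.
    static <T> void maybeReorder(List<Indexed<T>> partition, QueryType type) {
        if (type == QueryType.ORDER_SENSITIVE)
            partition.sort(Comparator.comparingInt(Indexed::index));
    }

    public static void main(String[] args) {
        List<List<Indexed<String>>> parts = partition(List.of("a", "b", "c", "d", "e"), 2);
        Collections.shuffle(parts.get(0));             // processing may scramble a partition
        maybeReorder(parts.get(0), QueryType.ORDER_SENSITIVE);
        System.out.println(parts.get(0));              // indices back in ascending order
    }
}
```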
Abstract:
A method of analyzing a data parallel query includes receiving a user-specified data parallel query that includes a plurality of query operators. An operator type for each of the query operators is identified based on a type of parallel input data structure the operator operates on and a type of parallel output data structure the operator outputs. It is determined whether the query is a prohibited query based on the identified operator types.
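A minimal sketch in Java of classifying operators by the parallel data structures they consume and produce, and rejecting one prohibited combination; the structure names, operator names, and the specific prohibition rule are illustrative assumptions, not taken from the abstract:

```java
import java.util.List;

public class QueryAnalysisDemo {
    enum Structure { UNORDERED_PARTITIONS, ORDERED_PARTITIONS }
    record Operator(String name, Structure consumes, Structure produces) {}

    // A query is prohibited here when an operator consumes ordered partitions
    // but its upstream neighbor only produces unordered ones.
    static boolean isProhibited(List<Operator> query) {
        for (int i = 1; i < query.size(); i++)
            if (query.get(i).consumes() == Structure.ORDERED_PARTITIONS
                    && query.get(i - 1).produces() == Structure.UNORDERED_PARTITIONS)
                return true;
        return false;
    }

    public static void main(String[] args) {
        List<Operator> ok = List.of(
            new Operator("where", Structure.UNORDERED_PARTITIONS, Structure.UNORDERED_PARTITIONS),
            new Operator("select", Structure.UNORDERED_PARTITIONS, Structure.UNORDERED_PARTITIONS));
        List<Operator> bad = List.of(
            new Operator("select", Structure.UNORDERED_PARTITIONS, Structure.UNORDERED_PARTITIONS),
            new Operator("takeWhile", Structure.ORDERED_PARTITIONS, Structure.ORDERED_PARTITIONS));
        System.out.println(isProhibited(ok));   // false
        System.out.println(isProhibited(bad));  // true
    }
}
```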