Abstract:
A probabilistic data structure is generated for efficient query processing using a histogram for unsorted data in a column of a columnar database. A bucket range size is determined for multiple buckets of a histogram of a column in a columnar database table. In at least some embodiments, the histogram may be a height-balanced histogram. For each data block storing data for the column, a probabilistic data structure is generated to indicate which particular buckets in the histogram contain a data value stored in that data block. When an indication of a query directed to the column for select data is received, the probabilistic data structure for each of the data blocks storing data for the column may be examined to determine particular ones of the data blocks which do not need to be read in order to service the query for the select data.
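The block-skipping idea in this abstract can be illustrated with a minimal sketch. It assumes the per-block probabilistic structure is a plain integer bitmap with one bit per histogram bucket, and that bucket boundaries come from a simple equi-depth split of the column values. The function names (`bucket_bounds`, `bucket_of`, `block_bitmap`, `blocks_to_read`) are hypothetical and do not come from the abstract.

```python
from bisect import bisect_left


def bucket_bounds(values, num_buckets):
    """Upper bounds of a height-balanced (equi-depth) histogram (assumed layout)."""
    ordered = sorted(values)
    n = len(ordered)
    # Each bucket ends at the value that closes its share of the sorted data.
    return [ordered[min(n - 1, (i + 1) * n // num_buckets - 1)]
            for i in range(num_buckets)]


def bucket_of(value, bounds):
    """Index of the first bucket whose upper bound is >= value (clamped)."""
    return min(bisect_left(bounds, value), len(bounds) - 1)


def block_bitmap(block, bounds):
    """One bit per bucket: set if the block stores any value in that bucket."""
    bits = 0
    for value in block:
        bits |= 1 << bucket_of(value, bounds)
    return bits


def blocks_to_read(bitmaps, bounds, value):
    """Blocks whose bitmap does not rule out the queried value."""
    target = 1 << bucket_of(value, bounds)
    return [i for i, bits in enumerate(bitmaps) if bits & target]


# Unsorted column data split across three data blocks.
blocks = [[5, 91, 12], [40, 44, 47], [88, 3, 60]]
bounds = bucket_bounds([v for block in blocks for v in block], num_buckets=3)
bitmaps = [block_bitmap(block, bounds) for block in blocks]
print(blocks_to_read(bitmaps, bounds, 45))  # [1]
```

A set bit only means the block may hold the value, so false positives are possible: a query for 60 would still read block 0, which holds 91 in the same bucket. A cleared bit, by contrast, safely excludes the block.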
Abstract:
A merged sorted list is formulated from multiple input sorted lists in multiple phases using an array pair. Initially, the first array is contiguously populated with the input sorted lists. In the first phase, the first and second input sorted lists are merged into a first intermediary merged list within the second array. Each subsequent phase merges the prior intermediary merged list resulting from the prior phase and a next input sorted list in the first array to generate a next intermediary merged list, or the merged sorted list if there are no further input sorted lists in the first array. The intermediary merged lists alternate between the first array and the second array from one phase to the next phase.
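The phased merge described above can be sketched as follows. This is one possible reading, not the claimed method: it assumes each intermediary list is written starting at index 0 of its destination array, and the function name `merge_with_array_pair` is hypothetical. Under that assumption, writing back into the first array never overwrites unread input, because the write position never runs ahead of the read position of the list being consumed.

```python
def merge_with_array_pair(sorted_lists):
    """Merge sorted lists in phases, alternating output between two arrays."""
    if not sorted_lists:
        return []
    total = sum(len(lst) for lst in sorted_lists)
    first = [None] * total
    second = [None] * total

    # Contiguously populate the first array and remember where each list starts.
    starts, pos = [], 0
    for lst in sorted_lists:
        starts.append(pos)
        first[pos:pos + len(lst)] = lst
        pos += len(lst)
    starts.append(pos)  # sentinel marking the end of the last list

    # The first input list plays the role of the initial "intermediary" list.
    src, dst = first, second
    merged_len = starts[1]
    for k in range(1, len(sorted_lists)):
        i, j, end, out = 0, starts[k], starts[k + 1], 0
        while i < merged_len and j < end:
            if src[i] <= first[j]:
                dst[out] = src[i]
                i += 1
            else:
                dst[out] = first[j]
                j += 1
            out += 1
        while i < merged_len:  # drain the prior intermediary list
            dst[out] = src[i]
            i += 1
            out += 1
        while j < end:  # drain the current input list
            dst[out] = first[j]
            j += 1
            out += 1
        merged_len = out
        src, dst = dst, src  # the next phase writes into the other array
    return src[:merged_len]


print(merge_with_array_pair([[1, 4, 9], [2, 3, 10], [0, 5], [6, 7, 8]]))
```

Only two arrays of the total input size are allocated, regardless of how many input lists there are.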
Abstract:
The disclosure notably relates to a computer-implemented method of storing RDF graph data in a graph database comprising a set of RDF tuples. The method comprises obtaining one or more adjacency matrices wherein each adjacency matrix represents a group of tuples of the graph database comprising a same predicate. The method further comprises storing, for each of the one or more adjacency matrices, a data structure comprising an array. The array comprises one or more indices each pointing to a sub-division of the adjacency matrix, and/or one or more elements each representing a group of tuples of the RDF graph database of a respective sub-division of the adjacency matrix.
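As a rough illustration of per-predicate adjacency matrices split into sub-divisions, the sketch below groups triples by predicate and stores each matrix as fixed-size square tiles. Several choices here are assumptions made for illustration only: the tile size `TILE`, the use of a dict keyed by tile coordinates in place of the array of indices, and the integer bitmask standing in for the element that represents a tile's group of tuples. The function names are hypothetical.

```python
from collections import defaultdict

TILE = 4  # side length of each square sub-division (assumed value)


def build_predicate_matrices(triples):
    """Return (node ids, {predicate: {(tile_row, tile_col): bitmask}})."""
    ids = {}
    matrices = defaultdict(dict)
    for subject, predicate, obj in triples:
        # Subjects and objects share one id space: row and column numbers.
        row = ids.setdefault(subject, len(ids))
        col = ids.setdefault(obj, len(ids))
        tile = (row // TILE, col // TILE)
        bit = (row % TILE) * TILE + (col % TILE)
        matrices[predicate][tile] = matrices[predicate].get(tile, 0) | (1 << bit)
    return ids, dict(matrices)


def has_triple(ids, matrices, subject, predicate, obj):
    """Check one cell of the adjacency matrix for the given predicate."""
    if subject not in ids or obj not in ids or predicate not in matrices:
        return False
    row, col = ids[subject], ids[obj]
    bits = matrices[predicate].get((row // TILE, col // TILE), 0)
    return bool((bits >> ((row % TILE) * TILE + (col % TILE))) & 1)


triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "acme"),
]
ids, matrices = build_predicate_matrices(triples)
print(has_triple(ids, matrices, "alice", "knows", "bob"))
print(has_triple(ids, matrices, "bob", "knows", "alice"))
```

Empty tiles are never stored, which is the main benefit of sub-dividing a sparse adjacency matrix.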
Abstract:
Techniques for processing a query are provided. One or more operations that are required to process a query are performed by a coprocessor that is separate from a general purpose microprocessor that executes query processing software. The query processing software receives a query, determines one or more operations that are required to be executed to fully process the query, and issues one or more commands to one or more coprocessors that are programmed to perform one of the operations, such as a table scan operation and/or a lookup operation. The query processing software obtains results from the coprocessor(s) and performs one or more additional operations thereon to generate a final result of the query.
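The division of labor described above can be mimicked in software. In this sketch a thread pool stands in for the coprocessors, which is purely an assumption for illustration since real coprocessors are separate hardware. The names `table_scan` and `run_query`, and the choice of a sum as the additional operation, are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def table_scan(rows, predicate):
    """Operation offloaded to a (simulated) coprocessor: filter rows."""
    return [row for row in rows if predicate(row)]


def run_query(partitions, predicate, column):
    """Query processing software: offload the scans, then aggregate locally."""
    workers = max(1, len(partitions))
    with ThreadPoolExecutor(max_workers=workers) as coprocessors:
        # Issue one scan command per table partition.
        futures = [coprocessors.submit(table_scan, part, predicate)
                   for part in partitions]
        # Obtain the results from the coprocessors.
        scanned = [future.result() for future in futures]
    # Additional operation performed on the results to produce the final answer.
    return sum(row[column] for part in scanned for row in part)


partitions = [
    [{"region": "eu", "amount": 10}, {"region": "us", "amount": 7}],
    [{"region": "eu", "amount": 5}],
]
print(run_query(partitions, lambda row: row["region"] == "eu", "amount"))  # 15
```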