摘要:
An approach is provided in which a processor receives a scan request to scan data included in a data table. The processor selects a column in the data table corresponding to the scan request and retrieves column data entries from the selected column. In addition, the processor identifies the width of the selected column and selects a scan algorithm based upon the identified column width. In turn, the processor loads the column data entries into column data vectors and computes scan results from the column data vectors using the selected scan algorithm.
摘要:
A system joins predicate evaluated column bitmaps having varying lengths. The system includes a column unifier for querying column values with a predicate and generating an indicator bit for each of the column values that is then joined with the respective column value. The system also includes a bitmap generator for creating a column-major linear bitmap from the column values and indicator bits. The column unifier also determines an offset between adjacent indicator bits. The system also includes a converter for multiplying the column-major linear bitmap with a multiplier to shift the indicator bits into consecutive positions in the linear bitmap.
摘要:
Embodiments of the present invention provide a database processing system for efficient partitioning of a database table with column-major layout for executing one or more join operations. One embodiment comprises a method for partitioning a database table with column-major layout, partitioning only the join-columns by limiting the partitions by size and number, executing one or more join operations for joining the partitioned columns, and optionally de-partitioning the join result to the original order by sequentially writing and randomly reading table values using P cursors.
摘要:
Described herein are methods, systems, apparatuses and products for multithreaded data merging for multi-core central and graphical processing units. An aspect provides for executing a plurality of threads on at least one central processing unit comprising a plurality of cores, each thread comprising an input data set (IDS) and being executed on one of the plurality of cores; initializing at least one local data set (LDS) comprising a size and a threshold; inserting IDS data elements into the at least one LDS such that each inserted IDS data element increases the size of the at least one LDS; and merging the at least one LDS into a global data set (GDS) responsive to the size of the at least one LDS being greater than the threshold. Other aspects are disclosed herein.
摘要:
A cell-specific dictionary is applied adaptively to adequate cells, where the cell-specific dictionary subsequently optimizes the handling of frequency-partitioned multi-dimensional data. This includes improved data partitioning with super cells or adjusting resulting cells by sub-dividing very large cells and merging multiple small cells, both of which avoid the highly skewed data distribution in cells and improve the query processing. In addition, more efficient encoding is taught within a cell in case the distinct values that actually appear in that cell are much smaller than the size of the column dictionary.
摘要:
A predicate over a single column of a table is converted into at least one IN-list, wherein the IN-list is generated for a set of tuples of the column, and the generation is done over a data structure representing a set of distinct values of the column where the predicate applies and having a smaller cardinality than the table. The generated IN-list is evaluated over the set of tuples and the results of the evaluation are outputted as an evaluation of the predicate.
摘要:
According to one embodiment of the present invention, a method for dictionary encoding data without using three-valued logic is provided. According to one embodiment of the invention, a method includes encoding data in a database table using a dictionary, wherein the data includes values representing NULLs. A query having a predicate is received and the predicate is evaluated on the encoded data, whereby the predicate is evaluated on both the encoded data and on the encoded NULLs.
摘要:
A method, storage server, and computer readable medium for off-loading star-join operations from a host information processing system to a storage server. At least a first and second set of keys from a first and second dimension table, respectively are received from a host system. Each of the first and second set of keys is associated with at least one fact table. A set of locations associated with a set of foreign key indexes are received from the host system. A set of fact table indexes are traversed. At least a first set of Row Identifiers (“RIDs”) associated with the first set of keys and at least a second set of RIDs associated with the second set of keys are identified. An operation is performed on the first and second sets of RIDs to identify an intersecting set of RIDs. The intersecting set of RIDs are then stored.
摘要:
A method is disclosed that places data-intensive subprocesses in close physical and logical proximity to the facility responsible for storing the data, so that high efficiencies at reduced cost are achieved. In one specific example, new computer programs, termed adjuncts, are added and placed in a logical partition on a storage facility so that they can be invoked using appropriate commands issued on the I/O channel. Further, programs or changes are added to existing programs on the host machine, wherein such programs or changes discover the function extensions and invoke them to perform data processing.
摘要:
A way for progressively refining a query execution plan during query execution in a federated data system is provided. Re-optimization constraints are placed in the query execution plan during query compilation. When a re-optimization constraint is violated during query execution, a model of the query execution plan is refined using a partially executed query to form a new query execution plan. The new query execution plan is compiled. The compiled new query execution plan is executed.