Abstract:
Inverted indexes for terms and for term separators are separately provided to minimize data redundancy. Search queries are parsed to identify terms and term separators, if any, and the corresponding inverted indexes are searched for responsive documents. Related apparatus, systems, techniques and articles are also described.
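The idea of keeping one inverted index for terms and another for term separators can be sketched as follows. This is a minimal illustration, not the claimed implementation: the tokenizer regex, the function names (`tokenize`, `build_indexes`, `search`), and the choice to treat any run of punctuation as a separator are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumption: a "term" is a run of word characters, a "separator" is a run of punctuation.
TOKEN_RE = re.compile(r"\w+|[^\w\s]+")


def tokenize(text):
    """Split text into lower-cased terms and separators, dropping whitespace."""
    return TOKEN_RE.findall(text.lower())


def is_term(token):
    return re.fullmatch(r"\w+", token) is not None


def build_indexes(docs):
    """Build two inverted indexes: one for terms, one for separators."""
    term_index = defaultdict(set)
    separator_index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            target = term_index if is_term(token) else separator_index
            target[token].add(doc_id)
    return term_index, separator_index


def search(query, term_index, separator_index):
    """Return ids of documents containing every term and separator in the query."""
    result = None
    for token in tokenize(query):
        index = term_index if is_term(token) else separator_index
        postings = set(index.get(token, ()))
        result = postings if result is None else result & postings
    return result if result is not None else set()


docs = {1: "e-mail server", 2: "email server", 3: "client/server"}
terms, separators = build_indexes(docs)
```

With these sample documents, `search("e-mail", terms, separators)` consults the term index for `e` and `mail` and the separator index for `-`, so each separator is stored once rather than repeated inside every term entry.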
Abstract:
Methods and apparatus, including computer program products, for block compression of tables with repeated values. In general, value identifiers representing a compressed column of data may be sorted to render repeated values contiguous, and block dictionaries may be generated. A block dictionary may be generated for each block of value identifiers. Each block dictionary may include a list of block identifiers, where each block identifier is associated with a value identifier and there is a block identifier for each unique value in a block. Blocks may have standard sizes and block dictionaries may be reused for multiple blocks.
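The block scheme described above can be sketched roughly as follows. The sketch assumes a fixed block size and represents each block identifier as a position within its block dictionary; the names `block_compress` and `block_decompress` are hypothetical. It also ignores the mapping back to original row order, which a real system would presumably have to keep.

```python
def block_compress(value_ids, block_size=4):
    """Sort value ids, cut them into fixed-size blocks, and build block dictionaries.

    Returns (dictionaries, blocks): `dictionaries` holds each distinct block
    dictionary once; each block is (dictionary number, list of block ids).
    """
    sorted_ids = sorted(value_ids)   # makes repeated values contiguous
    dictionaries = []                # distinct block dictionaries
    dict_lookup = {}                 # block dictionary -> its number
    blocks = []
    for start in range(0, len(sorted_ids), block_size):
        block = sorted_ids[start:start + block_size]
        block_dict = tuple(sorted(set(block)))   # one entry per unique value
        if block_dict not in dict_lookup:        # reuse dictionaries across blocks
            dict_lookup[block_dict] = len(dictionaries)
            dictionaries.append(block_dict)
        position = {value: i for i, value in enumerate(block_dict)}
        blocks.append((dict_lookup[block_dict], [position[v] for v in block]))
    return dictionaries, blocks


def block_decompress(dictionaries, blocks):
    """Rebuild the sorted list of value ids from blocks and block dictionaries."""
    return [dictionaries[d][b] for d, block_ids in blocks for b in block_ids]


value_ids = [7, 3] * 8
dictionaries, blocks = block_compress(value_ids, block_size=4)
```

In this example the sixteen value identifiers fall into four blocks that share only two block dictionaries, and every block identifier fits in far fewer bits than the original value identifier would.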
Abstract:
Innovations for adaptive compression and decompression for dictionaries of a column-store database can reduce the amount of memory used for columns of the database, allowing a system to keep column data in memory for more columns, while delays for access operations remain acceptable. For example, dictionary compression variants use different compression techniques and implementation options. Some dictionary compression variants provide more aggressive compression (reduced memory consumption) but result in slower run-time performance. Other dictionary compression variants provide less aggressive compression (higher memory consumption) but support faster run-time performance. As another example, a compression manager can automatically select a dictionary compression variant for a given column in a column-store database. For different dictionary compression variants, the compression manager predicts run-time performance and compressed dictionary size, given the values of the column, and selects one of the dictionary compression variants.
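One way to picture the compression manager's selection step is shown below. Everything in it is an assumption for illustration: the two variants (`plain` and `front_coded`), the size estimators, and the relative access-cost figures are invented, and the rule "smallest predicted size within a run-time budget" is only one plausible selection policy.

```python
from os.path import commonprefix


def size_plain(values):
    """Predicted size in bytes when every string is stored in full."""
    return sum(len(v.encode()) for v in values)


def size_front_coded(values):
    """Predicted size when each sorted entry stores a 1-byte shared-prefix
    length plus only the suffix that differs from its predecessor."""
    total, previous = 0, ""
    for value in sorted(values):
        shared = len(commonprefix([previous, value]))
        total += 1 + len(value[shared:].encode())
        previous = value
    return total


# Hypothetical variants: (size estimator, assumed relative access cost).
VARIANTS = {
    "plain": (size_plain, 1.0),
    "front_coded": (size_front_coded, 3.0),
}


def select_variant(values, max_relative_cost):
    """Pick the variant with the smallest predicted size within the cost budget."""
    candidates = [
        (estimate(values), name)
        for name, (estimate, cost) in VARIANTS.items()
        if cost <= max_relative_cost
    ]
    if not candidates:
        raise ValueError("no variant satisfies the run-time budget")
    return min(candidates)[1]


column = ["customer_0001", "customer_0002", "customer_0003"]
```

For this column, a generous budget favors the more aggressive variant because the values share long prefixes, while a tight budget falls back to the faster, larger one.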
Abstract:
Methods and apparatus, including computer program products, are provided for processing calculation plans. In one aspect, there is provided a computer-implemented method. The method may include generating a calculation plan including a plurality of nodes; determining whether at least one of the nodes includes a function node; and compiling the function node into executable code to enable execution of the plurality of nodes, including the function node, at the database. Related apparatus, systems, methods, and articles are also described.
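The three steps of the method (generate a plan, detect function nodes, compile them) might look like the sketch below. It is a toy model under stated assumptions: the `Node` class and its fields are hypothetical, a function node is represented as a Python expression over a variable named `rows`, and Python's built-in `compile` stands in for whatever code generation a database would actually use.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    name: str
    kind: str                        # "operator" or "function" (assumed labels)
    source: str = ""                 # expression over `rows`, for function nodes
    run: Optional[Callable] = None   # executable form of the node


def compile_function_nodes(plan):
    """Find function nodes in the plan and compile their source into callables."""
    for node in plan:
        if node.kind == "function" and node.run is None:
            code = compile(node.source, f"<{node.name}>", "eval")
            # Bind `code` now; evaluate with no builtins and only `rows` visible.
            node.run = lambda rows, code=code: eval(
                code, {"__builtins__": {}, "rows": rows}
            )
    return plan


def execute(plan, rows):
    """Run every node in order, feeding each node's output to the next."""
    for node in plan:
        rows = node.run(rows)
    return rows


plan = [
    Node("keep_positive", "operator", run=lambda rows: [r for r in rows if r > 0]),
    Node("double", "function", source="[r * 2 for r in rows]"),
]
result = execute(compile_function_nodes(plan), [-1, 2, 3])
```

After compilation, the function node runs in the same loop as the ordinary operator nodes, which is the point of compiling it ahead of execution.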
Abstract:
A pattern can be identified in at least part of a query whose definition is received in a query request. The identified pattern can be matched with a set of pre-defined patterns, each of which has associated therewith at least one pre-compiled query execution sub-component of a plurality of pre-compiled query execution sub-components retained in a library. A plan for executing the query can be generated, for example by incorporating the pre-compiled query execution sub-component associated with the matched pattern into the plan based on a pseudo code representation of the plan derived from the definition.
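A rough sketch of matching query patterns against a library of ready-made sub-components follows. The pseudo code syntax (`FILTER col > n`, `SUM col`), the `LIBRARY` contents, and the function names are assumptions for illustration; Python closures stand in for the pre-compiled sub-components.

```python
import re

# Hypothetical library: each pre-defined pattern maps to a factory that
# returns a ready-to-run sub-component for the matched parameters.
LIBRARY = [
    (re.compile(r"FILTER (\w+) > (\d+)"),
     lambda col, n: lambda rows: [r for r in rows if r[col] > int(n)]),
    (re.compile(r"SUM (\w+)"),
     lambda col: lambda rows: sum(r[col] for r in rows)),
]


def build_plan(pseudo_code):
    """Match each pseudo code step against the library and assemble a plan."""
    plan = []
    for step in pseudo_code:
        for pattern, factory in LIBRARY:
            match = pattern.fullmatch(step)
            if match:
                plan.append(factory(*match.groups()))
                break
        else:
            raise ValueError(f"no pre-defined pattern matches {step!r}")
    return plan


def run_plan(plan, rows):
    """Execute the sub-components in order."""
    result = rows
    for component in plan:
        result = component(result)
    return result


rows = [{"price": 5}, {"price": 20}, {"price": 30}]
total = run_plan(build_plan(["FILTER price > 10", "SUM price"]), rows)
```

Here the plan is assembled entirely from library entries, so no step has to be generated from scratch at query time; a step that matches no pattern is reported rather than silently skipped.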
Abstract:
The subject matter described herein relates to implementation of a dictionary in a column-based, in-memory database where values are not stored directly. Instead, for each column, a dictionary is created with all distinct values, and for each row, a reference to the corresponding value in the dictionary is stored. In one aspect, data is stored in a memory structure organized in a column store format defined by a plurality of columns and a plurality of rows. A dictionary for each column in the memory structure is generated. The dictionary has distinct values for each column. A reference to the dictionary is generated for each column in the memory structure. The dictionary and the reference to the dictionary are stored in the memory structure.
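Per-column dictionary encoding as described above can be sketched in a few lines. The function names and the choice of a sorted dictionary with integer positions as references are assumptions for the example, not details taken from the abstract.

```python
def encode_column(values):
    """Return (dictionary, references) for one column.

    The dictionary holds each distinct value once; each row stores only
    the position of its value in the dictionary.
    """
    dictionary = sorted(set(values))
    position = {value: i for i, value in enumerate(dictionary)}
    references = [position[value] for value in values]
    return dictionary, references


def encode_table(columns):
    """Dictionary-encode every column of a column-store table."""
    return {name: encode_column(values) for name, values in columns.items()}


def read_row(store, row):
    """Reassemble one row by following each column's reference."""
    return {
        name: dictionary[references[row]]
        for name, (dictionary, references) in store.items()
    }


table = {
    "city": ["Berlin", "Paris", "Berlin"],
    "year": [2020, 2020, 2021],
}
store = encode_table(table)
```

In this example the `city` column keeps two strings and three small integers instead of three strings, and `read_row(store, 2)` resolves the references back to the original values.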