Abstract:
A system and method for processing a group and aggregate query on a relation are disclosed. A database system determines whether assistance of a heterogeneous system (HS) of compute nodes is beneficial in performing the query. Assuming that the relation has been partitioned and loaded into the HS, the database system determines, in a compile phase, whether the HS has the functional capabilities to assist, and whether the cost and benefit favor performing the operation with the assistance of the HS. If the cost and benefit favor using the assistance of the HS, then the system enters the execution phase. The database system starts, in the execution phase, an optimal number of parallel processes to produce and consume the results from the compute nodes of the HS. After any needed transaction consistency checks, the results of the query are returned by the database system.
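The compile-phase decision described above can be sketched roughly as follows. This is only an illustration: the `OffloadEstimate` fields, the `should_offload` function, and the simple additive cost comparison are assumptions, not the cost model of the disclosed system.

```python
from dataclasses import dataclass


@dataclass
class OffloadEstimate:
    """Hypothetical compile-phase inputs; none of these names come from the abstract."""
    hs_supports_operators: bool  # can the HS perform the grouping and aggregation at all?
    local_cost: float            # estimated cost of running the query in the database system
    hs_cost: float               # estimated cost of running it on the HS compute nodes
    transfer_cost: float         # estimated cost of consuming the results from the HS


def should_offload(est: OffloadEstimate) -> bool:
    """Return True only when the HS is capable and the estimate favors using it."""
    if not est.hs_supports_operators:
        return False
    return est.hs_cost + est.transfer_cost < est.local_cost
```

Only when `should_offload` returns True would the system move on to the execution phase.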
Abstract:
Data recovery for a compute node in a heterogeneous database system is provided. A failure of a particular compute node is detected in a compute cluster comprising a plurality of compute nodes. The compute cluster is configured to store, in memory, data stored by an RDBMS. Particular data that is assigned to the particular compute node is identified among the data stored by the RDBMS. The particular compute node is restored. After restoring the particular compute node, the particular data assigned to the particular compute node is reloaded without taking the particular data offline. During reloading, the particular compute node receives pending modified data comprising data of the particular data that was modified during said reloading.
Abstract:
A method and apparatus for data recovery for an RDBMS instance in a heterogeneous database system are provided. A failure of a first RDBMS instance is detected in a plurality of RDBMS instances of a shared-disk database system. A compute cluster is configured to store, in memory, one or more tables stored by the shared-disk database system. The first RDBMS instance is configured to modify the one or more tables stored by the shared-disk database system and transfer modified data to the compute cluster to update the one or more tables at the compute cluster. After detecting the failure of the first RDBMS instance, redo records generated by the first RDBMS instance are scanned, pending modified data that was not transferred to the compute cluster before the failure is identified, and the pending modified data is transferred to the compute cluster.
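The redo scan can be illustrated with a minimal sketch. It assumes each redo record carries a monotonically increasing sequence number and that the last sequence number transferred to the compute cluster is known. The `RedoRecord` layout and the `find_pending` function are hypothetical and not taken from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedoRecord:
    """Hypothetical redo record layout, used only for illustration."""
    seq: int        # assumed monotonically increasing sequence number
    table: str
    row_id: int
    new_value: str


def find_pending(redo_log, cached_tables, last_transferred_seq):
    """Scan the failed instance's redo records and keep those changes that
    touch tables held by the compute cluster and were not yet transferred."""
    return [
        rec for rec in redo_log
        if rec.table in cached_tables and rec.seq > last_transferred_seq
    ]
```

The records returned would then be sent to the compute cluster by a surviving instance.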
Abstract:
A two-level cache to facilitate resolving resource path expressions for a hierarchy of resources is described, which includes a system-wide shared cache and a session-level cache. The shared cache is organized as a hierarchy of hash tables that mirrors the structure of a repository hierarchy. A particular hash table in a shared cache includes information for the child resources of a particular resource. A database management system that manages a shared cache may control the amount of memory used by the cache by implementing a replacement policy for the cache based on one or more characteristics of the resources in the repository. The session-level cache is a single level cache in which information for target resources of resolved path expressions may be tracked. In the session-level cache, the resource information is associated with the entire path expression of the associated resource.
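A rough sketch of the two cache levels follows. The class names, the dictionary layout, and the use of integer resource ids are assumptions made for illustration. The replacement policy mentioned above is omitted.

```python
class SharedCache:
    """System-wide cache: a hierarchy of hash tables that mirrors the
    repository hierarchy. Each node holds a dict of its cached children."""

    def __init__(self):
        self.root = {"id": None, "children": {}}

    def put(self, path, resource_id):
        node = self.root
        for name in (p for p in path.split("/") if p):
            node = node["children"].setdefault(name, {"id": None, "children": {}})
        node["id"] = resource_id

    def resolve(self, path):
        """Walk one hash table per path component; None means a cache miss."""
        node = self.root
        for name in (p for p in path.split("/") if p):
            node = node["children"].get(name)
            if node is None:
                return None
        return node["id"]


class SessionCache:
    """Session-level cache: a single-level map keyed by the entire path expression."""

    def __init__(self, shared):
        self.shared = shared
        self.by_path = {}

    def resolve(self, path):
        if path in self.by_path:
            return self.by_path[path]
        resource_id = self.shared.resolve(path)
        if resource_id is not None:
            self.by_path[path] = resource_id  # remember the resolved target
        return resource_id
```

A repeated lookup of the same path in one session is then answered from `by_path` without walking the shared hierarchy.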
Abstract:
Techniques related to a sparse dictionary tree are disclosed. In some embodiments, computing device(s) execute instructions, which are stored on non-transitory storage media, for performing a method. The method comprises storing an encoding dictionary as a token-ordered tree comprising a first node and a second node, which are adjacent nodes. The token-ordered tree maps ordered tokens to ordered codes. The ordered tokens include a first token and a second token. The ordered codes include a first code and a second code, which are non-consecutive codes. The first node maps the first token to the first code. The second node maps the second token to the second code. The encoding dictionary is updated based on inserting a third node between the first node and the second node. The third node maps a third token to a third code that is greater than the first code and less than the second code.
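The idea of leaving gaps between codes so that a new token can be inserted between two neighbors can be sketched as below. For brevity, a pair of sorted lists stands in for the token-ordered tree. The `gap` parameter and the midpoint rule for picking the new code are assumptions, not details from the abstract.

```python
import bisect


class SparseDictionary:
    """Order-preserving encoding dictionary with non-consecutive codes.
    Sorted lists are used here in place of a tree; this is a simplification."""

    def __init__(self, gap=10):
        self.gap = gap      # assumed spacing left between appended codes
        self.tokens = []    # tokens in sorted order
        self.codes = []     # codes[i] is the code of tokens[i]

    def encode(self, token):
        i = bisect.bisect_left(self.tokens, token)
        if i < len(self.tokens) and self.tokens[i] == token:
            return self.codes[i]
        lo = self.codes[i - 1] if i > 0 else None
        hi = self.codes[i] if i < len(self.codes) else None
        if lo is None and hi is None:
            code = 0
        elif hi is None:
            code = lo + self.gap
        elif lo is None:
            code = hi - self.gap  # codes may go negative in this sketch
        else:
            if hi - lo < 2:
                raise ValueError("no free code between neighbors; re-encoding needed")
            code = (lo + hi) // 2  # strictly between the neighboring codes
        self.tokens.insert(i, token)
        self.codes.insert(i, code)
        return code
```

Because codes stay in token order, comparisons on codes give the same result as comparisons on tokens, and existing codes do not change when a new token is inserted.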
Abstract:
Techniques are described for storing and maintaining, in a materialized view, bitmap data that represents a bitmap of each possible distinct value of an expression, and for rewriting a query for a count of distinct values of the expression to use the materialized view. The materialized view contains bitmap data that represents a bitmap of each possible distinct value of a first expression, and aggregate values of additional expressions, and is stored in memory or on disk by a database system. The database system receives a query that requests a number of distinct values of the first expression and an aggregate value for an additional expression. In response, the database system rewrites the query to compute the number of distinct values by counting the bits in the bitmap data of the materialized view that are set to a first value, and to obtain the aggregate value for the additional expression from the materialized view.
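A minimal sketch of answering a distinct count from stored bitmap data is shown below. It assumes each distinct value has already been mapped to a small non-negative integer that serves as its bit position, and it uses a Python integer as the bitmap. The `build_view` and `count_distinct_and_sum` names are hypothetical.

```python
def build_view(rows):
    """Build a toy 'materialized view' keyed by group.
    rows: iterable of (group, value_id, amount), where value_id is the
    assumed bit position of a distinct value of the first expression."""
    view = {}
    for group, value_id, amount in rows:
        entry = view.setdefault(group, {"bitmap": 0, "total": 0})
        entry["bitmap"] |= 1 << value_id   # set the bit for this distinct value
        entry["total"] += amount           # aggregate of the additional expression
    return view


def count_distinct_and_sum(view, group):
    """Rewritten query: count the set bits instead of scanning base rows."""
    entry = view[group]
    return bin(entry["bitmap"]).count("1"), entry["total"]
```

Repeated values set the same bit, so the count of set bits equals the number of distinct values.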
Abstract:
A method for distributing tables to a cluster of nodes managed by a database management system (DBMS) is disclosed. Multiple data placement schemes are evaluated based on a query workload set to select a data placement scheme for the cluster of nodes. Tables used in join operations in the workload set are selected for evaluation of data placement schemes. Query execution costs for the workload set are generated by estimating a query execution cost for each data placement scheme for the tables. The data placement scheme that has the lowest estimated execution cost for the workload set is selected as the data placement scheme for the cluster of nodes managed by the DBMS.
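The selection step can be sketched as an exhaustive search over candidate schemes. Everything in the cost model here is an assumption made for illustration: the two placement options, the shuffle cost of a join, and the replication overhead factor are not taken from the abstract.

```python
from itertools import product


def estimate_join_cost(query, scheme, table_rows):
    """Toy model: a join needs no data movement if either side is replicated;
    otherwise the smaller table is assumed to be shuffled across nodes."""
    left, right = query
    if scheme[left] == "replicate" or scheme[right] == "replicate":
        return 0.0
    return float(min(table_rows[left], table_rows[right]))


def choose_scheme(workload, table_rows, nodes):
    """Evaluate every placement scheme for the tables used in joins and
    return (scheme, cost) for the one with the lowest estimated cost."""
    tables = sorted({t for query in workload for t in query})
    best = None
    for choice in product(("partition", "replicate"), repeat=len(tables)):
        scheme = dict(zip(tables, choice))
        cost = sum(estimate_join_cost(q, scheme, table_rows) for q in workload)
        # assumed overhead for keeping extra copies of replicated tables
        cost += sum(
            table_rows[t] * (nodes - 1) * 0.01
            for t in tables if scheme[t] == "replicate"
        )
        if best is None or cost < best[1]:
            best = (scheme, cost)
    return best
```

Exhaustive enumeration grows exponentially with the number of tables, so a real system would presumably prune the search space; the sketch only shows the compare-and-select step.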
Abstract:
Herein is described a data placement scheme for a distributed query processing system that achieves load balance among the nodes of the system. To identify a node on which to place particular data, a supervisor node performs a placement algorithm over the particular data's identifier, where the placement algorithm utilizes two or more hash functions. The supervisor node runs the placement algorithm until a destination node is identified that is available to store the data, or until the supervisor node has run the placement algorithm an established number of times. If no available node is identified using the placement algorithm, then an available destination node is identified for the particular data, and information identifying the data and the selected destination node is included in an exception map. Most data may be located by any node in the system based on the node performing the placement algorithm for the required data.
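The placement loop and the exception map can be sketched as follows. The two salted SHA-256 hashes, the `is_available` callback, and the first-available fallback are assumptions chosen for illustration, not the hash functions or fallback policy of the described system.

```python
import hashlib


def _hash(data_id, salt, attempt):
    """Derive one of several hash functions by salting SHA-256 (an assumption)."""
    digest = hashlib.sha256(f"{salt}:{attempt}:{data_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(data_id, nodes, is_available, exception_map, max_attempts=3):
    """Run the placement algorithm up to max_attempts times, trying two hash
    functions per run. If no candidate is available, fall back to any
    available node and record the choice in the exception map."""
    for attempt in range(max_attempts):
        for salt in ("h1", "h2"):
            node = nodes[_hash(data_id, salt, attempt) % len(nodes)]
            if is_available(node):
                return node
    for node in nodes:
        if is_available(node):
            exception_map[data_id] = node  # not derivable by hashing alone
            return node
    raise RuntimeError("no available node")
```

Data placed through the hash functions needs no entry in the exception map, which is why most data can be located by recomputing the placement.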
Abstract:
Techniques for performing database operations using vectorized instructions are provided. In one technique, an aggregation operation involves executing vectorized instructions to update a data value that corresponds to a particular key. The aggregation operation may be one of count, sum, minimum, maximum, or average.