Query and change propagation scheduling for heterogeneous database systems

    Publication number: US11475006B2

    Publication date: 2022-10-18

    Application number: US15368345

    Application date: 2016-12-02

    Abstract: Techniques are presented herein for efficient query processing and data change propagation at a secondary database system. The techniques involve determining execution costs for executing a query at a primary DBMS and for executing the query at an offload DBMS. The cost for executing the query at the offload DBMS includes the cost of propagating changes to database objects required by the query to the offload DBMS. Based on the execution cost, the query is sent to either the primary DBMS or the offload DBMS.
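
    The routing decision in this abstract can be illustrated with a minimal sketch. The class name, field names, and cost units below are hypothetical and are not taken from the patent; the only point carried over from the abstract is that the offload cost includes the cost of propagating pending changes.

```python
from dataclasses import dataclass


@dataclass
class CostEstimate:
    """Hypothetical cost inputs, in arbitrary cost units (assumption)."""
    primary_exec: float   # cost of running the query on the primary DBMS
    offload_exec: float   # cost of running the query on the offload DBMS
    propagation: float    # cost of propagating required changes to the offload DBMS


def choose_target(est: CostEstimate) -> str:
    """Route to the offload DBMS only if execution plus propagation is cheaper."""
    offload_total = est.offload_exec + est.propagation
    return "offload" if offload_total < est.primary_exec else "primary"
```

    With a large backlog of unpropagated changes, the propagation term dominates and the same query is routed back to the primary DBMS.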

    Massively parallel and in-memory execution of grouping and aggregation in a heterogeneous system

    Publication number: US10204140B2

    Publication date: 2019-02-12

    Application number: US13831122

    Application date: 2013-03-14

    Abstract: A system and method for processing a group and aggregate query on a relation are disclosed. A database system determines whether assistance of a heterogeneous system (HS) of compute nodes is beneficial in performing the query. Assuming that the relation has been partitioned and loaded into the HS, the database system determines, in a compile phase, whether the HS has the functional capabilities to assist, and whether the cost and benefit favor performing the operation with the assistance of the HS. If the cost and benefit favor using the assistance of the HS, then the system enters the execution phase. The database system starts, in the execution phase, an optimal number of parallel processes to produce and consume the results from the compute nodes of the HS. After any needed transaction consistency checks, the results of the query are returned by the database system.
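
    One common way to run a group-and-aggregate query over a partitioned relation is two-phase aggregation: each compute node aggregates its own partition, and the database system merges the partial results. The sketch below assumes that scheme and a SUM aggregate; the abstract does not state the actual algorithm, and the function names are hypothetical.

```python
from collections import defaultdict


def partial_aggregate(partition):
    """Per-node step (assumed): SUM(value) GROUP BY key over one partition."""
    sums = defaultdict(int)
    for key, value in partition:
        sums[key] += value
    return dict(sums)


def merge_partials(partials):
    """Database-side step (assumed): combine the per-node partial sums."""
    total = defaultdict(int)
    for partial in partials:
        for key, value in partial.items():
            total[key] += value
    return dict(total)
```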

    CONSISTENT QUERY EXECUTION FOR BIG DATA ANALYTICS IN A HYBRID DATABASE

    Publication number: US20180349458A1

    Publication date: 2018-12-06

    Application number: US15610171

    Application date: 2017-05-31

    CPC classification number: G06F16/273 G06F16/2365 G06F16/2379 G06F16/2455

    Abstract: Techniques are described for efficient query processing and data change propagation to a secondary database system. The secondary database system may execute queries received at a primary database system. Database changes made at the primary system are copied to the secondary system. The primary system receives a query to be executed on either the primary system or the secondary system. The primary system determines whether to send the query to the secondary system based upon whether data objects stored within the secondary system have pending changes that need to be applied to the data objects. The pending changes are stored within in-memory journals within the primary system. The primary system scans for the pending changes to the data objects and sends the pending changes to the secondary system. The secondary system then receives and applies the pending changes to the data objects within the secondary system. Upon applying the pending changes, the secondary system executes the query.
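
    The flow described here (journal the change, scan for pending changes on query, ship them, apply them, then execute) can be sketched as follows. This is a simplified illustration: the `Primary` and `Secondary` classes, the key/value change format, and the choice to always offload after flushing are assumptions, not details from the application.

```python
from collections import defaultdict


class Secondary:
    """Hypothetical secondary system holding copies of data objects."""

    def __init__(self):
        self.objects = defaultdict(dict)  # object name -> {key: value}

    def apply(self, obj, changes):
        # Apply pending changes in the order they were journaled.
        for key, value in changes:
            self.objects[obj][key] = value

    def execute(self, obj):
        # Stand-in for query execution: return the object's rows sorted by key.
        return sorted(self.objects[obj].items())


class Primary:
    """Hypothetical primary system with an in-memory journal per data object."""

    def __init__(self, secondary):
        self.secondary = secondary
        self.journal = defaultdict(list)  # object name -> pending (key, value) changes

    def write(self, obj, key, value):
        self.journal[obj].append((key, value))

    def query(self, obj):
        # Scan the journal for pending changes to the object the query needs.
        pending = self.journal.pop(obj, [])
        if pending:
            self.secondary.apply(obj, pending)
        # Only after the changes are applied does the secondary run the query.
        return self.secondary.execute(obj)
```

    Because the journal is flushed before execution, the secondary never answers from a copy that is missing changes already made at the primary.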

    Version control based on a dual-range validity model

    Publication number: US09811560B2

    Publication date: 2017-11-07

    Application number: US14824920

    Application date: 2015-08-12

    CPC classification number: G06F17/30448 G06F17/30345 G06F17/30353

    Abstract: Techniques related to version control based on a dual-range validity model are disclosed. In an embodiment, an online analytical processing (OLAP) server stores a plurality of version records describing versions of a data item. A version record may describe any open transactions for a version of the data item. The version record may specify a commit timestamp for the data item at a database and a valid timestamp at least as great as the commit timestamp. The commit timestamp and the valid timestamp may specify a validity range. The version record may also specify an expiration timestamp, which along with the valid timestamp may specify an unresolved range. The OLAP server may also identify a valid version of the data item for a query timestamp that corresponds to a query for particular data in the data item and that falls within either the validity range or the unresolved range.
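
    A version lookup under the two ranges named in this abstract might look like the sketch below. The record layout, integer timestamps, and the choice of inclusive or exclusive bounds are assumptions made for illustration; the abstract only states that the valid timestamp is at least as great as the commit timestamp and that a query timestamp may fall in either range.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class VersionRecord:
    """Hypothetical version record; field names are illustrative."""
    version: int
    commit_ts: int       # commit timestamp at the database
    valid_ts: int        # assumed >= commit_ts
    expiration_ts: int   # assumed >= valid_ts

    def covers(self, query_ts: int) -> bool:
        # Validity range: [commit_ts, valid_ts] (bounds assumed inclusive).
        in_validity = self.commit_ts <= query_ts <= self.valid_ts
        # Unresolved range: (valid_ts, expiration_ts] (bounds assumed).
        in_unresolved = self.valid_ts < query_ts <= self.expiration_ts
        return in_validity or in_unresolved


def find_version(records: Sequence[VersionRecord], query_ts: int) -> Optional[VersionRecord]:
    """Return the first record whose validity or unresolved range covers query_ts."""
    for record in records:
        if record.covers(query_ts):
            return record
    return None
```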

    Efficient pushdown of joins in a heterogeneous database system involving a large-scale low-power cluster

    Publication number: US08849871B2

    Publication date: 2014-09-30

    Application number: US13645030

    Application date: 2012-10-04

    CPC classification number: G06F17/30289 G06F17/30498 G06F17/30598

    Abstract: A system and method for allocating join processing between an RDBMS and an assisting cluster. In one embodiment, the method estimates a cost of performing the join completely in the RDBMS and the cost of performing the join with the assistance of a cluster coupled to the RDBMS. The cost of performing the join with the assistance of the cluster includes estimating a cost of a broadcast join or a partition join depending on the sizes of the tables. Additional costs are incurred when there is a blocking operation, which prevents the cluster from being able to process portions of the join. The RDBMS also maintains transactional consistency when the cluster performs some or all of the join processing.
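
    The choice between a broadcast join and a partition join, and between the RDBMS and the cluster, can be illustrated with a toy cost model. The formulas below are textbook-style assumptions (broadcast ships the smaller table to every node, partition redistributes both tables once), not the patent's cost functions, and `blocking_penalty` is a hypothetical stand-in for the extra cost of a blocking operation.

```python
def cluster_join_cost(rows_a, rows_b, nodes):
    """Return (strategy, cost) for the cheaper cluster-side join under the assumed model."""
    broadcast = min(rows_a, rows_b) * nodes   # copy the smaller table to every node
    partition = rows_a + rows_b               # redistribute both tables once
    if broadcast <= partition:
        return ("broadcast", broadcast)
    return ("partition", partition)


def choose_join_site(rdbms_cost, rows_a, rows_b, nodes, blocking_penalty=0):
    """Pick the cluster only if its join cost plus any blocking penalty beats the RDBMS."""
    strategy, cost = cluster_join_cost(rows_a, rows_b, nodes)
    if cost + blocking_penalty < rdbms_cost:
        return ("cluster", strategy)
    return ("rdbms", None)
```

    Under this model a small dimension table favors the broadcast join, while two tables of similar size favor the partition join.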

    Consistent query execution for big data analytics in a hybrid database

    Publication number: US10691722B2

    Publication date: 2020-06-23

    Application number: US15610171

    Application date: 2017-05-31

    Abstract: Techniques are described for efficient query processing and data change propagation to a secondary database system. The secondary database system may execute queries received at a primary database system. Database changes made at the primary system are copied to the secondary system. The primary system receives a query to be executed on either the primary system or the secondary system. The primary system determines whether to send the query to the secondary system based upon whether data objects stored within the secondary system have pending changes that need to be applied to the data objects. The pending changes are stored within in-memory journals within the primary system. The primary system scans for the pending changes to the data objects and sends the pending changes to the secondary system. The secondary system then receives and applies the pending changes to the data objects within the secondary system. Upon applying the pending changes, the secondary system executes the query.

    Multi-system query execution plan

    Publication number: US10585887B2

    Publication date: 2020-03-10

    Application number: US14673560

    Application date: 2015-03-30

    Abstract: Techniques are described to evaluate an operation from an execution plan of a query to offload the operation to another database management system for less costly execution. In an embodiment, the execution plan is determined based on characteristics of the database management system that received the query for execution. One or more operations in the execution plan are then evaluated for offloading to another heterogeneous database management system. In a related embodiment, the offloading cost for each operation may also include communication cost between the database management systems. The operations that are estimated to be less costly to execute on the other database management system are then identified for offloading to the other database management system. In an alternative embodiment, the database management system generates permutations of execution plans for the same query, and similarly evaluates each permutation of the execution plans for offloading its one or more operations. Based on the total cost of each permutation, which may include offloading cost for one or more operations to another database management system, the least costly plan is selected for the query execution.
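
    The per-operation offloading test and the plan-permutation comparison can be sketched as below. The `Operation` fields and the additive cost model are assumptions for illustration; the abstract says only that offloading cost may include communication cost and that the least costly plan is selected.

```python
from dataclasses import dataclass


@dataclass
class Operation:
    """Hypothetical plan operation with estimated costs in arbitrary units."""
    name: str
    local_cost: float      # cost on the DBMS that received the query
    remote_cost: float     # cost on the other DBMS
    transfer_cost: float   # communication cost between the two systems


def plan_offload(ops):
    """Return (names of offloaded operations, total plan cost)."""
    offloaded, total = [], 0.0
    for op in ops:
        remote_total = op.remote_cost + op.transfer_cost
        if remote_total < op.local_cost:
            offloaded.append(op.name)
            total += remote_total
        else:
            total += op.local_cost
    return offloaded, total


def cheapest_plan(candidates):
    """Given {plan name: [Operation, ...]}, return the name of the least costly plan."""
    return min(candidates, key=lambda name: plan_offload(candidates[name])[1])
```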

    DISTRIBUTED RELATIONAL DICTIONARIES

    Publication number: US20190205446A1

    Publication date: 2019-07-04

    Application number: US15861212

    Application date: 2018-01-03

    Abstract: Techniques related to distributed relational dictionaries are disclosed. In some embodiments, one or more non-transitory storage media store a sequence of instructions which, when executed by one or more computing devices, cause performance of a method. The method involves generating, by a query optimizer at a distributed database system (DDS), a query execution plan (QEP) for generating a code dictionary and a column of encoded database data. The QEP specifies a sequence of operations for generating the code dictionary. The code dictionary is a database table. The method further involves receiving, at the DDS, a column of unencoded database data from a data source that is external to the DDS. The DDS generates the code dictionary according to the QEP. Furthermore, based on joining the column of unencoded database data with the code dictionary, the DDS generates the column of encoded database data according to the QEP.
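
    The idea of a code dictionary stored as a table, with encoding done by joining the unencoded column against it, can be shown in miniature. The sketch assumes dense integer codes assigned in first-seen order and an in-memory hash join; neither detail comes from the application, and both function names are hypothetical.

```python
def build_code_dictionary(column):
    """Build the dictionary as a table of (value, code) rows (assumed layout)."""
    seen = {}
    for value in column:
        if value not in seen:
            seen[value] = len(seen)   # next unused dense code
    return list(seen.items())


def encode_column(column, dictionary_table):
    """Encode by joining the column with the dictionary table on the value."""
    lookup = dict(dictionary_table)            # build side of a simple hash join
    return [lookup[value] for value in column]  # probe side yields the codes
```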
