Split processing paths for a database calculation engine

    Publication No.: US10146834B2

    Publication date: 2018-12-04

    Application No.: US14518593

    Filing date: 2014-10-20

    IPC classification: G06F17/30

    Abstract: A dynamic split node defined within a calculation model can receive data being operated on by a calculation plan generated based on the calculation model. A partition specification can be applied to one or more reference columns in a table containing at least some of the received data. The applying can cause the table to be split such that a plurality of records in the table are partitioned according to the partition specification. A separate processing path can be set for each partition, and execution of the calculation plan can continue using the separate processing paths, each of which can be assigned to a processing node of a plurality of available processing nodes.
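
    The split step described in this abstract can be illustrated with a minimal Python sketch. The names `split_table`, `partition_spec`, and `assign_paths` are hypothetical and do not come from the patent; the partition specification is assumed to be a plain function of one reference column, and round-robin assignment of paths to nodes is likewise an assumption.

```python
from collections import defaultdict

def split_table(records, reference_column, partition_spec):
    """Partition records by applying partition_spec to the reference column."""
    partitions = defaultdict(list)
    for record in records:
        partitions[partition_spec(record[reference_column])].append(record)
    return dict(partitions)

def assign_paths(partitions, nodes):
    """Give each partition its own processing path on a node (round-robin, an assumption)."""
    return {key: nodes[i % len(nodes)] for i, key in enumerate(sorted(partitions))}

records = [
    {"region": "EU", "amount": 10},
    {"region": "US", "amount": 7},
    {"region": "EU", "amount": 5},
]
# Here the partition specification is simply the column value itself.
partitions = split_table(records, "region", lambda value: value)
paths = assign_paths(partitions, ["node-a", "node-b"])
```

    In this sketch each distinct value of `region` becomes one partition, and the rest of the plan would run once per entry in `paths`.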

    DYNAMIC RANGE PARTITIONING
    3.
    Patent application
    Legal status: in force

    Publication No.: US20160055192A1

    Publication date: 2016-02-25

    Application No.: US14463060

    Filing date: 2014-08-19

    IPC classification: G06F17/30

    CPC classification: G06F17/30339

    Abstract: A system includes generation of a definition of a table including a partitioning column of the table and a threshold size, allocation of a first memory partition for the table, determination that a size of the records of the table in the first memory partition is greater than the threshold size, and, in response to the determination that the size of the records of the table in the first memory partition is greater than the threshold size, determination of a maximum value of the partitioning column in the records of the table in the first memory partition, determination of a minimum value of the partitioning column in the records of the table in the first memory partition, generation of metadata indicating that records of the table in which the value of the partitioning column is in a range between and including the minimum value and the maximum value are stored in the first memory partition, and allocation of a second memory partition for the table.
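
    The sequence in this abstract (fill a partition, detect that it exceeds the threshold, record its minimum and maximum, allocate the next partition) can be sketched as follows. The class and attribute names are hypothetical, and "size" is assumed to mean a record count rather than bytes.

```python
class RangePartitionedTable:
    """Toy table that closes a partition once it grows past a threshold."""

    def __init__(self, partitioning_column, threshold_size):
        self.partitioning_column = partitioning_column
        self.threshold_size = threshold_size   # assumed to be a record count
        self.partitions = [[]]                 # the first memory partition
        self.metadata = []                     # (min, max, partition index) per closed partition

    def insert(self, record):
        current = self.partitions[-1]
        current.append(record)
        if len(current) > self.threshold_size:
            values = [r[self.partitioning_column] for r in current]
            # Record the inclusive range held by the partition that just filled up.
            self.metadata.append((min(values), max(values), len(self.partitions) - 1))
            # Allocate the next memory partition for subsequent records.
            self.partitions.append([])

table = RangePartitionedTable("id", threshold_size=2)
for value in (5, 1, 9, 12):
    table.insert({"id": value})
```

    After the third insert the first partition exceeds the threshold, so its range is recorded and the fourth record lands in the newly allocated partition.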

    HASH-JOIN IN PARALLEL COMPUTATION ENVIRONMENTS
    5.
    Patent application
    Legal status: in force

    Publication No.: US20120011108A1

    Publication date: 2012-01-12

    Application No.: US12978044

    Filing date: 2010-12-23

    IPC classification: G06F17/30

    Abstract: According to some embodiments, a system and method for a parallel join of relational data tables may be provided by calculating, by a plurality of concurrently executing execution threads, hash values for join columns of a first input table and a second input table; storing the calculated hash values in a set of disjoint thread-local hash maps for each of the first input table and the second input table; merging the set of thread-local hash maps of the first input table, by a second plurality of execution threads operating concurrently, to produce a set of merged hash maps; comparing each entry of the merged hash maps to each entry of the set of thread-local hash maps for the second input table to determine whether there is a match, according to a join type; and generating an output table including matches as determined by the comparing.
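
    The build, merge, and probe phases named in this abstract can be sketched in Python as below. This is an illustration under assumptions, not the patented method: all function names are hypothetical, Python dictionaries stand in for explicit hash values, the merge is split across threads by hash bucket, and only an inner join is shown.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def build_local_map(rows, key):
    """One thread's local hash map: join-key value -> rows from its chunk."""
    local = defaultdict(list)
    for row in rows:
        local[row[key]].append(row)
    return local

def chunks(rows, n):
    """Split rows into n interleaved chunks, one per thread."""
    return [rows[i::n] for i in range(n)]

def merge_bucket(local_maps, bucket, n_buckets):
    """Merge the entries of all thread-local maps whose key falls into one bucket."""
    merged = defaultdict(list)
    for local in local_maps:
        for k, rows in local.items():
            if hash(k) % n_buckets == bucket:
                merged[k].extend(rows)
    return merged

def parallel_hash_join(left, right, key, workers=2):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        left_maps = list(pool.map(lambda c: build_local_map(c, key), chunks(left, workers)))
        right_maps = list(pool.map(lambda c: build_local_map(c, key), chunks(right, workers)))
        merged_maps = list(pool.map(lambda b: merge_bucket(left_maps, b, workers), range(workers)))
    output = []
    # Probe: compare merged left entries with the thread-local maps of the right table.
    for merged in merged_maps:
        for k, left_rows in merged.items():
            for local in right_maps:
                for right_row in local.get(k, []):
                    output.extend({**left_row, **right_row} for left_row in left_rows)
    return output

left = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
right = [{"id": 2, "qty": 20}, {"id": 3, "qty": 30}, {"id": 4, "qty": 40}]
result = sorted(parallel_hash_join(left, right, "id"), key=lambda r: r["id"])
```

    The output order depends on how keys fall into buckets, so the usage example sorts the joined rows before inspecting them.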

    Replication mechanisms for database environments
    7.
    Granted patent
    Legal status: in force

    Publication No.: US09411866B2

    Publication date: 2016-08-09

    Application No.: US13719737

    Filing date: 2012-12-19

    IPC classification: G06F17/30

    CPC classification: G06F17/30575

    Abstract: Data replication in a database includes identifying a source database system. The source database includes a main index file and a delta log file. To create a replica, one or more symbolic links to the source database system are generated. The symbolic links identify a path to a physical location of the source database. A replica of the source database is generated based on the symbolic links. The replica includes a copy of the main index file and delta log file. Information associated with the replica and the symbolic links is stored in a recovery log. Replicas are provided transparently to most database engine components by re-using partitioning infrastructure. Components “see” replicas as tables with a single partition; that partition is a local replica.
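
    The link-then-copy flow in this abstract can be sketched with the Python standard library. Everything here is an assumption made for illustration: the function `create_replica`, the file names `main.idx` and `delta.log`, and the use of an in-memory list as the recovery log. The sketch also assumes a platform where unprivileged symbolic links are available.

```python
import os
import shutil
import tempfile

def create_replica(source_dir, work_dir, recovery_log):
    """Create a symbolic link to the source, copy through it, and log both."""
    link = os.path.join(work_dir, "source_link")
    os.symlink(source_dir, link)             # the link identifies the source's physical location
    replica_dir = os.path.join(work_dir, "replica")
    shutil.copytree(link, replica_dir)       # copies the main index and delta log files
    recovery_log.append(
        {"replica": replica_dir, "link": link, "target": os.path.realpath(link)}
    )
    return replica_dir

with tempfile.TemporaryDirectory() as root:
    source = os.path.join(root, "source")
    os.makedirs(source)
    for name in ("main.idx", "delta.log"):   # hypothetical file names
        with open(os.path.join(source, name), "w") as handle:
            handle.write(name)
    log = []
    replica = create_replica(source, root, log)
    replica_files = sorted(os.listdir(replica))
```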

    Table placement in distributed databases
    8.
    Granted patent
    Legal status: in force

    Publication No.: US09372907B2

    Publication date: 2016-06-21

    Application No.: US14090799

    Filing date: 2013-11-26

    IPC classification: G06F17/30

    Abstract: A node type of a plurality of distributed nodes to which a table to be added to a distributed database should be assigned can be identified by applying a set of placement rules defined for the table. The set of placement rules can also be applied to determine whether the table should be partitioned into more than one partition. A table group name associated with the table can be obtained and used in conjunction with the node type and determination of whether to partition the table to store the table in the distributed database on at least one node of the plurality of nodes as one or more partitions.
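
    A minimal sketch of rule-driven placement, under stated assumptions: the rule format, the node and table names, and the function `place_table` are all hypothetical. The sketch also assumes that tables sharing a group name should start on the same node, which it achieves by hashing the group name; the abstract does not specify how the group name is used.

```python
import zlib

def place_table(table, rules, groups, nodes):
    """Return the node chosen for each partition of `table`."""
    rule = rules[table]
    # Placement rules pick the node type and the number of partitions.
    candidates = sorted(name for name, node_type in nodes.items()
                        if node_type == rule["node_type"])
    partition_count = rule.get("partitions", 1)
    # Assumption: tables of one group start at the same node, chosen by a stable hash.
    start = zlib.crc32(groups[table].encode("utf-8")) % len(candidates)
    return [candidates[(start + i) % len(candidates)] for i in range(partition_count)]

nodes = {"n1": "worker", "n2": "worker", "n3": "standby"}
rules = {
    "ORDERS": {"node_type": "worker", "partitions": 2},
    "ORDER_ITEMS": {"node_type": "worker", "partitions": 2},
    "AUDIT": {"node_type": "standby"},
}
groups = {"ORDERS": "SALES", "ORDER_ITEMS": "SALES", "AUDIT": "ADMIN"}
placement = {table: place_table(table, rules, groups, nodes) for table in rules}
```

    With these inputs the two `SALES` tables receive identical partition-to-node layouts, and `AUDIT` is stored unpartitioned on the only standby node.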

    Database Table Re-Partitioning Using Two Active Partition Specifications
    9.
    Patent application
    Legal status: under examination (published)

    Publication No.: US20150242451A1

    Publication date: 2015-08-27

    Application No.: US14188541

    Filing date: 2014-02-24

    IPC classification: G06F17/30

    CPC classification: G06F17/30584 G06F17/30578

    Abstract: Partitioning of source partitions of a table of a database to target partitions is initiated. Thereafter, a transition partition specification is specified that identifies the source partitions and the target partitions. Data is then moved (e.g., asynchronously moved, etc.) from the source partitions to the target partitions. Concurrently with the moving of the data, operations are handled using the transition partition specification. Subsequently, the source partitions are dropped when all of the data has been moved to the target partitions and there are no open transactions accessing the source partitions. Related apparatus, systems, techniques and articles are also described.
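
    The life cycle in this abstract (move data in batches, serve operations against both partition sets, drop the sources at the end) can be sketched as follows. The class `TransitionTable`, its methods, and the modulo-based target specification are hypothetical; records are reduced to integer keys to keep the example short.

```python
class TransitionTable:
    """Table that stays readable and writable while records move between partition sets."""

    def __init__(self, source_partitions, target_count):
        self.source = [list(p) for p in source_partitions]
        self.target = [[] for _ in range(target_count)]
        self.open_transactions = 0

    def _target_for(self, key):
        return self.target[key % len(self.target)]   # hypothetical target specification

    def move_batch(self, limit):
        """Move up to `limit` records from the source partitions to the targets."""
        moved = 0
        for partition in self.source or []:
            while partition and moved < limit:
                key = partition.pop()
                self._target_for(key).append(key)
                moved += 1
        return moved

    def contains(self, key):
        """Transition specification: a record may live in either partition set."""
        in_source = any(key in p for p in self.source or [])
        return in_source or key in self._target_for(key)

    def insert(self, key):
        self._target_for(key).append(key)            # new records go straight to the targets

    def try_drop_sources(self):
        """Drop the sources once they are empty and no transaction is open."""
        if self.source is not None and not any(self.source) and self.open_transactions == 0:
            self.source = None
        return self.source is None

table = TransitionTable([[1, 2], [3, 4]], target_count=3)
table.move_batch(2)                 # partial move: keys 1 and 2 reach the targets
table.insert(5)
found_mid_move = [table.contains(k) for k in (1, 3, 5, 9)]
dropped_early = table.try_drop_sources()
table.move_batch(10)                # move the remaining records
dropped_late = table.try_drop_sources()
```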

    TABLE PLACEMENT IN DISTRIBUTED DATABASES
    10.
    Patent application
    Legal status: in force

    Publication No.: US20150149509A1

    Publication date: 2015-05-28

    Application No.: US14090799

    Filing date: 2013-11-26

    IPC classification: G06F17/30

    Abstract: A node type of a plurality of distributed nodes to which a table to be added to a distributed database should be assigned can be identified by applying a set of placement rules defined for the table. The set of placement rules can also be applied to determine whether the table should be partitioned into more than one partition. A table group name associated with the table can be obtained and used in conjunction with the node type and determination of whether to partition the table to store the table in the distributed database on at least one node of the plurality of nodes as one or more partitions.
