HASH-JOIN IN PARALLEL COMPUTATION ENVIRONMENTS
    2.
    Patent application
    HASH-JOIN IN PARALLEL COMPUTATION ENVIRONMENTS (status: granted)

    Publication number: US20120011108A1

    Publication date: 2012-01-12

    Application number: US12978044

    Filing date: 2010-12-23

    IPC classification: G06F17/30

    Abstract: According to some embodiments, a system and method for a parallel join of relational data tables may be provided by calculating, by a plurality of concurrently executing execution threads, hash values for join columns of a first input table and a second input table; storing the calculated hash values in a set of disjoint thread-local hash maps for each of the first input table and the second input table; merging the set of thread-local hash maps of the first input table, by a second plurality of execution threads operating concurrently, to produce a set of merged hash maps; comparing each entry of the merged hash maps to each entry of the set of thread-local hash maps for the second input table to determine whether there is a match, according to a join type; and generating an output table including matches as determined by the comparing.
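
The three phases named in the abstract (build thread-local maps, merge the first table's maps, probe with the second table's maps) can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the function names, the `NUM_PARTS` constant, the round-robin chunking, and the restriction to an inner equi-join are all assumptions. Each worker here keeps a single local map rather than a partitioned set, and Python threads show the structure rather than a real speed-up.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

NUM_PARTS = 4  # hypothetical number of merged hash maps


def build_local_map(rows, key_col):
    """Phase 1: one worker hashes its chunk into a private map."""
    local = defaultdict(list)
    for row in rows:
        local[row[key_col]].append(row)
    return local


def merge_partition(part, local_maps):
    """Phase 2: one worker merges every key whose hash falls into `part`."""
    merged = defaultdict(list)
    for local in local_maps:
        for key, rows in local.items():
            if hash(key) % NUM_PARTS == part:
                merged[key].extend(rows)
    return merged


def probe(local_right, merged_maps):
    """Phase 3: match one right-side local map against the merged left maps."""
    out = []
    for key, right_rows in local_right.items():
        left_rows = merged_maps[hash(key) % NUM_PARTS].get(key, [])
        out.extend((l, r) for l in left_rows for r in right_rows)
    return out


def parallel_hash_join(left, right, left_key, right_key, workers=4):
    """Inner equi-join of two lists of tuples (assumed join type)."""
    left_chunks = [left[i::workers] for i in range(workers)]
    right_chunks = [right[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        left_locals = list(pool.map(lambda c: build_local_map(c, left_key), left_chunks))
        right_locals = list(pool.map(lambda c: build_local_map(c, right_key), right_chunks))
        merged = list(pool.map(lambda p: merge_partition(p, left_locals), range(NUM_PARTS)))
        parts = list(pool.map(lambda m: probe(m, merged), right_locals))
    return [pair for part in parts for pair in part]
```

Because every key hashes to exactly one merged map, the merge workers never write to the same map, which is why this sketch needs no locks in the merge phase.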

    HASH-JOIN IN PARALLEL COMPUTATION ENVIRONMENTS
    3.
    Patent application
    HASH-JOIN IN PARALLEL COMPUTATION ENVIRONMENTS (status: granted)

    Publication number: US20130138628A1

    Publication date: 2013-05-30

    Application number: US13742034

    Filing date: 2013-01-15

    IPC classification: G06F17/30

    Abstract: According to some embodiments, a system and method for a parallel join of relational data tables may be provided by calculating, by a plurality of concurrently executing execution threads, hash values for join columns of a first input table and a second input table; storing the calculated hash values in a set of disjoint thread-local hash maps for each of the first input table and the second input table; merging the set of thread-local hash maps of the first input table, by a second plurality of execution threads operating concurrently, to produce a set of merged hash maps; comparing each entry of the merged hash maps to each entry of the set of thread-local hash maps for the second input table to determine whether there is a match, according to a join type; and generating an output table including matches as determined by the comparing.

    AGGREGATION IN PARALLEL COMPUTATION ENVIRONMENTS WITH SHARED MEMORY
    4.
    Patent application
    AGGREGATION IN PARALLEL COMPUTATION ENVIRONMENTS WITH SHARED MEMORY (status: under examination, published)

    Publication number: US20120011144A1

    Publication date: 2012-01-12

    Application number: US12978194

    Filing date: 2010-12-23

    IPC classification: G06F17/30

    Abstract: According to some embodiments, a data structure may be provided by separating an input table into a plurality of partitions; generating, by each of a first plurality of execution threads operating concurrently, a local hash table for each of the threads, each local hash table storing key-index pairs; and merging the local hash tables, by a second plurality of execution threads operating concurrently, to produce a set of disjoint result hash tables. An overall result may be obtained from the result set of disjoint result hash tables. The data structure may be used in a parallel computing environment to determine an aggregation.
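
A minimal sketch of this two-phase aggregation, assuming a SUM aggregate over (key, value) rows. The abstract does not say what the index in each key-index pair refers to; here it is assumed to be a position in a per-thread array of partial sums. All names (`local_aggregate`, `merge_range`, `parallel_sum`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def local_aggregate(partition):
    """Phase 1: build one thread's local table of key -> index pairs.

    The index points into `sums`, a per-thread array of partial sums.
    """
    index_of, sums = {}, []
    for key, value in partition:
        idx = index_of.get(key)
        if idx is None:
            idx = index_of[key] = len(sums)
            sums.append(0)
        sums[idx] += value
    return index_of, sums


def merge_range(part, num_parts, local_tables):
    """Phase 2: merge only the keys whose hash falls into `part`."""
    result = {}
    for index_of, sums in local_tables:
        for key, idx in index_of.items():
            if hash(key) % num_parts == part:
                result[key] = result.get(key, 0) + sums[idx]
    return result


def parallel_sum(rows, workers=4, num_parts=4):
    """Partition the input, aggregate locally, then merge into disjoint tables."""
    partitions = [rows[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        local_tables = list(pool.map(local_aggregate, partitions))
        results = list(
            pool.map(lambda p: merge_range(p, num_parts, local_tables), range(num_parts))
        )
    overall = {}
    for table in results:
        overall.update(table)  # result tables are disjoint, so no key collides
    return overall
```

The overall result is simply the union of the result tables, since each key lands in exactly one of them.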

    Parallel set aggregation
    5.
    Granted patent
    Parallel set aggregation (status: granted)

    Publication number: US09009155B2

    Publication date: 2015-04-14

    Application number: US13651718

    Filing date: 2012-10-15

    IPC classification: G06F17/30

    Abstract: A system, method and medium may provide determination of a first plurality of a plurality of data records assigned to a first processing unit, identification of a first record of the first plurality of data records, the first record associated with a first key value, generation of a first dictionary entry of a first dictionary for the first key value, storage of a first identifier of the first record as a tail identifier and as a head identifier in the first dictionary entry, storage of an end flag in a first shared memory location, the first shared memory location associated with the first record, identification of a second record of the first plurality of data records, the second record associated with the first key value, replacement of the tail identifier in the first dictionary entry with a second identifier of the second record, and storage of the first identifier in a second shared memory location, the second shared memory location associated with the second record.
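
The head/tail bookkeeping described in the abstract amounts to a per-key linked list threaded through a shared array. The sketch below covers one processing unit. It assumes that every later record with the same key stores the previous tail identifier in its own slot (the abstract only spells this out for the second record), and the names `END`, `build_chains`, and `records_for` are hypothetical.

```python
END = -1  # assumed encoding of the end flag


def build_chains(keys, record_ids, links):
    """Chain the records of one processing unit by key.

    keys       -- key value of every record, indexed by record identifier
    record_ids -- identifiers of the records assigned to this unit
    links      -- shared array with one slot per record; a unit writes only
                  the slots of its own records
    """
    dictionary = {}
    for rid in record_ids:
        entry = dictionary.get(keys[rid])
        if entry is None:
            # First record for this key: it is both head and tail.
            dictionary[keys[rid]] = {"head": rid, "tail": rid}
            links[rid] = END
        else:
            # Later record: point back at the old tail, then become the tail.
            links[rid] = entry["tail"]
            entry["tail"] = rid
    return dictionary


def records_for(key, dictionary, links):
    """Walk one chain from the tail back to the end flag."""
    rid = dictionary[key]["tail"]
    out = []
    while rid != END:
        out.append(rid)
        rid = links[rid]
    return out
```

For example, with keys `["x", "y", "x", "x", "y"]` and record identifiers 0 to 4, the chain for `"x"` runs 3, 2, 0: the tail is record 3, the head is record 0, and slot 0 holds the end flag.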

    PARALLEL SET AGGREGATION
    6.
    Patent application
    PARALLEL SET AGGREGATION (status: granted)

    Publication number: US20130290327A1

    Publication date: 2013-10-31

    Application number: US13651718

    Filing date: 2012-10-15

    IPC classification: G06F17/30

    Abstract: A system, method and medium may provide determination of a first plurality of a plurality of data records assigned to a first processing unit, identification of a first record of the first plurality of data records, the first record associated with a first key value, generation of a first dictionary entry of a first dictionary for the first key value, storage of a first identifier of the first record as a tail identifier and as a head identifier in the first dictionary entry, storage of an end flag in a first shared memory location, the first shared memory location associated with the first record, identification of a second record of the first plurality of data records, the second record associated with the first key value, replacement of the tail identifier in the first dictionary entry with a second identifier of the second record, and storage of the first identifier in a second shared memory location, the second shared memory location associated with the second record.

    Lock-free generation of columns with minimal dictionaries after parallel aggregation
    7.
    Granted patent
    Lock-free generation of columns with minimal dictionaries after parallel aggregation (status: granted)

    Publication number: US09569497B2

    Publication date: 2017-02-14

    Application number: US14301272

    Filing date: 2014-06-10

    IPC classification: G06F17/30

    CPC classification: G06F17/30489

    Abstract: A new dictionary can be created for a result column in a query plan operation executed on a database. The result column can be generated by multiple worker jobs running in parallel to read tasks from a shared queue as part of a query plan operation that includes a group-by column within an input set of input columns. The group-by column can include an original dictionary for all values contained within the group-by column. If the new dictionary has fewer entries than the original dictionary for the group-by column such that mapping is required between old value identifiers within the group-by column and new value identifiers within the result column, the old value identifiers are renamed to the new value identifiers using a mapping vector.
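
The renaming step at the end of the abstract can be illustrated with a small sequential sketch. It assumes a dictionary-encoded column in which value identifiers are positions in a list of values, and it uses -1 to mark old identifiers that no longer occur. The abstract does not describe how the worker jobs avoid locks, so that part is not modelled, and the function name `minimize_dictionary` is hypothetical.

```python
def minimize_dictionary(original_dict, result_ids):
    """Shrink a dictionary to the values that actually occur in the result.

    original_dict -- list of values; a value identifier is its position
    result_ids    -- old value identifiers found in the result column
    Returns (new_dict, mapping, renamed_ids).
    """
    # Mark which old identifiers survive the aggregation.
    used = [False] * len(original_dict)
    for old_id in result_ids:
        used[old_id] = True

    # Mapping vector: old identifier -> new identifier (-1 if unused).
    mapping = [-1] * len(original_dict)
    new_dict = []
    for old_id, is_used in enumerate(used):
        if is_used:
            mapping[old_id] = len(new_dict)
            new_dict.append(original_dict[old_id])

    # Rename the old identifiers in the result column.
    renamed_ids = [mapping[old_id] for old_id in result_ids]
    return new_dict, mapping, renamed_ids
```

With an original dictionary `["apple", "banana", "cherry", "date"]` and a result column holding old identifiers `[3, 1, 3]`, the new dictionary keeps only `"banana"` and `"date"`, and the renamed column decodes to the same values as before.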

    LOCK-FREE GENERATION OF COLUMNS WITH MINIMAL DICTIONARIES AFTER PARALLEL AGGREGATION
    8.
    Patent application
    LOCK-FREE GENERATION OF COLUMNS WITH MINIMAL DICTIONARIES AFTER PARALLEL AGGREGATION (status: granted)

    Publication number: US20150149401A1

    Publication date: 2015-05-28

    Application number: US14301272

    Filing date: 2014-06-10

    IPC classification: G06F17/30

    CPC classification: G06F17/30489

    Abstract: A new dictionary can be created for a result column in a query plan operation executed on a database. The result column can be generated by multiple worker jobs running in parallel to read tasks from a shared queue as part of a query plan operation that includes a group-by column within an input set of input columns. The group-by column can include an original dictionary for all values contained within the group-by column. If the new dictionary has fewer entries than the original dictionary for the group-by column such that mapping is required between old value identifiers within the group-by column and new value identifiers within the result column, the old value identifiers are renamed to the new value identifiers using a mapping vector.
