-
Publication number: US20180336230A1
Publication date: 2018-11-22
Application number: US15596954
Application date: 2017-05-16
Applicant: SAP SE
Inventor: Frederik Transier, Kai Stammerjohann, Nico Bohnsack
IPC: G06F17/30
Abstract: In one respect, there is provided a method. The method can include processing a first data chunk to generate a first intermediate result. A key map can be generated based on a determination that a quantity of the key-value pairs in the first intermediate result exceeds a threshold. The key map can be generated to include keys in the first intermediate result. A second data chunk can be processed to generate a second intermediate result. The second data chunk can be processed based on the key map. The processing of the second data chunk can include omitting a key-value pair in the second data chunk from being inserted into the second intermediate result based on a key associated with the key-value pair being absent from the key map. A preview of the processing of the dataset can be generated based on the first intermediate result and the second intermediate result.
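The chunk-wise flow in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `process_chunks` and `preview`, the use of summation as the aggregation, and the representation of the key map as a `set` are all assumptions made for illustration.

```python
def process_chunks(chunks, threshold):
    """Process (key, value) chunks one at a time.

    Assumption: the "processing" is a per-key sum. Once an intermediate
    result holds more than `threshold` keys, its keys are frozen into a
    key map, and later chunks skip any key that is not in that map.
    """
    key_map = None   # hypothetical key map: a set of admitted keys
    results = []     # one intermediate result (dict) per chunk
    for chunk in chunks:
        partial = {}
        for key, value in chunk:
            if key_map is not None and key not in key_map:
                continue  # omit pairs whose key is absent from the key map
            partial[key] = partial.get(key, 0) + value
        if key_map is None and len(partial) > threshold:
            key_map = set(partial)  # generate the key map from this result
        results.append(partial)
    return results, key_map


def preview(results):
    """Combine the intermediate results into a single preview dict."""
    merged = {}
    for partial in results:
        for key, value in partial.items():
            merged[key] = merged.get(key, 0) + value
    return merged


chunks = [[("a", 1), ("b", 2), ("c", 3)], [("a", 10), ("d", 5)]]
results, key_map = process_chunks(chunks, threshold=2)
print(preview(results))
```

In this example the first chunk yields three keys, which exceeds the threshold of two, so the pair with key `"d"` in the second chunk is dropped and the preview stays bounded by the keys already seen.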
-
Publication number: US09177025B2
Publication date: 2015-11-03
Application number: US13742034
Application date: 2013-01-15
Applicant: SAP SE
Inventor: Christian Bensberg, Christian Mathis, Frederik Transier, Nico Bohnsack, Kai Stammerjohann
CPC classification number: G06F17/30466, G06F17/3033, G06F17/30445
Abstract: According to some embodiments, a system and method for a parallel join of relational data tables may be provided by calculating, by a plurality of concurrently executing execution threads, hash values for join columns of a first input table and a second input table; storing the calculated hash values in a set of disjoint thread-local hash maps for each of the first input table and the second input table; merging the set of thread-local hash maps of the first input table, by a second plurality of execution threads operating concurrently, to produce a set of merged hash maps; comparing each entry of the merged hash maps to each entry of the set of thread-local hash maps for the second input table to determine whether there is a match, according to a join type; and generating an output table including matches as determined by the comparing.
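The build, merge, and probe phases described in this abstract can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the patented method: it handles only an inner join, represents rows as tuples, partitions by `hash(key) % workers`, and uses hypothetical helper names (`build_local_maps`, `merge_partition`, `probe_partition`, `parallel_hash_join`). Python threads here show the structure of the algorithm rather than real CPU parallelism.

```python
from concurrent.futures import ThreadPoolExecutor


def build_local_maps(rows, key_index, num_parts):
    """One worker: hash its rows into num_parts thread-local hash maps."""
    maps = [dict() for _ in range(num_parts)]
    for row in rows:
        key = row[key_index]
        maps[hash(key) % num_parts].setdefault(key, []).append(row)
    return maps


def merge_partition(local_maps, part):
    """Merge partition `part` of every worker's local maps into one map."""
    merged = {}
    for maps in local_maps:
        for key, rows in maps[part].items():
            merged.setdefault(key, []).extend(rows)
    return merged


def probe_partition(merged, right_local_maps, part):
    """Compare a merged map with the second table's local maps (inner join)."""
    out = []
    for maps in right_local_maps:
        for key, right_rows in maps[part].items():
            for left_row in merged.get(key, []):
                for right_row in right_rows:
                    out.append(left_row + right_row)
    return out


def parallel_hash_join(left, right, left_key, right_key, workers=4):
    def slices(rows):
        return [rows[i::workers] for i in range(workers)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Phase 1: each worker fills its own hash maps for each input table.
        left_local = list(pool.map(
            lambda s: build_local_maps(s, left_key, workers), slices(left)))
        right_local = list(pool.map(
            lambda s: build_local_maps(s, right_key, workers), slices(right)))
        # Phase 2: merge the first table's local maps, one partition per task.
        merged = list(pool.map(
            lambda p: merge_partition(left_local, p), range(workers)))
        # Phase 3: probe each merged map with the second table's local maps.
        parts = list(pool.map(
            lambda p: probe_partition(merged[p], right_local, p),
            range(workers)))
    return [row for part in parts for row in part]


left = [(1, "a"), (2, "b"), (3, "c")]
right = [(2, "x"), (3, "y"), (3, "z"), (4, "w")]
print(sorted(parallel_hash_join(left, right, 0, 0)))
```

Because both tables are partitioned with the same hash function, each merged map only needs to be compared against the matching partition of the second table, so the merge and probe tasks never touch the same map concurrently.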
-
Publication number: US11556532B2
Publication date: 2023-01-17
Application number: US16366176
Application date: 2019-03-27
Applicant: SAP SE
Inventor: Nico Bohnsack, Dennis Felsing, Arnaud Lacurie, Wolfgang Stephan
IPC: G06F16/2453, G06F16/22, G06F16/23
Abstract: A method may include inserting, into a hash trie, data records from a database table. The inserting may include traversing the hash trie to identify, for each data record included in the database table, a corresponding node at which to insert the data record. The hash trie may be traversed based on a hash of a key value associated with each data record. The node at which to insert a data record may be identified based on an offset forming a binary representation of the hash of a key value associated with that data record. The offset may include a portion of a plurality of binary digits forming the binary representation. A data record may be inserted at a corresponding node by updating a data structure included at the node. A database operation may be performed based on the hash trie filled with the data records from the database table.
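A minimal Python sketch of hash-trie insertion along the lines of this abstract is shown below. The details are assumptions chosen for brevity and are not taken from the patent: a CRC32 hash from the standard `zlib` module, 4 binary digits consumed per level (`BITS`), a fixed depth of 2 levels (`DEPTH`), and a per-node `dict` as the data structure that holds the records.

```python
import zlib

BITS = 4             # assumed number of hash bits consumed per trie level
FANOUT = 1 << BITS   # child slots per node
DEPTH = 2            # assumed fixed number of levels below the root


class Node:
    def __init__(self):
        self.children = [None] * FANOUT
        self.records = {}   # data structure at the node: key -> list of records


def hash32(key):
    """Hash a key value to an unsigned 32-bit integer."""
    return zlib.crc32(repr(key).encode("utf-8"))


def offsets(h):
    """Split the hash into one BITS-wide offset per level."""
    return [(h >> (level * BITS)) & (FANOUT - 1) for level in range(DEPTH)]


def find_node(root, key, create):
    """Walk the trie along the offsets of the key's hash."""
    node = root
    for offset in offsets(hash32(key)):
        child = node.children[offset]
        if child is None:
            if not create:
                return None
            child = node.children[offset] = Node()
        node = child
    return node


def insert(root, key, record):
    """Insert a record by updating the data structure at its node."""
    find_node(root, key, True).records.setdefault(key, []).append(record)


def lookup(root, key):
    node = find_node(root, key, False)
    return [] if node is None else node.records.get(key, [])


root = Node()
for row in [("k1", 1), ("k2", 2), ("k1", 3)]:
    insert(root, row[0], row)
print(lookup(root, "k1"))
```

Once the trie is filled, a database operation such as a join probe or a grouping step could call `lookup` per key; records with equal keys always land at the same node because the offsets depend only on the hash.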
-
Publication number: US20170147393A1
Publication date: 2017-05-25
Application number: US14947689
Application date: 2015-11-20
Applicant: SAP SE
Inventor: Kai Stammerjohann, Nico Bohnsack, Frederik Transier
Abstract: A system provides determination of a first plurality of the plurality of data records assigned to a first processing unit, identification of a first record of the first plurality of data records, the first record associated with a first key value, determination of a first partition based on the first key value, allocation of a first memory block associated with the first partition, the first memory block comprising a first two or more memory locations, generation of a mapping between the first record and a first one of the first two or more memory locations, identification of a second record of the first plurality of data records, the second record associated with a second key value, determination of the first partition based on the second key value, and generation of a mapping between the second record and a second one of the first two or more memory locations.
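The record-to-location mapping in this abstract can be sketched in Python for a single processing unit. The sketch rests on several assumptions that the abstract does not state: partitions are chosen with `hash(key) % num_partitions`, a memory block is modeled as a fixed-size list of `BLOCK_SIZE` slots, a new block is allocated when the current one is full, and the names `map_records`, `blocks`, and `mapping` are hypothetical.

```python
BLOCK_SIZE = 4   # assumed number of memory locations per block


def map_records(records, num_partitions, key_of=lambda rec: rec[0]):
    """Map each record to a slot in a memory block of its partition.

    Returns (blocks, mapping), where blocks[partition] is a list of
    fixed-size blocks holding record indexes, and mapping holds one
    (record_index, partition, block_index, slot) tuple per record.
    """
    blocks = {}    # partition -> list of blocks
    fill = {}      # partition -> used slots in that partition's newest block
    mapping = []
    for index, rec in enumerate(records):
        part = hash(key_of(rec)) % num_partitions
        if part not in blocks or fill[part] == BLOCK_SIZE:
            # Allocate a block of several locations at once for the partition.
            blocks.setdefault(part, []).append([None] * BLOCK_SIZE)
            fill[part] = 0
        slot = fill[part]
        blocks[part][-1][slot] = index
        mapping.append((index, part, len(blocks[part]) - 1, slot))
        fill[part] = slot + 1
    return blocks, mapping


records = [(0, "a"), (2, "b"), (1, "c")]
blocks, mapping = map_records(records, num_partitions=2)
print(mapping)
```

In the example, the keys 0 and 2 fall into the same partition, so the second record is mapped to the second location of the block that was allocated for the first record, and no further allocation is needed.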
-
Publication number: US20200311075A1
Publication date: 2020-10-01
Application number: US16366176
Application date: 2019-03-27
Applicant: SAP SE
Inventor: Nico Bohnsack, Dennis Felsing, Arnaud Lacurie, Wolfgang Stephan
IPC: G06F16/2453, G06F16/22, G06F16/23
Abstract: A method may include inserting, into a hash trie, data records from a database table. The inserting may include traversing the hash trie to identify, for each data record included in the database table, a corresponding node at which to insert the data record. The hash trie may be traversed based on a hash of a key value associated with each data record. The node at which to insert a data record may be identified based on an offset forming a binary representation of the hash of a key value associated with that data record. The offset may include a portion of a plurality of binary digits forming the binary representation. A data record may be inserted at a corresponding node by updating a data structure included at the node. A database operation may be performed based on the hash trie filled with the data records from the database table.