-
Publication number: US11016778B2
Publication date: 2021-05-25
Application number: US16299483
Application date: 2019-03-12
Applicant: Oracle International Corporation
Inventor: Benjamin Schlegel , Pit Fender , Harshad Kasture , Matthias Brantner , Hassan Chafi
Abstract: Techniques are provided for vectorizing Heapsort. A K-heap is used as the underlying data structure for indexing values being sorted. The K-heap is vectorized by storing values in a contiguous memory array containing a beginning-most side and end-most side. The vectorized Heapsort utilizes horizontal aggregation SIMD instructions for comparisons, shuffling, and moving data. Thus, the number of comparisons required to find the maximum or minimum key value within a single node of the K-heap is reduced, resulting in faster retrieval operations.
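The idea in this abstract can be illustrated with a minimal scalar sketch, assuming a max-oriented K-ary heap stored in one Python list. The names `horizontal_max`, `sift_down`, and `k_heapsort` are hypothetical and do not come from the patent; `horizontal_max` is an ordinary loop that only stands in for the single-step horizontal aggregation a SIMD instruction would perform over the children of one node.

```python
def horizontal_max(values, start, stop):
    """Return the index of the largest item in values[start:stop].

    Scalar stand-in (assumption) for a horizontal aggregation SIMD
    instruction that would inspect all children of a node at once.
    """
    best = start
    for i in range(start + 1, stop):
        if values[i] > values[best]:
            best = i
    return best


def sift_down(values, root, size, k):
    """Restore the max-heap property below `root` within values[:size]."""
    while True:
        first_child = k * root + 1
        if first_child >= size:
            return
        largest = horizontal_max(values, first_child, min(first_child + k, size))
        if values[largest] <= values[root]:
            return
        values[root], values[largest] = values[largest], values[root]
        root = largest


def k_heapsort(values, k=8):
    """Sort `values` in place in ascending order using a K-ary max-heap."""
    n = len(values)
    # Build the heap bottom-up, starting at the last node that has a child.
    for node in range((n - 2) // k, -1, -1):
        sift_down(values, node, n, k)
    # Repeatedly move the current maximum to the end of the unsorted region.
    for end in range(n - 1, 0, -1):
        values[0], values[end] = values[end], values[0]
        sift_down(values, 0, end, k)
    return values
```

With K children per node, one aggregation per level replaces K - 1 pairwise comparisons, which is the saving the abstract refers to.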
-
Publication number: US20200348933A1
Publication date: 2020-11-05
Application number: US16399226
Application date: 2019-04-30
Applicant: Oracle International Corporation
Inventor: Benjamin Schlegel , Harshad Kasture , Pit Fender , Matthias Brantner , Hassan Chafi
Abstract: Techniques are provided for obtaining generic vectorized d-heaps for any data type for which horizontal aggregation SIMD instructions are not available, including primitive as well as complex data types. A generic vectorized d-heap comprises a prefix heap and a plurality of suffix heaps. Each suffix heap of the plurality of suffix heaps comprises a d-heap. A plurality of key values stored in the heap are split into key prefix values and key suffix values. Key prefix values are stored in the prefix heap and key suffix values are stored in the plurality of suffix heaps. Each entry in the prefix heap includes a key prefix value of the plurality of key values and a reference to the suffix heap of the plurality of suffix heaps that includes all key suffix values of the plurality of key values that share the respective key prefix value.
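The prefix/suffix split described in this abstract can be sketched as follows, under several assumptions: keys are non-negative integers, the suffix is the low 16 bits, and Python's binary `heapq` min-heaps are swapped in for the vectorized d-heaps. The class name `PrefixSuffixHeap` and the dictionary used to locate a suffix heap are hypothetical illustration choices, not the patent's layout.

```python
import heapq


class PrefixSuffixHeap:
    """Min-heap of integer keys split into a prefix heap and suffix heaps."""

    def __init__(self, suffix_bits=16):
        self.suffix_bits = suffix_bits
        self.suffix_mask = (1 << suffix_bits) - 1
        self.prefix_heap = []    # distinct key prefixes, smallest on top
        self.suffix_heaps = {}   # prefix -> heap of suffixes sharing that prefix

    def push(self, key):
        prefix = key >> self.suffix_bits
        suffix = key & self.suffix_mask
        heap = self.suffix_heaps.get(prefix)
        if heap is None:
            # First key with this prefix: create its suffix heap and
            # register the prefix in the prefix heap.
            heap = self.suffix_heaps[prefix] = []
            heapq.heappush(self.prefix_heap, prefix)
        heapq.heappush(heap, suffix)

    def pop(self):
        # The smallest key has the smallest prefix and, within that
        # prefix, the smallest suffix.
        prefix = self.prefix_heap[0]
        heap = self.suffix_heaps[prefix]
        suffix = heapq.heappop(heap)
        if not heap:
            # Last key with this prefix: retire the prefix entry.
            heapq.heappop(self.prefix_heap)
            del self.suffix_heaps[prefix]
        return (prefix << self.suffix_bits) | suffix

    def __len__(self):
        return sum(len(h) for h in self.suffix_heaps.values())
```

Because each heap only compares short prefixes or short suffixes, the comparisons fit narrower lanes than the full key would, which is the motivation the abstract gives for data types that lack a direct horizontal aggregation instruction.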
-
Publication number: US10684873B2
Publication date: 2020-06-16
Application number: US16006668
Application date: 2018-06-12
Applicant: Oracle International Corporation
Inventor: Bastian Hossbach , Jürgen Christ , Laurent Daynes , Matthias Brantner , Hassan Chafi , Christian Humer
IPC: G06F9/455 , G06F16/245 , G06F16/38
Abstract: Computer-implemented techniques described herein provide efficient data decoding using runtime specialization. In an embodiment, a method comprises a virtual machine executing a body of code of a dynamically typed language, wherein executing the body of code includes: querying a relational database, and in response to the query, receiving table metadata indicating data types of one or more columns of a first table in the relational database. In response to receiving the table metadata: for a first column of the one or more columns, generating decoding machine code to decode the first column based on the data type of the first column, and executing the decoding machine code to decode the first column of the one or more columns.
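A rough illustration of metadata-driven specialization, assuming a made-up wire format and made-up type names (`INT32`, `FLOAT64`, `VARCHAR`): one decoder is chosen per column when the table metadata arrives, and the per-row loop then runs without any type checks. Python closures are swapped in here for the generated machine code the abstract describes; `make_decoder` and `decode_rows` are hypothetical names.

```python
import struct


def make_decoder(column_type):
    """Build a decoder specialized for one column type (assumed encodings)."""
    if column_type == "INT32":
        unpack = struct.Struct("<i").unpack   # 4-byte little-endian integer
        return lambda raw: unpack(raw)[0]
    if column_type == "FLOAT64":
        unpack = struct.Struct("<d").unpack   # 8-byte little-endian double
        return lambda raw: unpack(raw)[0]
    if column_type == "VARCHAR":
        return lambda raw: raw.decode("utf-8")
    raise ValueError(f"unsupported column type: {column_type}")


def decode_rows(metadata, rows):
    """Decode rows of raw bytes using decoders specialized once per column.

    `metadata` is a list of (column_name, column_type) pairs, standing in
    for the table metadata returned with a query result.
    """
    decoders = [make_decoder(column_type) for _, column_type in metadata]
    return [
        tuple(decode(cell) for decode, cell in zip(decoders, row))
        for row in rows
    ]
```

The type dispatch happens once per column rather than once per value, which is the general benefit of specializing at runtime.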
-
Publication number: US11816102B2
Publication date: 2023-11-14
Application number: US16991888
Application date: 2020-08-12
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Alberto Parravicini , Jinha Kim , Sungpack Hong , Matthias Brantner , Hassan Chafi
IPC: G06F16/2452 , G06F16/22 , G06F16/248 , G06F16/28 , G06N5/025 , G06N5/048
CPC classification number: G06F16/24522 , G06F16/2246 , G06F16/248 , G06F16/288 , G06N5/025 , G06N5/048
Abstract: Techniques described herein allow for accurate translation of natural language (NL) queries to declarative language. A syntactic dependency parsing tree is generated for an NL query, which is used to map tokens in the query to logical data model concepts. Relationship-type mappings are completed based on relationship constraints. Final mappings are identified for any relationship tokens that are associated with multiple candidate mappings by identifying which candidate mappings have the lowest cost metrics. An NL query-specific query graph is generated based on the mapping data for the NL query and the logical data model. The query graph represents an NL query-specific version of the logical data model where grammatical dependencies between NL query words are translated to the query graph. A query graph is annotated with information, from the mapping data, that is not represented by paths in the query graph. The query graph is used to generate a computer-executable translation of the NL query.
-
Publication number: US11379232B2
Publication date: 2022-07-05
Application number: US16399226
Application date: 2019-04-30
Applicant: Oracle International Corporation
Inventor: Benjamin Schlegel , Harshad Kasture , Pit Fender , Matthias Brantner , Hassan Chafi
IPC: G06F9/30 , G06F9/38 , G06F16/901
Abstract: Techniques are provided for obtaining generic vectorized d-heaps for any data type for which horizontal aggregation SIMD instructions are not available, including primitive as well as complex data types. A generic vectorized d-heap comprises a prefix heap and a plurality of suffix heaps. Each suffix heap of the plurality of suffix heaps comprises a d-heap. A plurality of key values stored in the heap are split into key prefix values and key suffix values. Key prefix values are stored in the prefix heap and key suffix values are stored in the plurality of suffix heaps. Each entry in the prefix heap includes a key prefix value of the plurality of key values and a reference to the suffix heap of the plurality of suffix heaps that includes all key suffix values of the plurality of key values that share the respective key prefix value.
-
Publication number: US20210390089A1
Publication date: 2021-12-16
Application number: US17459447
Application date: 2021-08-27
Applicant: Oracle International Corporation
Inventor: Pit Fender , Felix Schmidt , Benjamin Schlegel , Matthias Brantner , Nipun Agarwal
Abstract: Techniques related to code dictionary generation based on non-blocking operations are disclosed. In some embodiments, a column of tokens includes a first token and a second token that are stored in separate rows. The column of tokens is correlated with a set of row identifiers including a first row identifier and a second row identifier that is different from the first row identifier. Correlating the column of tokens with the set of row identifiers involves: storing a correlation between the first token and the first row identifier, storing a correlation between the second token and the second row identifier if the first token and the second token have different values, and storing a correlation between the second token and the first row identifier if the first token and the second token have identical values. After correlating the column of tokens with the set of row identifiers, duplicate correlations are removed.
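The correlation rule in this abstract can be sketched in a few lines, assuming row identifiers are simply the positions of the tokens in the column. The function name `build_code_dictionary` is hypothetical, and this sequential version does not model the non-blocking execution the abstract mentions; it only shows how identical tokens end up sharing the first row identifier before duplicates are removed.

```python
def build_code_dictionary(tokens):
    """Correlate each token with a row identifier, then drop duplicates.

    A token seen for the first time is correlated with its own row
    identifier; a repeated token is correlated with the row identifier
    of its first occurrence.
    """
    first_row = {}
    correlations = []
    for row_id, token in enumerate(tokens):
        # setdefault keeps the row identifier of the first occurrence.
        correlations.append((token, first_row.setdefault(token, row_id)))
    # Remove duplicate correlations while preserving first-seen order.
    return list(dict.fromkeys(correlations))
```

For the column `["a", "b", "a", "c", "b"]` this yields one entry per distinct token: `("a", 0)`, `("b", 1)`, and `("c", 3)`.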
-
Publication number: US11200234B2
Publication date: 2021-12-14
Application number: US16441989
Application date: 2019-06-14
Applicant: Oracle International Corporation
Inventor: Pit Fender , Benjamin Schlegel , Matthias Brantner
IPC: G06F16/2453 , G06F16/21 , G06F16/28
Abstract: Approaches herein transparently delegate data access from a relational database management system (RDBMS) onto an offload engine (OE). The RDBMS receives a database statement referencing a user defined function (UDF). In an execution plan, the RDBMS replaces the UDF reference with an invocation of a relational operator in the OE. Execution invokes the relational operator in the OE to obtain a result based on data in the OE. Thus, the UDF is bound to the OE, and almost all of the RDBMS avoids specially handling the UDF. The UDF may be a table function that offloads a relational table for processing. User defined objects such as functions and types provide metadata about the table. Multiple tables can be offloaded and processed together, such that some or all offloaded tables are not materialized in the RDBMS. Offloaded tables may participate in standard relational algebra such as in a database statement.
-
Publication number: US11169804B2
Publication date: 2021-11-09
Application number: US16139226
Application date: 2018-09-24
Applicant: Oracle International Corporation
Inventor: Benjamin Schlegel , Harshad Kasture , Pit Fender , Matthias Brantner , Hassan Chafi
Abstract: Techniques are provided for maintaining the d-heap property and speeding up retrieval operations, such as top or pop, by vectorizing the d-heap and utilizing horizontal aggregation SIMD instructions across the retrieval operations. A d-heap is vectorized by storing it in a contiguous memory array containing a beginning-most side and end-most side. Horizontal aggregation SIMD instructions are utilized to aggregate the values of the vectorized d-heap. Thus, the number of comparisons required to find the maximum or minimum key value within a single node of the d-heap is reduced, resulting in faster retrieval operations.
-
Publication number: US20210318886A1
Publication date: 2021-10-14
Application number: US16846773
Application date: 2020-04-13
Applicant: Oracle International Corporation
Inventor: Benjamin Schlegel , Pit Fender , Matthias Brantner , Hassan Chafi
Abstract: Vectorized sorted-set intersection is performed using conflict-detection single instruction, multiple data (SIMD) instructions. A first ordered subset of values of a first ordered set of distinct values and a second ordered subset of values of a second ordered set of distinct values are loaded into a register. A first value in the register that matches another value in the register (i.e., a common value) is identified by performing a SIMD instruction. The first value is then stored in a result set representing a merge-sort result set between the first ordered set of distinct values and the second ordered set of distinct values.
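A scalar sketch of the approach, assuming both inputs are ascending lists of distinct values and that a "register" can be modeled as a short Python list. `conflict_detect` is a hypothetical stand-in for a conflict-detection SIMD instruction, and the chunk-advance rule in `intersect_sorted` is a common block-wise merge strategy chosen for illustration, not necessarily the one claimed in the patent.

```python
def conflict_detect(register):
    """For each lane, report whether an earlier lane holds the same value.

    Scalar emulation (assumption) of a conflict-detection SIMD instruction.
    """
    return [value in register[:lane] for lane, value in enumerate(register)]


def intersect_sorted(a, b, width=4):
    """Intersect two ascending lists of distinct values, chunk by chunk."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        chunk_a = a[i:i + width]
        chunk_b = b[j:j + width]
        # Load one chunk from each input into the same "register".
        register = chunk_a + chunk_b
        flags = conflict_detect(register)
        # Values are distinct within each input, so any conflict must be
        # a value that occurs in both chunks.
        result.extend(v for v, hit in zip(register, flags) if hit)
        # Advance the chunk whose largest value is smaller (both on a tie).
        last_a, last_b = chunk_a[-1], chunk_b[-1]
        if last_a <= last_b:
            i += len(chunk_a)
        if last_b <= last_a:
            j += len(chunk_b)
    return result
```

Because each input holds distinct values, a duplicate inside the combined register can only come from one value appearing in both inputs, which is why a single conflict check per register is enough to find the common values.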
-
Publication number: US20210294603A1
Publication date: 2021-09-23
Application number: US16822009
Application date: 2020-03-18
Applicant: Oracle International Corporation
Inventor: Harshad Kasture , Matthias Brantner , Hassan Chafi , Benjamin Schlegel , Pit Fender
Abstract: Techniques are provided for lazy push optimization, allowing for constant time push operations. A d-heap is used as the underlying data structure for indexing values being inserted. The d-heap is vectorized by storing values in a contiguous memory array. Heapify operations are delayed until a retrieve operation occurs, improving insert performance of vectorized d-heaps that use horizontal aggregation SIMD instructions at the cost of slightly lower retrieve performance.
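The deferral described in this abstract can be sketched as a d-ary min-heap whose `push` only appends to the array, leaving the heap property to be restored on the next retrieval. The class name `LazyDHeap`, the `heap_size` bookkeeping, and the sift-up based restore step are assumptions for illustration; the child scan in `_sift_down` is plain Python where a vectorized version would use a horizontal aggregation instruction.

```python
class LazyDHeap:
    """d-ary min-heap that defers heapify work from push to pop."""

    def __init__(self, d=4):
        self.d = d
        self.values = []
        self.heap_size = 0   # values[:heap_size] satisfies the heap property

    def push(self, value):
        # Constant-time insert: append only, no sift-up yet.
        self.values.append(value)

    def _sift_up(self, index):
        while index > 0:
            parent = (index - 1) // self.d
            if self.values[parent] <= self.values[index]:
                break
            self.values[parent], self.values[index] = (
                self.values[index], self.values[parent])
            index = parent

    def _restore(self):
        # Fold every pending (lazily pushed) value into the heap.
        while self.heap_size < len(self.values):
            self._sift_up(self.heap_size)
            self.heap_size += 1

    def _sift_down(self, index):
        size = len(self.values)
        while True:
            first = self.d * index + 1
            if first >= size:
                return
            children = range(first, min(first + self.d, size))
            smallest = min(children, key=self.values.__getitem__)
            if self.values[index] <= self.values[smallest]:
                return
            self.values[index], self.values[smallest] = (
                self.values[smallest], self.values[index])
            index = smallest

    def pop(self):
        self._restore()
        top = self.values[0]
        last = self.values.pop()
        if self.values:
            self.values[0] = last
            self._sift_down(0)
        self.heap_size = len(self.values)
        return top
```

The trade-off matches the abstract: inserts become cheap, while the first retrieval after a burst of pushes pays for the postponed heapify work.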