-
Publication No.: US11704317B2
Publication date: 2023-07-18
Application No.: US16797507
Filing date: 2020-02-21
Applicant: Oracle International Corporation
Inventor: Pit Fender , Benjamin Schlegel , Nipun Agarwal
IPC: G06F16/00 , G06F16/2453 , G06F12/12 , G06F16/2455
CPC classification number: G06F16/24542 , G06F12/12 , G06F16/24552 , G06F16/24556
Abstract: A partial group by operator is a group by operator that implements a fallback mechanism. The fallback mechanism is triggered whenever memory pressure reaches a certain threshold. When the fallback mechanism is triggered, a row is included in an output of the partial group by operator without including an aggregation value for a grouping value for the row to an aggregation data structure. A final group by operator computes a final aggregate value of all results, including pre-grouped results and passed through results, from the partial group by operator.
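The fallback described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: memory pressure is modelled here as a simple cap on the number of groups (`max_groups`), the aggregate is assumed to be a sum, and the function names `partial_group_by` and `final_group_by` are hypothetical.

```python
def partial_group_by(rows, max_groups):
    """Pre-aggregate (key, value) rows, passing rows through under pressure.

    Assumption: "memory pressure" is approximated by a cap on the number
    of groups held in the aggregation table.
    """
    partial_sums = {}     # aggregation data structure: key -> running sum
    passed_through = []   # rows emitted without being aggregated
    for key, value in rows:
        if key in partial_sums:
            partial_sums[key] += value
        elif len(partial_sums) < max_groups:
            partial_sums[key] = value
        else:
            # Fallback: the table is "full", so emit the row as-is
            # instead of adding a new group to the table.
            passed_through.append((key, value))
    return list(partial_sums.items()) + passed_through


def final_group_by(partial_output):
    """Combine pre-grouped and passed-through rows into final sums."""
    totals = {}
    for key, value in partial_output:
        totals[key] = totals.get(key, 0) + value
    return totals


rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("c", 5), ("b", 6)]
partial = partial_group_by(rows, max_groups=2)
totals = final_group_by(partial)
```

With `max_groups=2`, the keys `a` and `b` are pre-aggregated while both `c` rows pass through unchanged; the final operator still produces the same totals as an unconstrained group by.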
-
Publication No.: US20220284005A1
Publication date: 2022-09-08
Application No.: US17752766
Filing date: 2022-05-24
Applicant: Oracle International Corporation
Inventor: Pit Fender , Felix Schmidt , Benjamin Schlegel
Abstract: Unsorted sparse dictionary encodings are transformed into unsorted-dense or sorted-dense dictionary encodings. Sparse domain codes have large gaps between codes that are adjacent in order. Unlike sparse codes, dense codes have smaller gaps between adjacent codes; consecutive codes are dense codes that have no gaps between adjacent codes. The techniques described herein are relational approaches that may be used to generate sparse composite codes and sorted codes.
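To make the difference between sparse, unsorted-dense, and sorted-dense codes concrete, here is a small sketch in plain Python. It does not use the relational operators the abstract refers to; the dictionary layout (a list of `(token, sparse_code)` pairs) and the function name `densify` are assumptions made for illustration.

```python
def densify(dictionary, sort_by_token=False):
    """Map sparse codes to consecutive dense codes 0, 1, 2, ...

    dictionary: list of (token, sparse_code) pairs with unique tokens.
    With sort_by_token=True the dense codes follow token order
    (sorted-dense); otherwise they follow dictionary order
    (unsorted-dense).
    """
    entries = sorted(dictionary) if sort_by_token else list(dictionary)
    return {sparse: dense for dense, (_token, sparse) in enumerate(entries)}


# An unsorted sparse dictionary: large gaps, codes unrelated to token order.
sparse_dict = [("pear", 9000), ("apple", 17), ("fig", 420)]
encoded_column = [17, 9000, 17, 420]

unsorted_dense = densify(sparse_dict)
sorted_dense = densify(sparse_dict, sort_by_token=True)
recoded = [sorted_dense[code] for code in encoded_column]
```

In the sorted-dense result, comparing two codes gives the same answer as comparing the underlying tokens, which is what makes that encoding useful for range predicates.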
-
Publication No.: US20210271710A1
Publication date: 2021-09-02
Application No.: US16803819
Filing date: 2020-02-27
Applicant: Oracle International Corporation
Inventor: Benjamin Schlegel , Martin Sevenich , Pit Fender , Matthias Brantner , Hassan Chafi
IPC: G06F16/901 , G06F9/38
Abstract: Techniques are described herein for a vectorized hash table that uses very efficient grow and insert techniques. A single-probe hash table is grown via vectorized instructions that split each bucket, of the hash table, into a respective upper and lower bucket of the expanded hash table. Further, vacant slots are indicated using a vacant-slot-indicator value, e.g., ‘0’, and all vacant slots follow to the right of all occupied slots in a bucket. A vectorized compare instruction determines whether a value is already in the bucket. If not, the vectorized compare instruction is also used to determine whether the bucket has a vacant slot based on whether the bucket contains the vacant-slot-indicator value. To insert the value into the bucket, vectorized instructions are used to shift the values in the bucket to the right by one slot and to insert the new value into the left-most slot.
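The bucket insert described above can be sketched in scalar Python. This is only an emulation: each list comprehension or slice assignment stands in for what the abstract describes as a single vectorized instruction, and the function name `insert_into_bucket` and its return values are hypothetical.

```python
VACANT = 0  # vacant-slot-indicator value, as in the abstract's example


def insert_into_bucket(bucket, value):
    """Insert value into a fixed-size bucket (a list), in place.

    Invariant: all vacant slots sit to the right of all occupied slots.
    Returns "present", "inserted", or "full".
    """
    if value == VACANT:
        raise ValueError("the vacant-slot indicator cannot be stored")
    # Stand-in for one vectorized compare against the searched value.
    if any(slot == value for slot in bucket):
        return "present"
    # Stand-in for the same compare against the vacant-slot indicator.
    if not any(slot == VACANT for slot in bucket):
        return "full"
    # Shift every slot right by one (the dropped last slot is vacant),
    # then place the new value in the left-most slot.
    bucket[1:] = bucket[:-1]
    bucket[0] = value
    return "inserted"


bucket = [7, 3, VACANT, VACANT]
results = [insert_into_bucket(bucket, v) for v in (5, 7, 9, 4)]
```

Because vacant slots always trail the occupied ones, the right shift never discards a stored value as long as the vacancy check passed first.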
-
Publication No.: US20210173621A1
Publication date: 2021-06-10
Application No.: US16703499
Filing date: 2019-12-04
Applicant: Oracle International Corporation
Inventor: Pit Fender , Benjamin Schlegel , Matthias Brantner , Harshad Kasture , Hassan Chafi
Abstract: Herein are machine learning (ML) feature processing and analytic techniques to detect anomalies in parse trees of logic statements, database queries, logic scripts, compilation units of general-purpose programming language, extensible markup language (XML), JAVASCRIPT object notation (JSON), and document object models (DOM). In an embodiment, a computer identifies an operational trace that contains multiple parse trees. Values of explicit features are generated from a single respective parse tree of the multiple parse trees of the operational trace. Values of implicit features are generated from more than one respective parse tree of the multiple parse trees of the operational trace. The explicit and implicit features are stored into a same feature vector. With the feature vector as input, an ML model detects whether or not the operational trace is anomalous, based on the explicit features of each parse tree of the operational trace and the implicit features of multiple parse trees of the operational trace.
-
Publication No.: US20210157779A1
Publication date: 2021-05-27
Application No.: US16697431
Filing date: 2019-11-27
Applicant: Oracle International Corporation
Inventor: Pit Fender , Benjamin Schlegel , Matthias Brantner
IPC: G06F16/22 , G06F16/2455 , G06F16/242 , G06F17/27
Abstract: Techniques described herein propose a new RIDDecode operator in a QEP that uses ROWID lookup and fetch, instead of dictionary decoding, to retrieve decoded values, in order to reduce memory pressure and speed up processing.
-
Publication No.: US10810195B2
Publication date: 2020-10-20
Application No.: US15861212
Filing date: 2018-01-03
Applicant: Oracle International Corporation
Inventor: Anantha Kiran Kandukuri , Seema Sundara , Sam Idicula , Pit Fender , Nitin Kunal , Sabina Petride , Georgios Giannikis , Nipun Agarwal
IPC: G06F17/00 , G06F16/2453 , H04L29/08 , G06F16/22 , G06F16/174 , G06F40/242 , G06F16/23
Abstract: Techniques related to distributed relational dictionaries are disclosed. In some embodiments, one or more non-transitory storage media store a sequence of instructions which, when executed by one or more computing devices, cause performance of a method. The method involves generating, by a query optimizer at a distributed database system (DDS), a query execution plan (QEP) for generating a code dictionary and a column of encoded database data. The QEP specifies a sequence of operations for generating the code dictionary. The code dictionary is a database table. The method further involves receiving, at the DDS, a column of unencoded database data from a data source that is external to the DDS. The DDS generates the code dictionary according to the QEP. Furthermore, based on joining the column of unencoded database data with the code dictionary, the DDS generates the column of encoded database data according to the QEP.
-
Publication No.: US20200293332A1
Publication date: 2020-09-17
Application No.: US16299483
Filing date: 2019-03-12
Applicant: Oracle International Corporation
Inventor: Benjamin Schlegel , Pit Fender , Harshad Kasture , Matthias Brantner , Hassan Chafi
Abstract: Techniques are provided for vectorizing Heapsort. A K-heap is used as the underlying data structure for indexing values being sorted. The K-heap is vectorized by storing values in a contiguous memory array containing a beginning-most side and end-most side. The vectorized Heapsort utilizes horizontal aggregation SIMD instructions for comparisons, shuffling, and moving data. Thus, the number of comparisons required in order to find the maximum or minimum key value within a single node of the K-heap is reduced resulting in faster retrieval operations.
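As a rough illustration of the idea, the following sketch implements a Heapsort over a K-ary heap stored in one contiguous array. It is scalar Python, not SIMD: the `max` over a child group stands in for a horizontal aggregation instruction, and the array layout (children of index `i` at `k*i + 1` through `k*i + k`) is an assumption rather than the layout claimed in the patent.

```python
def k_heapsort(values, k=4):
    """Sort values ascending using a max-oriented K-ary heap in a list."""
    a = list(values)
    n = len(a)

    def sift_down(root, end):
        while True:
            first = k * root + 1
            if first >= end:
                return  # root has no children inside the heap
            last = min(first + k, end)
            # One "horizontal max" over the whole child group replaces
            # up to k - 1 pairwise comparisons.
            best = max(range(first, last), key=a.__getitem__)
            if a[best] <= a[root]:
                return
            a[root], a[best] = a[best], a[root]
            root = best

    # Build the heap bottom-up, starting from the last parent.
    for i in range((n - 2) // k, -1, -1):
        sift_down(i, n)
    # Repeatedly move the maximum to the end and restore the heap.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)
    return a


data = [9, 4, 7, 1, 8, 2, 6, 3, 5, 0, 4]
ordered = k_heapsort(data, k=4)
```

A larger `k` makes the heap shallower, so each sift-down visits fewer levels; that only pays off when the maximum of a child group can be found cheaply, which is the role the abstract assigns to the SIMD instructions.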
-
Publication No.: US20190155930A1
Publication date: 2019-05-23
Application No.: US15819193
Filing date: 2017-11-21
Applicant: Oracle International Corporation
Inventor: Pit Fender , Seema Sundara , Benjamin Schlegel , Nipun Agarwal
IPC: G06F17/30
Abstract: Techniques related to relational dictionaries are disclosed. In some embodiments, one or more non-transitory storage media store a sequence of instructions which, when executed by one or more computing devices, cause performance of a method. The method involves storing a code dictionary comprising a set of tuples. The code dictionary is a database table defined by a database dictionary and comprises columns that are each defined by the database dictionary. The set of tuples maps a set of codes to a set of tokens. The set of tokens are stored in a column of unencoded database data. The method further involves generating encoded database data based on joining the unencoded database data with the set of tuples. Furthermore, the method involves generating decoding database data based on joining the encoded database data with the set of tuples.
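The join-based encoding and decoding can be sketched with an in-memory SQLite database from the Python standard library. The table and column names (`raw`, `code_dict`, `encoded`, `row_id`, `code`, `token`) and the function `encode_and_decode` are hypothetical, and letting SQLite assign the codes automatically is a simplification chosen for this example.

```python
import sqlite3


def encode_and_decode(tokens):
    """Encode tokens by joining with a dictionary table, then decode."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE raw (row_id INTEGER PRIMARY KEY, token TEXT NOT NULL)"
    )
    con.executemany("INSERT INTO raw VALUES (?, ?)", list(enumerate(tokens)))

    # The code dictionary is an ordinary table mapping codes to tokens.
    con.execute(
        "CREATE TABLE code_dict (code INTEGER PRIMARY KEY, token TEXT UNIQUE)"
    )
    con.execute("INSERT INTO code_dict (token) SELECT DISTINCT token FROM raw")

    # Encoding: join the unencoded column with the dictionary.
    con.execute(
        "CREATE TABLE encoded AS "
        "SELECT r.row_id AS row_id, d.code AS code "
        "FROM raw r JOIN code_dict d ON d.token = r.token"
    )
    codes = [c for (c,) in con.execute(
        "SELECT code FROM encoded ORDER BY row_id")]

    # Decoding: join the encoded column with the same dictionary.
    decoded = [t for (t,) in con.execute(
        "SELECT d.token FROM encoded e "
        "JOIN code_dict d ON d.code = e.code ORDER BY e.row_id")]
    con.close()
    return codes, decoded


codes, decoded = encode_and_decode(["b", "a", "b", "c"])
```

Because the dictionary is a regular table, both directions reuse the database's existing join machinery instead of a dedicated encode or decode routine.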
-
Publication No.: US11379232B2
Publication date: 2022-07-05
Application No.: US16399226
Filing date: 2019-04-30
Applicant: Oracle International Corporation
Inventor: Benjamin Schlegel , Harshad Kasture , Pit Fender , Matthias Brantner , Hassan Chafi
IPC: G06F9/30 , G06F9/38 , G06F16/901
Abstract: Techniques are provided for obtaining generic vectorized d-heaps for any data type for which horizontal aggregation SIMD instructions are not available, including primitive as well as complex data types. A generic vectorized d-heap comprises a prefix heap and a plurality of suffix heaps. Each suffix heap of the plurality of suffix heaps comprises a d-heap. A plurality of key values stored in the heap are split into key prefix values and key suffix values. Key prefix values are stored in the prefix heap and key suffix values are stored in the plurality of suffix heaps. Each entry in the prefix heap includes a key prefix value of the plurality of key values and a reference to the suffix heap of the plurality of suffix heaps that includes all key suffix values of the plurality of key values that share the respective key prefix value.
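The prefix/suffix split can be sketched as follows. Several simplifications are assumed: keys are non-negative integers split at a fixed bit position, the heaps are binary heaps from `heapq` rather than vectorized d-heaps, and a dictionary supplies the reference from each prefix to its suffix heap. The class name `PrefixSuffixHeap` is hypothetical.

```python
import heapq


class PrefixSuffixHeap:
    """Min-heap of integer keys split into a prefix heap and suffix heaps."""

    def __init__(self, suffix_bits=16):
        self.suffix_bits = suffix_bits
        self.mask = (1 << suffix_bits) - 1
        self.prefix_heap = []    # one entry per distinct key prefix
        self.suffix_heaps = {}   # prefix -> heap of suffixes sharing it

    def push(self, key):
        prefix, suffix = key >> self.suffix_bits, key & self.mask
        heap = self.suffix_heaps.get(prefix)
        if heap is None:
            # First key with this prefix: create its suffix heap.
            heap = self.suffix_heaps[prefix] = []
            heapq.heappush(self.prefix_heap, prefix)
        heapq.heappush(heap, suffix)

    def pop(self):
        """Remove and return the smallest key."""
        prefix = self.prefix_heap[0]
        heap = self.suffix_heaps[prefix]
        suffix = heapq.heappop(heap)
        if not heap:
            # Last suffix for this prefix: retire the prefix entry too.
            heapq.heappop(self.prefix_heap)
            del self.suffix_heaps[prefix]
        return (prefix << self.suffix_bits) | suffix


keys = [0x10005, 0x3, 0x10001, 0x20000, 0x3, 0x1FFFF]
heap = PrefixSuffixHeap()
for key in keys:
    heap.push(key)
drained = [heap.pop() for _ in keys]
```

The smallest key always has the smallest prefix, so a pop only needs the top of the prefix heap and the top of one suffix heap; each of those heaps stores narrower values than the full key.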
-
Publication No.: US20210390089A1
Publication date: 2021-12-16
Application No.: US17459447
Filing date: 2021-08-27
Applicant: Oracle International Corporation
Inventor: Pit Fender , Felix Schmidt , Benjamin Schlegel , Matthias Brantner , Nipun Agarwal
Abstract: Techniques related to code dictionary generation based on non-blocking operations are disclosed. In some embodiments, a column of tokens includes a first token and a second token that are stored in separate rows. The column of tokens is correlated with a set of row identifiers including a first row identifier and a second row identifier that is different from the first row identifier. Correlating the column of tokens with the set of row identifiers involves: storing a correlation between the first token and the first row identifier, storing a correlation between the second token and the second row identifier if the first token and the second token have different values, and storing a correlation between the second token and the first row identifier if the first token and the second token have identical values. After correlating the column of tokens with the set of row identifiers, duplicate correlations are removed.
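The correlation step can be illustrated with a short sketch: each token is paired with the row identifier of its first occurrence, and duplicate pairs are then removed. This is a single-threaded approximation that says nothing about how the patent keeps the operations non-blocking; the function name `build_code_dictionary` and the use of the list index as row identifier are assumptions.

```python
def build_code_dictionary(tokens):
    """Correlate tokens with row identifiers, then drop duplicates.

    Returns (correlations, dictionary): one (token, row_id) pair per
    input row, and the de-duplicated pairs in first-seen order.
    """
    first_row_id = {}
    correlations = []
    for row_id, token in enumerate(tokens):
        # Identical tokens reuse the row identifier of the first
        # occurrence; a new token keeps its own row identifier.
        correlations.append((token, first_row_id.setdefault(token, row_id)))
    # Remove duplicate correlations while preserving order.
    dictionary = list(dict.fromkeys(correlations))
    return correlations, dictionary


tokens = ["red", "blue", "red", "green", "blue"]
correlations, dictionary = build_code_dictionary(tokens)
```

In this sketch the row identifiers double as codes, so the resulting codes are unique per token but not consecutive (here 0, 1, and 3).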