-
Publication No.: US11423022B2
Publication Date: 2022-08-23
Application No.: US16016966
Application Date: 2018-06-25
Applicant: Oracle International Corporation
Inventor: Jian Wen , Sam Idicula , Nitin Kunal , Farhan Tauheed , Seema Sundara , Nipun Agarwal , Indu Bhagat
IPC: G06F16/24 , G06F9/50 , G06F16/2453 , G06F16/22
Abstract: Techniques are described herein for building a framework for declarative query compilation using both rule-based and cost-based approaches for database management. The framework involves constructing and using: a set of rule-based properties tables that contain optimization parameters for both logical and physical optimization, a recursive algorithm, based on the rule-based tables, that forms candidate physical query plans, and a cost model that estimates the cost of a generated physical query plan and is used with the rule-based properties tables to prune inferior query plans.
-
Publication No.: US11256698B2
Publication Date: 2022-02-22
Application No.: US16382085
Application Date: 2019-04-11
Applicant: Oracle International Corporation
Inventor: Sam Idicula , Tomas Karnagel , Jian Wen , Seema Sundara , Nipun Agarwal , Mayur Bency
IPC: G06F16/2453 , G06F16/21 , G06N20/00 , G06N20/20
Abstract: Embodiments utilize trained query performance machine learning (QP-ML) models to predict an optimal compute node cluster size for a given in-memory workload. The QP-ML models include models that predict query task runtimes at various compute node cardinalities, and models that predict network communication time between nodes of the cluster. Embodiments also utilize an analytical model to predict overlap between predicted task runtimes and predicted network communication times. Based on this data, an optimal cluster size is selected for the workload. Embodiments further utilize trained data capacity machine learning (DC-ML) models to predict a minimum number of compute nodes needed to run a workload. The DC-ML models include models that predict the size of the workload dataset in a target data encoding, models that predict the amount of memory needed to run the queries in the workload, and models that predict the memory needed to accommodate changes to the dataset.
-
Publication No.: US11176487B2
Publication Date: 2021-11-16
Application No.: US15885515
Application Date: 2018-01-31
Applicant: Oracle International Corporation
Inventor: Venkatanathan Varadarajan , Sam Idicula , Sandeep Agrawal , Nipun Agarwal
Abstract: Herein, horizontally scalable techniques efficiently configure machine learning algorithms for optimal accuracy and without informed inputs. In an embodiment, for each particular hyperparameter, and for each epoch, a computer processes the particular hyperparameter. An epoch explores one hyperparameter based on hyperparameter tuples. A respective score is calculated from each tuple. The tuple contains a distinct combination of values, each of which is contained in a value range of a distinct hyperparameter. All values of a tuple that belong to the particular hyperparameter are distinct. All values of a tuple that belong to other hyperparameters are held constant. The value range of the particular hyperparameter is narrowed based on an intersection point of a first line based on the scores and a second line based on the scores. A machine learning algorithm is optimally configured from repeatedly narrowed value ranges of hyperparameters. The configured algorithm is invoked to obtain a result.
-
Publication No.: US11163800B2
Publication Date: 2021-11-02
Application No.: US16541605
Application Date: 2019-08-15
Applicant: Oracle International Corporation
Inventor: Negar Koochakzadeh , Nitin Kunal , Sam Idicula , Cagri Balkesen , Nipun Agarwal
Abstract: Techniques are described for non-power-of-two partitioning of a data set as well as generation and selection of partition schemes for the data set. In an embodiment, one or more iterations of a partition scheme are for a non-power-of-two number of partitions. Extended hash partitioning may be used to partition a data set into a non-power-of-two number of partitions by determining the partition identifier of each tuple of the data set using the extended hash partitioning algorithm. In an embodiment, multiple partition schemes are generated for multiple data sets, based on properties of the data sets and/or availability of computing resources for the partition operation or for the operation that follows it. The generated partition schemes may use non-power-of-two partitioning for one or more iterations of a generated partition scheme. The optimal partition scheme may be selected from the generated partition schemes based on optimization policies.
-
Publication No.: US11138291B2
Publication Date: 2021-10-05
Application No.: US15716225
Application Date: 2017-09-26
Applicant: Oracle International Corporation
Inventor: Gaurav Chadha , Sam Idicula , Sandeep Agrawal , Nipun Agarwal
Abstract: Techniques are described herein for performing efficient matrix multiplication in architectures with scratchpad memories or associative caches using asymmetric allocation of space for the different matrices. The system receives a left matrix and a right matrix. In an embodiment, the system allocates, in a scratchpad memory, asymmetric memory space for tiles for each of the two matrices as well as a dot product matrix. The system then performs dot product matrix multiplication involving the tiles of the left and the right matrices, storing the resulting dot product values in the corresponding allocated dot product matrix tiles. The system then writes the stored dot product values from the scratchpad memory into main memory.
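The tiling idea in this abstract can be illustrated with a minimal sketch. It is not the patented method: the function name `tiled_matmul`, the parameters `tile_m`, `tile_k`, `tile_n`, and the use of plain Python lists as a stand-in for scratchpad memory are all assumptions made for illustration. Choosing `tile_m` different from `tile_n` gives the left and right tiles different footprints, which is one simple reading of "asymmetric allocation".

```python
def tiled_matmul(left, right, tile_m, tile_k, tile_n):
    """Multiply left (M x K) by right (K x N) one tile at a time.

    Hypothetical sketch: left tiles are tile_m x tile_k, right tiles are
    tile_k x tile_n, and the accumulator tile is tile_m x tile_n.
    Assumes both matrices are non-empty lists of equal-length rows.
    """
    m, k, n = len(left), len(right), len(right[0])
    result = [[0] * n for _ in range(m)]  # stands in for main memory
    for i0 in range(0, m, tile_m):
        for j0 in range(0, n, tile_n):
            rows = min(tile_m, m - i0)
            cols = min(tile_n, n - j0)
            # Accumulator tile for the dot products (the "scratchpad" output).
            acc = [[0] * cols for _ in range(rows)]
            for k0 in range(0, k, tile_k):
                # Copy one tile of each input matrix into local storage.
                left_tile = [row[k0:k0 + tile_k] for row in left[i0:i0 + tile_m]]
                right_tile = [row[j0:j0 + tile_n] for row in right[k0:k0 + tile_k]]
                for i, left_row in enumerate(left_tile):
                    for kk, left_val in enumerate(left_row):
                        for j, right_val in enumerate(right_tile[kk]):
                            acc[i][j] += left_val * right_val
            # Write the finished accumulator tile back to the result matrix.
            for i, acc_row in enumerate(acc):
                result[i0 + i][j0:j0 + cols] = acc_row
    return result


product_matrix = tiled_matmul(
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8], [9, 10], [11, 12]],
    tile_m=1, tile_k=2, tile_n=2,
)
print(product_matrix)  # [[58, 64], [139, 154]]
```

In this call each left tile holds at most 2 values while each right tile holds at most 4, so the two inputs occupy unequal space. How the actual sizes are chosen is not stated in the abstract.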
-
Publication No.: US11126626B2
Publication Date: 2021-09-21
Application No.: US16272829
Application Date: 2019-02-11
Applicant: Oracle International Corporation
Inventor: Sabina Petride , Sam Idicula , Nipun Agarwal
IPC: G06F16/2455
Abstract: A system and method for processing a group and aggregate query on a relation are disclosed. A database system determines whether assistance of a heterogeneous system (HS) of compute nodes is beneficial in performing the query. Assuming that the relation has been partitioned and loaded into the HS, the database system determines, in a compile phase, whether the HS has the functional capabilities to assist, and whether the cost and benefit favor performing the operation with the assistance of the HS. If the cost and benefit favor using the assistance of the HS, then the system enters the execution phase. The database system starts, in the execution phase, an optimal number of parallel processes to produce and consume the results from the compute nodes of the HS. After any needed transaction consistency checks, the results of the query are returned by the database system.
-
Publication No.: US20210263934A1
Publication Date: 2021-08-26
Application No.: US17318972
Application Date: 2021-05-12
Applicant: Oracle International Corporation
Inventor: Sam Idicula , Tomas Karnagel , Jian Wen , Seema Sundara , Nipun Agarwal , Mayur Bency
IPC: G06F16/2453 , G06N20/00 , G06F16/21 , G06N20/20
Abstract: Embodiments implement a prediction-driven, rather than a trial-driven, approach to automate database configuration parameter tuning for a database workload. This approach uses machine learning (ML) models to test performance metrics resulting from application of particular database parameters to a database workload, and does not require live trials on the DBMS managing the workload. Specifically, automatic configuration (AC) ML models are trained, using a training corpus that includes information from workloads being run by DBMSs, to predict performance metrics based on workload features and configuration parameter values. The trained AC-ML models predict performance metrics resulting from applying particular configuration parameter values to a given database workload being automatically tuned. Based on correlating changes to configuration parameter values with changes in predicted performance metrics, an optimization algorithm is used to converge to an optimal set of configuration parameters. The optimal set of configuration parameter values is automatically applied for the given workload.
-
Publication No.: US10956417B2
Publication Date: 2021-03-23
Application No.: US15581984
Application Date: 2017-04-28
Applicant: Oracle International Corporation
Inventor: Jarod Wen , Sam Idicula , Nitin Kunal , Thomas Chang , Gong Zhang , Nipun Agarwal , Farhan Tauheed
IPC: G06F16/2453
Abstract: Techniques are provided for scheduling data operations for a given query based upon a query-cost model that analyzes the cost of scheduling data operations based upon their operation cost and the type of resources needed for the operation. In an embodiment, a database server receives a set of operations for a query. The database server determines a set of leaf operation nodes from the set of data operations, where the set of leaf operation nodes includes operation nodes that do not depend on the execution of other nodes within the set of data operations. The database server compares operation costs between the leaf operation nodes to determine which leaf operation node to insert into a scheduled order set. The database server inserts the leaf operation node into the scheduled order set. Then the database server iteratively determines new leaf operation nodes and performs cost analysis on remaining leaf operation nodes to generate a set of scheduled data operations.
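The iterative leaf-selection loop described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the function name `schedule_operations`, the dictionary inputs, and the rule "pick the cheapest ready leaf, breaking ties by name" are hypothetical, and the sketch ignores the resource-type component of the cost model that the abstract mentions.

```python
def schedule_operations(costs, depends_on):
    """Return operations in a scheduled order.

    costs:      {operation name: numeric cost}
    depends_on: {operation name: iterable of operations it must wait for}
    Assumption: the cheapest ready leaf is scheduled first.
    """
    # Unscheduled dependencies for every operation still waiting.
    remaining = {op: set(depends_on.get(op, ())) for op in costs}
    scheduled = []
    while remaining:
        # Leaf operations do not depend on any other unscheduled operation.
        leaves = [op for op, deps in remaining.items() if not deps]
        if not leaves:
            raise ValueError("dependency cycle or unknown dependency")
        # Compare costs between leaves and insert the chosen one.
        chosen = min(leaves, key=lambda op: (costs[op], op))
        scheduled.append(chosen)
        del remaining[chosen]
        # Scheduling an operation may expose new leaves on the next pass.
        for deps in remaining.values():
            deps.discard(chosen)
    return scheduled


order = schedule_operations(
    costs={"scan_a": 5, "scan_b": 2, "join": 7, "filter": 1},
    depends_on={"join": ["scan_a", "scan_b"], "filter": ["scan_a"]},
)
print(order)  # ['scan_b', 'scan_a', 'filter', 'join']
```

Here `filter` has the lowest cost overall but is not scheduled until `scan_a` completes, because only leaf operations are candidates on each pass.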
-
Publication No.: US10862755B2
Publication Date: 2020-12-08
Application No.: US15639265
Application Date: 2017-06-30
Applicant: Oracle International Corporation
Inventor: Aarti Basant , Sam Idicula , Nipun Agarwal
IPC: H04L12/24 , G06F15/173 , H04L12/721 , H04L12/933 , H04L12/931 , H04L12/801 , H04L29/08
Abstract: Techniques herein partition data using data repartitioning that is store-and-forward, content-based, and phasic. In embodiments, computer(s) maps network elements (NEs) to grid points (GPs) in a multidimensional hyperrectangle. Each NE contains data items (DIs). For each particular dimension (PD) of the hyperrectangle the computers perform, for each particular NE (PNE), various activities including: determining a linear subset (LS) of NEs that are mapped to GPs in the hyperrectangle at a same position as the GP of the PNE along all dimensions of the hyperrectangle except the PD, and data repartitioning that includes, for each DI of the PNE, the following activities. The PNE determines a bit sequence based on the DI. The PNE selects, based on the PD, a bit subset of the bit sequence. The PNE selects, based on the bit subset, a receiving NE of the LS. The PNE sends the DI to the receiving NE.
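The phased, per-dimension forwarding described in this abstract can be sketched as a small simulation. Several things here are assumptions for illustration only: the function name `repartition`, the restriction to dimension sizes that are powers of two (so that a bit subset indexes a position directly), the use of consecutive bit fields per dimension, and the caller-supplied `key_bits` function that yields the bit sequence for a data item.

```python
from itertools import product


def repartition(dim_bits, data_by_node, key_bits):
    """Simulate store-and-forward repartitioning on a grid of nodes.

    dim_bits:     bits per dimension; dimension d has 2**dim_bits[d] positions
    data_by_node: {grid point tuple: list of data items initially held there}
    key_bits:     hypothetical function mapping a data item to an integer
                  whose bits form the item's bit sequence
    """
    nodes = list(product(*[range(1 << bits) for bits in dim_bits]))
    current = {node: list(data_by_node.get(node, [])) for node in nodes}
    shift = 0
    for dim, bits in enumerate(dim_bits):  # one phase per dimension
        inbox = {node: [] for node in nodes}
        for node, items in current.items():
            for item in items:
                # Select the bit subset that belongs to this dimension.
                position = (key_bits(item) >> shift) & ((1 << bits) - 1)
                # The receiver shares every coordinate with the sender
                # except the one along the current dimension.
                receiver = node[:dim] + (position,) + node[dim + 1:]
                inbox[receiver].append(item)
        current = inbox
        shift += bits
    return current


placement = repartition(
    dim_bits=(1, 2),                   # a 2 x 4 grid of nodes
    data_by_node={(0, 0): [5, 6]},     # all items start on one node
    key_bits=lambda item: item,        # the integer is its own bit sequence
)
print(placement[(1, 2)], placement[(0, 3)])  # [5] [6]
```

Item 5 (binary 101) moves to position 1 along the first dimension and position 2 along the second; item 6 (binary 110) moves to position 0 and then position 3. In each phase a node only sends to nodes in its own line of the grid, which is the property the abstract calls a linear subset.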
-
Publication No.: US20200065215A1
Publication Date: 2020-02-27
Application No.: US16670681
Application Date: 2019-10-31
Applicant: Oracle International Corporation
Inventor: Sam Idicula , Kirtikar Kashyap , Arun Raghavan , Evangelos Vlachos , Venkatraman Govindaraju
Abstract: Techniques are provided for redundant execution by a better processor for intensive dynamic profiling after initial execution by a constrained processor. In an embodiment, a system of computer(s) receives a request to profile particular runtime aspects of an original binary executable. Based on the particular runtime aspects and without accessing source logic, the system statically rewrites the original binary executable into a rewritten binary executable that invokes telemetry instrumentation that makes observations of the particular runtime aspects and emits traces of those observations. A first processing core having low power (capacity) performs a first execution of the rewritten binary executable to make first observations and emit first traces of the first observations. Afterwards, a second processing core performs a second (redundant) execution of the original binary executable based on the first traces. The second execution generates a detailed dynamic performance profile based on the second execution.
-