Patent search ap:("Oracle International Corporation") AND inv:"Nipun Agarwal" Page 1

1.

Granted invention patent
Enabling efficient machine learning model inference using adaptive sampling for autonomous database services (In force)

Publication (announcement) number: US12014286B2

Publication (announcement) date: 2024-06-18

Application number: US16914816

Application date: 2020-06-29

Applicant: Oracle International Corporation

Inventor: Farhan Tauheed, Onur Kocberber, Tomas Karnagel, Nipun Agarwal

IPC: G06N5/04, G06F16/22, G06N20/00

CPC classification number: G06N5/04, G06F16/2282, G06N20/00

Abstract: Herein are approaches for self-optimization of a database management system (DBMS) such as in real time. Adaptive just-in-time sampling techniques herein estimate database content statistics that a machine learning (ML) model may use to predict configuration settings that conserve computer resources such as execution time and storage space. In an embodiment, a computer repeatedly samples database content until a dynamic convergence criterion is satisfied. In each iteration of a series of sampling iterations, a subset of rows of a database table are sampled, and estimates of content statistics of the database table are adjusted based on the sampled subset of rows. Immediately or eventually after detecting dynamic convergence, a machine learning (ML) model predicts, based on the content statistic estimates, an optimal value for a configuration setting of the DBMS.
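
The sampling loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the statistic (a column mean), the fixed relative-tolerance convergence test, and the names `adaptive_sample_mean`, `batch_size`, and `rel_tol` are all assumptions made for the example. The abstract speaks of a dynamic convergence criterion, which this sketch replaces with a simple stand-in.

```python
import random


def adaptive_sample_mean(column, batch_size=200, rel_tol=0.01, max_iters=100, seed=0):
    """Sample batches of rows until the running mean estimate stabilizes.

    Returns (estimate, iterations_used). The convergence test is a
    hypothetical stand-in: stop when the estimate moves by less than
    rel_tol (relative) between two consecutive iterations.
    """
    rng = random.Random(seed)
    total, count, estimate = 0.0, 0, None
    for iteration in range(1, max_iters + 1):
        # Sample a subset of rows (with replacement, for simplicity).
        batch = rng.choices(column, k=batch_size)
        total += sum(batch)
        count += batch_size
        # Adjust the content-statistic estimate using the new subset.
        new_estimate = total / count
        if estimate is not None:
            change = abs(new_estimate - estimate)
            if change <= rel_tol * max(abs(estimate), 1e-12):
                return new_estimate, iteration
        estimate = new_estimate
    # No convergence within the iteration budget: return the last estimate.
    return estimate, max_iters
```

In the setting the abstract describes, the converged estimate would then be passed as an input feature to an ML model that predicts a configuration setting; that model is outside the scope of this sketch.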

2.

Published invention application
ADAPTIVE SAMPLING TO COMPUTE GLOBAL FEATURE EXPLANATIONS WITH SHAPLEY VALUES (Under examination, published)

Publication (announcement) number: US20240086763A1

Publication (announcement) date: 2024-03-14

Application number: US17944949

Application date: 2022-09-14

Applicant: Oracle International Corporation

Inventor: Jeremy Plassmann, Anatoly Yakovlev, Sandeep R. Agrawal, Ali Moharrer, Sanjay Jinturkar, Nipun Agarwal

IPC: G06N20/00, G06N5/04

CPC classification number: G06N20/00, G06N5/042

Abstract: Techniques for computing global feature explanations using adaptive sampling are provided. In one technique, first and second samples from a dataset are identified. A first set of feature importance values (FIVs) is generated based on the first sample and a machine-learned model. A second set of FIVs is generated based on the second sample and the model. If a result of a comparison between the first and second FIV sets does not satisfy criteria, then: (i) an aggregated set is generated based on the last two FIV sets; (ii) a new sample that is double the size of a previous sample is identified from the dataset; (iii) a current FIV set is generated based on the new sample and the model; (iv) it is determined whether a result of a comparison between the current and aggregated FIV sets satisfies the criteria. Steps (i)-(iv) are repeated until the result of the last comparison satisfies the criteria.
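
Steps (i)-(iv) can be sketched as a short driver loop. The sketch below makes several assumptions that the abstract does not state: `compute_fivs` is a caller-supplied placeholder (the patent title refers to Shapley values, which are not implemented here), aggregation is an element-wise mean of the two most recently computed FIV sets, and the comparison criterion is a maximum absolute difference below `tol`.

```python
import random


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two FIV sets."""
    return max(abs(x - y) for x, y in zip(a, b))


def adaptive_global_fivs(dataset, compute_fivs, initial_size=64, tol=0.05, seed=0):
    """Double the sample size until consecutive FIV sets agree.

    compute_fivs(sample) -> list of floats is a hypothetical callback
    standing in for the model-based feature importance computation.
    Returns (fiv_set, final_sample_size).
    """
    rng = random.Random(seed)
    size = min(initial_size, len(dataset))
    previous = compute_fivs(rng.sample(dataset, size))  # first sample
    current = compute_fivs(rng.sample(dataset, size))   # second sample
    reference = previous
    while max_abs_diff(current, reference) > tol and size < len(dataset):
        # (i) aggregate the last two FIV sets (mean is an assumption)
        reference = [(p + c) / 2 for p, c in zip(previous, current)]
        # (ii) draw a new sample twice as large, capped at the dataset size
        size = min(2 * size, len(dataset))
        # (iii) compute the current FIV set on the new sample
        previous, current = current, compute_fivs(rng.sample(dataset, size))
        # (iv) the loop condition compares current against the aggregate
    return current, size
```

The cap at `len(dataset)` guarantees termination even when the criterion is never met; in that case the FIVs from the largest sample are returned.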

3.

Granted invention patent
Prediction of buffer pool size for transaction processing workloads (In force)

Publication (announcement) number: US11868261B2

Publication (announcement) date: 2024-01-09

Application number: US17381072

Application date: 2021-07-20

Applicant: Oracle International Corporation

Inventor: Peyman Faizian, Mayur Bency, Onur Kocberber, Seema Sundara, Nipun Agarwal

IPC: G06F16/2455, G06F12/0842

CPC classification number: G06F12/0842, G06F16/24552, G06F2212/6022

Abstract: Techniques are described herein for prediction of a buffer pool size (BPS). Before performing BPS prediction, gathered data are used to determine whether a target workload is in a steady state. Historical utilization data gathered while the workload is in a steady state are used to predict object-specific BPS components for database objects, accessed by the target workload, that are identified for BPS analysis based on shares of the total disk I/O requests, for the workload, that are attributed to the respective objects. Preference of analysis is given to objects that are associated with larger shares of disk I/O activity. An object-specific BPS component is determined based on a coverage function that returns a percentage of the database object size (on disk) that should be available in the buffer pool for that database object. The percentage is determined using either a heuristic-based or a machine learning-based approach.
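
A rough sketch of how object-specific components might be combined is shown below. It is an illustration under stated assumptions only: the `DbObject` record, the `io_share_cutoff` rule for choosing which objects to analyze, and the signature of the `coverage` callback are hypothetical, and the steady-state check mentioned in the abstract is omitted.

```python
from dataclasses import dataclass


@dataclass
class DbObject:
    name: str
    size_bytes: int    # size of the object on disk
    io_requests: int   # disk I/O requests attributed to the object


def predict_buffer_pool_size(objects, coverage, io_share_cutoff=0.9):
    """Sum object-specific buffer pool components.

    Objects are analyzed in order of decreasing disk I/O share until the
    analyzed objects account for io_share_cutoff of all I/O (the cutoff
    rule is an assumption). coverage(obj, total_io) is a hypothetical
    callback returning the fraction of the object that should be cached;
    it could be a heuristic or an ML model.
    """
    total_io = sum(o.io_requests for o in objects)
    if total_io == 0:
        return 0
    covered_share = 0.0
    total_bytes = 0
    for obj in sorted(objects, key=lambda o: o.io_requests, reverse=True):
        if covered_share >= io_share_cutoff:
            break
        covered_share += obj.io_requests / total_io
        total_bytes += int(coverage(obj, total_io) * obj.size_bytes)
    return total_bytes
```

For example, with a constant coverage of 0.5, three objects of 1000, 2000, and 4000 bytes with I/O shares of 80%, 15%, and 5% yield 1500 bytes: the first two objects already exceed the 90% cutoff, so the third is not analyzed.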

4.

Granted invention patent
Mini-machine learning (In force)

Publication (announcement) number: US11790242B2

Publication (announcement) date: 2023-10-17

Application number: US16166039

Application date: 2018-10-19

Applicant: Oracle International Corporation

Inventor: Sandeep Agrawal, Venkatanathan Varadarajan, Sam Idicula, Nipun Agarwal

IPC: G06N3/126, G06N20/00

CPC classification number: G06N3/126, G06N20/00

Abstract: Techniques are described for generating and applying mini-machine learning variants of machine learning algorithms to save computational resources in tuning and selection of machine learning algorithms. In an embodiment, at least one of the hyper-parameter values for a reference variant is modified to a new hyper-parameter value, thereby generating a new variant of the machine learning algorithm from the reference variant. A performance score is determined for the new variant using a training dataset, the performance score representing the accuracy of the new machine learning model for the training dataset. By training the new variant with the training dataset, a cost metric of the new variant is obtained by measuring the computing resources used for the training. Based on the cost metric of the new variant and a comparison of the performance scores of the new and reference variants, the system determines whether the modified reference variant is the mini-machine learning algorithm, that is, one that is computationally less costly than the reference variant of the machine learning algorithm but closely tracks its accuracy.
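
The selection logic in this abstract can be illustrated with a small search over single-hyper-parameter modifications. Everything named here is an assumption for the sake of the example: `evaluate` is a caller-supplied placeholder that trains a variant and returns `(score, cost)`, and "closely tracks" is interpreted as a score drop of at most `max_score_drop`.

```python
def find_mini_variant(reference_params, candidates, evaluate, max_score_drop=0.02):
    """Pick the cheapest variant whose score stays close to the reference.

    reference_params: dict of hyper-parameter values for the reference variant.
    candidates: iterable of (hyper_parameter_name, new_value) modifications.
    evaluate(params) -> (score, cost): hypothetical callback that trains the
    variant and reports its accuracy and its measured training cost.
    Returns (params, score, cost) for the best variant, or None.
    """
    ref_score, ref_cost = evaluate(reference_params)
    best = None
    for name, value in candidates:
        # Modify one hyper-parameter of the reference to create a new variant.
        variant = {**reference_params, name: value}
        score, cost = evaluate(variant)
        tracks_accuracy = (ref_score - score) <= max_score_drop
        if tracks_accuracy and cost < ref_cost:
            if best is None or cost < best[2]:
                best = (variant, score, cost)
    return best
```

A variant is accepted only if it is both cheaper than the reference and within the allowed score drop; among accepted variants the cheapest is returned.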

5.

Granted invention patent
Automatic feature subset selection based on meta-learning (In force)

Publication (announcement) number: US11615265B2

Publication (announcement) date: 2023-03-28

Application number: US16547312

Application date: 2019-08-21

Applicant: Oracle International Corporation

Inventor: Tomas Karnagel, Sam Idicula, Hesam Fathi Moghadam, Nipun Agarwal

IPC: G06F16/00, G06K9/62, G06N20/00

Abstract: The present invention relates to dimensionality reduction for machine learning (ML) models. Herein are techniques that individually rank features and combine features based on their rank to achieve an optimal combination of features that may accelerate training and/or inferencing, prevent overfitting, and/or provide insights into somewhat mysterious datasets. In an embodiment, a computer ranks features of datasets of a training corpus. For each dataset and for each landmark percentage, a target ML model is configured to receive only a highest ranking landmark percentage of features, and a landmark accuracy achieved by training the ML model with the dataset is measured. Based on the landmark accuracies and meta-feature values of the dataset, a respective training tuple is generated for each dataset. Based on all of the training tuples, a regressor is trained to predict an optimal number of features for training the target ML model.

6.

Granted invention patent
Automated configuration parameter tuning for database performance (In force)

Publication (announcement) number: US11567937B2

Publication (announcement) date: 2023-01-31

Application number: US17318972

Application date: 2021-05-12

Applicant: Oracle International Corporation

Inventor: Sam Idicula, Tomas Karnagel, Jian Wen, Seema Sundara, Nipun Agarwal, Mayur Bency

IPC: G06F16/2453, G06N20/00, G06F16/21, G06N20/20

Abstract: Embodiments implement a prediction-driven, rather than a trial-driven, approach to automate database configuration parameter tuning for a database workload. This approach uses machine learning (ML) models to test performance metrics resulting from application of particular database parameters to a database workload, and does not require live trials on the DBMS managing the workload. Specifically, automatic configuration (AC) ML models are trained, using a training corpus that includes information from workloads being run by DBMSs, to predict performance metrics based on workload features and configuration parameter values. The trained AC-ML models predict performance metrics resulting from applying particular configuration parameter values to a given database workload being automatically tuned. Based on correlating changes to configuration parameter values with changes in predicted performance metrics, an optimization algorithm is used to converge to an optimal set of configuration parameters. The optimal set of configuration parameter values is automatically applied for the given workload.
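
The prediction-driven idea can be illustrated with a small search that only ever consults a predictor and never runs a live trial. The abstract does not name the optimization algorithm, so the coordinate search below is a swapped-in stand-in, and `predict_metric`, `workload_features`, and `search_space` are hypothetical names. A higher predicted metric is assumed to be better.

```python
def tune_by_prediction(predict_metric, workload_features, search_space, start, max_rounds=10):
    """Coordinate search over configuration values using predicted metrics.

    predict_metric(workload_features, config) -> float is a hypothetical
    stand-in for a trained model that predicts a performance metric.
    search_space maps each parameter name to its candidate values.
    Returns (best_config, best_predicted_metric).
    """
    config = dict(start)
    best = predict_metric(workload_features, config)
    for _ in range(max_rounds):
        improved = False
        for name, values in search_space.items():
            for value in values:
                # Change one parameter and ask the model, not the live DBMS.
                trial = {**config, name: value}
                predicted = predict_metric(workload_features, trial)
                if predicted > best:
                    config, best, improved = trial, predicted, True
        if not improved:
            break  # no single-parameter change improves the prediction
    return config, best
```

The returned configuration is what a system of this kind would then apply to the workload; applying it is outside this sketch.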

7.

Invention application
LOCAL PERMUTATION IMPORTANCE: A STABLE, LINEAR-TIME LOCAL MACHINE LEARNING FEATURE ATTRIBUTOR (In force)

Publication (announcement) number: US20220366297A1

Publication (announcement) date: 2022-11-17

Application number: US17319729

Application date: 2021-05-13

Applicant: Oracle International Corporation

Inventor: Yasha Pushak, Zahra Zohrevand, Tayler Hetherington, Karoon Rashedi Nia, Sanjay Jinturkar, Nipun Agarwal

IPC: G06N20/00, G06N5/04, G06K9/62

Abstract: In an embodiment, a computer hosts a machine learning (ML) model that infers a particular inference for a particular tuple that is based on many features. For each feature, and for each of many original tuples, the computer: a) randomly selects many perturbed values from original values of the feature in the original tuples, b) generates perturbed tuples that are based on the original tuple and a respective perturbed value, c) causes the ML model to infer a respective perturbed inference for each perturbed tuple, and d) measures a respective difference between each perturbed inference of the perturbed tuples and the particular inference. For each feature, a respective importance of the feature is calculated based on the differences measured for the feature. Feature importances may be used to rank features by influence and/or generate a local ML explainability (MLX) explanation.
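
A simplified reading of steps a) through d) is sketched below for a single tuple being explained. It is not the patented procedure: the model is assumed to be a plain callable returning a number, the perturbed values for a feature are drawn from that feature's values in a set of reference tuples, and the importance is taken as the mean absolute change in the inference. The names `local_permutation_importance`, `reference_rows`, and `n_perturb` are invented for the example.

```python
import random


def local_permutation_importance(model, x, reference_rows, n_perturb=50, seed=0):
    """Estimate per-feature importance for one tuple x.

    model(row) -> float is any callable; reference_rows supplies the
    original values from which perturbed values are drawn.
    Returns one importance value per feature of x.
    """
    rng = random.Random(seed)
    baseline = model(x)  # the particular inference being explained
    importances = []
    for j in range(len(x)):
        pool = [row[j] for row in reference_rows]
        diffs = []
        for _ in range(n_perturb):
            # a) pick a perturbed value, b) build the perturbed tuple
            perturbed = list(x)
            perturbed[j] = rng.choice(pool)
            # c) infer on the perturbed tuple, d) measure the difference
            diffs.append(abs(model(perturbed) - baseline))
        importances.append(sum(diffs) / len(diffs))
    return importances
```

The cost is one model call per feature per perturbation, so it grows linearly with the number of features. Sorting the returned values gives a per-tuple ranking of features by influence.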

8.

Invention application
EFFICIENT ADJUSTMENT OF SPIN-LOCKING PARAMETER VALUES (In force)

Publication (announcement) number: US20220107933A1

Publication (announcement) date: 2022-04-07

Application number: US17060999

Application date: 2020-10-01

Applicant: Oracle International Corporation

Inventor: Onur Kocberber, Mayur Bency, Marc Jolles, Seema Sundara, Nipun Agarwal

IPC: G06F16/23, G06F16/245

Abstract: Systems and methods for adjusting parameters for a spin-lock implementation of concurrency control are described herein. In an embodiment, a system continuously retrieves, from a resource management system, one or more state values defining a state of the resource management system. Based on the one or more state values, the system determines that the resource management system has reached a steady state and, in response, adjusts a plurality of parameters for spin-locking performed by said resource management system to identify optimal values for the plurality of parameters. After adjusting the plurality of parameters, the system detects, based on one or more current state values, a workload change in the resource management system and, in response, readjusts the plurality of parameters for spin-locking performed by said resource management system to identify new optimal values for the parameters.

9.

Invention application
PROBABILISTIC TEXT INDEX FOR SEMI-STRUCTURED DATA IN COLUMNAR ANALYTICS STORAGE FORMATS (In force)

Publication (announcement) number: US20220019784A1

Publication (announcement) date: 2022-01-20

Application number: US16929949

Application date: 2020-07-15

Applicant: Oracle International Corporation

Inventor: Jian Wen, Hamed Ahmadi, Sanjay Jinturkar, Nipun Agarwal, Lijian Wan, Shrikumar Hariharasubrahmanian

IPC: G06K9/00, G06K9/62, G06F16/13, G06F40/289, G06F21/62

Abstract: Herein is a probabilistic indexing technique for searching semi-structured text documents in columnar storage formats such as Parquet, using columnar input/output (I/O) avoidance, and needing minimal storage overhead. In an embodiment, a computer associates columns with text strings that occur in semi-structured documents. Text words that occur in the text strings are detected. Respectively for each text word, a bitmap, of a plurality of bitmaps, that contains a respective bit for each column is generated. Based on at least one of the bitmaps, some of the columns or some of the semi-structured documents are accessed.
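
The per-word bitmap idea can be sketched as follows. This is a simplified, exact version for illustration: the abstract does not say what makes the index probabilistic, so no hashing or lossy compression is modeled, and the function names and the word tokenization rule are assumptions.

```python
import re


def build_word_bitmaps(column_strings):
    """Build one bitmap per word, with one bit per column.

    column_strings maps a column name to the text strings found in it.
    Returns (columns, bitmaps), where bit i of bitmaps[word] is set when
    the word occurs in columns[i].
    """
    columns = sorted(column_strings)
    bitmaps = {}
    for bit, column in enumerate(columns):
        for text in column_strings[column]:
            for word in re.findall(r"\w+", text.lower()):
                bitmaps[word] = bitmaps.get(word, 0) | (1 << bit)
    return columns, bitmaps


def candidate_columns(columns, bitmaps, words):
    """Return the columns that contain every query word.

    Only these columns need to be read; all others can be skipped,
    which is the I/O avoidance the abstract refers to.
    """
    mask = (1 << len(columns)) - 1
    for word in words:
        mask &= bitmaps.get(word.lower(), 0)
    return [c for i, c in enumerate(columns) if (mask >> i) & 1]
```

Storing a single integer per word keeps the overhead small, since each bitmap needs only as many bits as there are columns.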

10.

Invention application
CODE DICTIONARY GENERATION BASED ON NON-BLOCKING OPERATIONS (In force)

Publication (announcement) number: US20210390089A1

Publication (announcement) date: 2021-12-16

Application number: US17459447

Application date: 2021-08-27

Applicant: Oracle International Corporation

Inventor: Pit Fender, Felix Schmidt, Benjamin Schlegel, Matthias Brantner, Nipun Agarwal

IPC: G06F16/23, G06F16/22, G06F16/28

Abstract: Techniques related to code dictionary generation based on non-blocking operations are disclosed. In some embodiments, a column of tokens includes a first token and a second token that are stored in separate rows. The column of tokens is correlated with a set of row identifiers including a first row identifier and a second row identifier that is different from the first row identifier. Correlating the column of tokens with the set of row identifiers involves: storing a correlation between the first token and the first row identifier, storing a correlation between the second token and the second row identifier if the first token and the second token have different values, and storing a correlation between the second token and the first row identifier if the first token and the second token have identical values. After correlating the column of tokens with the set of row identifiers, duplicate correlations are removed.
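
The correlation and de-duplication steps can be illustrated with a single-threaded sketch. It does not model the non-blocking execution that the title refers to; it only shows the data flow, under the assumption that each token is correlated with the row identifier of its first occurrence. The function name and return shape are invented for the example.

```python
def build_code_dictionary(tokens):
    """Correlate tokens with row identifiers, then drop duplicates.

    Each token is correlated with the row identifier of its first
    occurrence, so identical tokens share one identifier.
    Returns (dictionary, encoded_column).
    """
    first_row = {}
    correlations = []
    for row_id, token in enumerate(tokens):
        # A repeated token reuses the row identifier of its first occurrence.
        code = first_row.setdefault(token, row_id)
        correlations.append((token, code))
    # Remove duplicate correlations while preserving first-seen order.
    dictionary = list(dict.fromkeys(correlations))
    encoded = [first_row[token] for token in tokens]
    return dictionary, encoded
```

For the column `["a", "b", "a", "c", "b"]` this yields the dictionary `[("a", 0), ("b", 1), ("c", 3)]` and the encoded column `[0, 1, 0, 3, 1]`.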
