Patent search ap:("Oracle International Corporation") AND inv:"Vijayalakshmi Krishnamurthy" Page 3

21.

Invention application
Integrating Data Quality Analyses For Modeling Metrics (Active)

Publication No.: US20250005456A1

Publication date: 2025-01-02

Application No.: US18766438

Application date: 2024-07-08

Applicant: Oracle International Corporation

Inventor: Amit Vaid, Vijayalakshmi Krishnamurthy

IPC: G06N20/00, G06F18/21, G06F18/2113

Abstract: Techniques for generating a composite score for data quality are disclosed. Univariate analysis is performed on a plurality of data points corresponding to each of a first feature, a second feature, and a third feature of a data set. The univariate analysis includes at least a first type of analysis generating a first score having a first range of possible values, and a second type of analysis generating a second score having a second range of possible values. A first quality score is computed for the data values for the first, second, and third features based on a normalized first score and a normalized second score. Machine learning is performed on the data points corresponding to one or both of the first feature and the second feature having a first quality score above a threshold value to model the third feature.
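The scoring flow in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the abstract does not say which univariate analyses are used or how the normalized scores are combined, so the two analyses (`completeness`, `outlier_count`), the min-max normalization, the simple average, and all function names here are assumptions.

```python
from statistics import mean, pstdev

def completeness(values):
    """Assumed first analysis: fraction of non-missing values, range [0, 1]."""
    return sum(v is not None for v in values) / len(values)

def outlier_count(values, z=2.0):
    """Assumed second analysis: count of points beyond z std devs, range [0, n]."""
    present = [v for v in values if v is not None]
    if len(present) < 2:
        return 0
    mu, sigma = mean(present), pstdev(present)
    if sigma == 0:
        return 0
    return sum(abs(v - mu) > z * sigma for v in present)

def min_max(score, lo, hi):
    """Rescale a score from its own range [lo, hi] to [0, 1]."""
    return (score - lo) / (hi - lo) if hi > lo else 0.0

def quality_score(values):
    """Combine the two normalized scores (assumed: plain average)."""
    n = len(values)
    s1 = min_max(completeness(values), 0.0, 1.0)
    # Invert so that fewer outliers gives a higher score.
    s2 = 1.0 - min_max(outlier_count(values), 0, n)
    return (s1 + s2) / 2

def select_features(data, target, threshold):
    """Keep candidate features whose quality score exceeds the threshold."""
    return [name for name, vals in data.items()
            if name != target and quality_score(vals) > threshold]

data = {
    "first": [1, 2, 3, 4],
    "second": [None, None, None, 1],
    "third": [1, 0, 1, 0],  # the feature to be modeled
}
selected = select_features(data, target="third", threshold=0.8)
```

Only the features in `selected` would then be passed to whatever learning step models the third feature; that step is left out because the abstract does not describe it.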

22.

Granted invention patent
Selecting an algorithm for analyzing a data set based on the distribution of the data set (Active)

Publication No.: US11568179B2

Publication date: 2023-01-31

Application No.: US16438969

Application date: 2019-06-12

Applicant: Oracle International Corporation

Inventor: Joseph Marc Posner, Sunil Kumar Kunisetty, Mohan Kamath, Nickolas Kavantzas, Sachin Bhatkar, Sergey Troshin, Sujay Sarkhel, Shivakumar Subramanian Govindarajapuram, Vijayalakshmi Krishnamurthy

IPC: G06N20/00, G06F16/00, G06K9/62, G06F16/9537, G06F16/957, G06F16/58, G06N5/04, G06N5/02

Abstract: A model analyzer may receive a representative data set as input and select one of a plurality of analytic models to perform the analysis. Before deciding which model to use, the model may be trained, and the trained model evaluated for accuracy. However, some models are known to behave poorly when the training data is distributed in a particular way. Thus, the cost of training a model and evaluating the trained model can be avoided by first analyzing the distribution of the representative data. Identifying the representative data distribution allows ruling out use of models for which the distribution of the representative data is unsuitable. Only models that may be compatible with the distribution of the representative data may be trained and evaluated for accuracy. The most accurate trained model whose accuracy meets an accuracy threshold may be selected to analyze subsequently received data related to the representative data.
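The selection logic in this abstract can be sketched as a filter-then-evaluate loop. This is an illustration under stated assumptions, not the patent's implementation: the abstract does not say how the distribution is identified, so the skewness test, the `"skewed"`/`"symmetric"` labels, the `Candidate` type, and the stand-in `train_and_score` callables are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Callable, Optional, Sequence

def skewness(xs: Sequence[float]) -> float:
    """Population skewness (third standardized moment)."""
    mu, sigma = mean(xs), pstdev(xs)
    if sigma == 0:
        return 0.0
    return mean(((x - mu) / sigma) ** 3 for x in xs)

def classify_distribution(xs: Sequence[float]) -> str:
    """Assumed distribution check: label data by absolute skewness."""
    return "skewed" if abs(skewness(xs)) > 1.0 else "symmetric"

@dataclass
class Candidate:
    name: str
    unsuitable_for: frozenset  # distribution labels this model handles poorly
    train_and_score: Callable[[Sequence[float]], float]  # returns accuracy

def select_model(candidates, data, accuracy_threshold) -> Optional[Candidate]:
    shape = classify_distribution(data)
    scored = []
    for c in candidates:
        if shape in c.unsuitable_for:
            continue  # ruled out up front, so no training cost is paid
        scored.append((c.train_and_score(data), c))
    passing = [(acc, c) for acc, c in scored if acc >= accuracy_threshold]
    if not passing:
        return None
    return max(passing, key=lambda pair: pair[0])[1]

trained = []  # records which candidates were actually trained

def fake_trainer(name, accuracy):
    """Stand-in for real training and evaluation; returns a fixed accuracy."""
    def run(data):
        trained.append(name)
        return accuracy
    return run

candidates = [
    Candidate("model_a", frozenset({"skewed"}), fake_trainer("model_a", 0.99)),
    Candidate("model_b", frozenset(), fake_trainer("model_b", 0.90)),
    Candidate("model_c", frozenset(), fake_trainer("model_c", 0.70)),
]
data = [1, 1, 1, 1, 1, 1, 1, 1, 1, 100]  # heavily right-skewed sample
chosen = select_model(candidates, data, accuracy_threshold=0.8)
```

Here `model_a` is never trained because the sample is classified as skewed, which is the cost saving the abstract describes; of the remaining candidates, the most accurate one that meets the threshold is returned.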
