Patent search ap:("Oracle International Corporation") AND inv:"Marija Nikolic" Page 1

1.

Invention application
One-Hot Encoder Using Lazy Evaluation Of Relational Statements (In force)

Publication (announcement) No.: US20250077519A1

Publication (announcement) date: 2025-03-06

Application No.: US18955689

Application date: 2024-11-21

Applicant: Oracle International Corporation

Inventor: Felix Schmidt, Matteo Casserini, Milos Vasic, Marija Nikolic

IPC: G06F16/2453, G06F16/2458

Abstract: A method and one or more non-transitory storage media are provided to train and implement a one-hot encoder. During a training phase, computation of an encoder state is performed by executing a set of relational statements to extract unique categories in a first training data set, associate each unique category with a unique index, and generate a one-hot encoding for each unique category. The set of relational statements are executed by a query optimization engine. Execution of the set of relational statements is postponed until a result of each relational statement is needed, and the query optimization engine implements one or more optimizations when executing the set of relational statements. During an encoding phase, a set of categorical features in a second training data set are encoded based on the encoder state to form a set of encoded categorical features.
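
Illustrative sketch (not part of the patent record): the following Python example shows one way the idea in this abstract could look in miniature. It assumes the standard-library sqlite3 module as a stand-in for the query optimization engine, and the names LazyOneHotEncoder, fit, transform and executions are hypothetical. fit() only records a relational statement; the statement runs the first time its result is needed.

```python
import sqlite3


class LazyOneHotEncoder:
    """Toy encoder; sqlite3 stands in for the query engine (assumption)."""

    def __init__(self, conn, table, column):
        self.conn = conn
        self.table = table
        self.column = column
        self._pending_sql = None   # statement recorded by fit(), not yet run
        self._index = None         # category -> unique index (encoder state)
        self.executions = 0        # how many times the statement actually ran

    def fit(self):
        # Record the relational statement only; execution is postponed.
        self._pending_sql = (
            f"SELECT DISTINCT {self.column} FROM {self.table} "
            f"ORDER BY {self.column}"
        )
        return self

    def _state(self):
        # Execute the recorded statement the first time its result is needed.
        if self._index is None:
            if self._pending_sql is None:
                raise RuntimeError("fit() must be called before transform()")
            rows = self.conn.execute(self._pending_sql).fetchall()
            self.executions += 1
            self._index = {cat: i for i, (cat,) in enumerate(rows)}
        return self._index

    def transform(self, values):
        index = self._state()
        width = len(index)
        encoded = []
        for value in values:
            row = [0] * width
            if value in index:     # unseen categories encode as all zeros
                row[index[value]] = 1
            encoded.append(row)
        return encoded


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE train (color TEXT)")
conn.executemany(
    "INSERT INTO train VALUES (?)",
    [("red",), ("green",), ("red",), ("blue",)],
)

encoder = LazyOneHotEncoder(conn, "train", "color").fit()
executions_after_fit = encoder.executions      # still 0: nothing has run
encoded = encoder.transform(["green", "blue", "purple"])
```

In this sketch the unique index is assigned in Python from the sorted query result; a fuller version could push that step into the relational statement as well.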

2.

Invention publication
PROFILE-ENRICHED EXPLANATIONS OF DATA-DRIVEN MODELS (Under examination, published)

Publication (announcement) No.: US20240126798A1

Publication (announcement) date: 2024-04-18

Application No.: US18203195

Application date: 2023-05-30

Applicant: Oracle International Corporation

Inventor: Arno Schneuwly, Desislava Wagenknecht-Dimitrova, Felix Schmidt, Marija Nikolic, Matteo Casserini, Milos Vasic, Renata Khasanova

IPC: G06F16/34, G06F16/335, G06F40/186

CPC classification number: G06F16/345, G06F16/335, G06F40/186

Abstract: In an embodiment, a computer stores, in memory or storage, many explanation profiles, many log entries, and definitions of many features that log entries contain. Some features may contain a logic statement such as a database query, and these are specially aggregated based on similarity. Based on the entity specified by an explanation profile, statistics are materialized for some or all features. Statistics calculation may be based on scheduled batches of log entries or a stream of live log entries. At runtime, an inference that is based on a new log entry is received. Based on an entity specified in the new log entry, a particular explanation profile is dynamically selected. Based on the new log entry and statistics of features for the selected explanation profile, a local explanation of the inference is generated. In an embodiment, an explanation text template is used to generate the local explanation.
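
Illustrative sketch (not part of the patent record): a minimal Python example of per-entity statistics feeding a text template, loosely following this abstract. The log fields (user, rows_read), the chosen statistics (mean, max), the template wording and all function names are assumptions made for illustration.

```python
from collections import defaultdict
import statistics

# Hypothetical historical log entries; "user" plays the role of the entity.
log_entries = [
    {"user": "alice", "rows_read": 10},
    {"user": "alice", "rows_read": 12},
    {"user": "alice", "rows_read": 14},
    {"user": "bob", "rows_read": 100},
    {"user": "bob", "rows_read": 300},
]


def materialize_profiles(entries, entity_key, feature):
    """Compute per-entity statistics for one feature from a batch of entries."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry[entity_key]].append(entry[feature])
    return {
        entity: {"mean": statistics.fmean(values), "max": max(values)}
        for entity, values in grouped.items()
    }


# Hypothetical explanation text template.
TEMPLATE = "{feature}={value} for {entity}; historical mean {mean:.1f}, max {max}"


def explain(entry, profiles, entity_key, feature):
    """Select the profile by the entry's entity and fill in the template."""
    entity = entry[entity_key]
    stats = profiles[entity]
    return TEMPLATE.format(
        feature=feature,
        value=entry[feature],
        entity=entity,
        mean=stats["mean"],
        max=stats["max"],
    )


profiles = materialize_profiles(log_entries, "user", "rows_read")
new_entry = {"user": "alice", "rows_read": 500}
explanation = explain(new_entry, profiles, "user", "rows_read")
```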

3.

Invention publication
TRACE REPRESENTATION LEARNING (Under examination, published)

Publication (announcement) No.: US20230376743A1

Publication (announcement) date: 2023-11-23

Application No.: US17748226

Application date: 2022-05-19

Applicant: Oracle International Corporation

Inventor: Marija Nikolic, Nikola Milojkovic, Arno Schneuwly, Matteo Casserini, Milos Vasic, Renata Khasanova, Felix Schmidt

IPC: G06N3/08, G06N20/00

CPC classification number: G06N3/08, G06N3/088, G06N20/00

Abstract: The present invention avoids overfitting in deep neural network (DNN) training by using multitask learning (MTL) and self-supervised learning (SSL) techniques when training a multi-branch DNN to encode a sequence. In an embodiment, a computer first trains the DNN to perform a first task. The DNN contains: a first encoder in a first branch, a second encoder in a second branch, and an interpreter layer that combines data from the first branch and the second branch. The DNN second trains to perform a second task. After the first and second trainings, production encoding and inferencing occur. The first encoder encodes a sparse feature vector into a dense feature vector from which an inference is inferred. In an embodiment, a sequence of log messages is encoded into an encoded trace. An anomaly detector infers whether the sequence is anomalous. In an embodiment, the log messages are database commands.

4.

Invention grant
One-hot encoder using lazy evaluation of relational statements (In force)

Publication (announcement) No.: US12182122B2

Publication (announcement) date: 2024-12-31

Application No.: US17964084

Application date: 2022-10-12

Applicant: Oracle International Corporation

Inventor: Felix Schmidt, Matteo Casserini, Milos Vasic, Marija Nikolic

IPC: G06F16/00, G06F16/2453, G06F16/2458

Abstract: A method and one or more non-transitory storage media are provided to train and implement a one-hot encoder. During a training phase, computation of an encoder state is performed by executing a set of relational statements to extract unique categories in a first training data set, associate each unique category with a unique index, and generate a one-hot encoding for each unique category. The set of relational statements are executed by a query optimization engine. Execution of the set of relational statements is postponed until a result of each relational statement is needed, and the query optimization engine implements one or more optimizations when executing the set of relational statements. During an encoding phase, a set of categorical features in a second training data set are encoded based on the encoder state to form a set of encoded categorical features.

5.

Invention publication
ANOMALY SCORE NORMALISATION BASED ON EXTREME VALUE THEORY (Under examination, published)

Publication (announcement) No.: US20230368054A1

Publication (announcement) date: 2023-11-16

Application No.: US17745103

Application date: 2022-05-16

Applicant: Oracle International Corporation

Inventor: Marija Nikolic, Matteo Casserini, Arno Schneuwly, Nikola Milojkovic, Milos Vasic, Renata Khasanova, Felix Schmidt

IPC: G06N7/00, G06N20/00

CPC classification number: G06N7/005, G06N20/00

Abstract: The present invention relates to threshold estimation and calibration for anomaly detection. Herein are machine learning (ML) and extreme value theory (EVT) techniques for normalizing and thresholding anomaly scores without presuming a values distribution. In an embodiment, a computer receives many unnormalized anomaly scores and, according to peak over threshold (POT), selects a highest subset of the unnormalized anomaly scores that exceed a tail threshold. Based on the highest subset of the unnormalized anomaly scores, parameters of a probability density function are trained according to EVT. After training and in a production environment, a normalized anomaly score is generated based on an unnormalized anomaly score and the trained parameters of the probability density function. Anomaly detection compares the normalized anomaly score to an optimized anomaly threshold.
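
Illustrative sketch (not part of the patent record): a small Python example of peak-over-threshold normalization in the spirit of this abstract. It assumes a generalized Pareto tail fitted by the method of moments, which is a simple stand-in and not necessarily the training procedure the application describes. The 0.9 tail quantile, the 0.99 anomaly threshold, the synthetic scores and all function names are hypothetical.

```python
import bisect
import math
import random
import statistics


def fit_pot(scores, tail_quantile=0.9):
    """Fit a generalized Pareto tail to the scores above a tail threshold."""
    ordered = sorted(scores)
    threshold = ordered[int(tail_quantile * len(ordered))]
    excesses = [s - threshold for s in ordered if s > threshold]
    if len(excesses) < 2:
        raise ValueError("need at least two scores above the tail threshold")
    mean = statistics.fmean(excesses)
    ratio = mean * mean / statistics.variance(excesses)
    return {
        "ordered": ordered,
        "threshold": threshold,
        "tail_fraction": len(excesses) / len(ordered),
        "shape": 0.5 * (1.0 - ratio),         # method-of-moments estimate
        "scale": 0.5 * mean * (ratio + 1.0),  # method-of-moments estimate
    }


def tail_survival(excess, shape, scale):
    """P(excess over threshold > given value) under the fitted Pareto tail."""
    if abs(shape) < 1e-12:
        return math.exp(-excess / scale)
    base = 1.0 + shape * excess / scale
    return base ** (-1.0 / shape) if base > 0 else 0.0


def normalize(score, model):
    """Map a raw anomaly score to an estimated cumulative probability."""
    if score <= model["threshold"]:
        # Below the tail threshold, use the empirical distribution.
        rank = bisect.bisect_right(model["ordered"], score)
        return rank / len(model["ordered"])
    excess = score - model["threshold"]
    tail = tail_survival(excess, model["shape"], model["scale"])
    return 1.0 - model["tail_fraction"] * tail


rng = random.Random(0)
training_scores = [rng.expovariate(1.0) for _ in range(1000)]  # synthetic data
model = fit_pot(training_scores)

normalized = [normalize(s, model) for s in (0.1, 1.0, 3.0, 10.0)]
ANOMALY_THRESHOLD = 0.99  # hypothetical; the abstract mentions an optimized one
flags = [n > ANOMALY_THRESHOLD for n in normalized]
```

The normalized score is monotone in the raw score and bounded in [0, 1], so one fixed threshold can be applied regardless of the raw score scale.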

6.

Invention application
ENCODING LOG-SPECIFIC ATTRIBUTES WITH NLP MODELS (In force)

Publication (announcement) No.: US20250021759A1

Publication (announcement) date: 2025-01-16

Application No.: US18219763

Application date: 2023-07-10

Applicant: Oracle International Corporation

Inventor: Samuele Meta, Aneesh Dahiya, Felix Schmidt, Marija Nikolic, Matteo Casserini, Milos Vasic

IPC: G06F40/284, G06F11/34

Abstract: Herein is natural language processing (NLP) to detect an anomalous log entry using a language model that infers an encoding of the log entry from novel generation of numeric lexical tokens. In an embodiment, a computer extracts an original numeric lexical token from a variable sized log entry. Substitute numeric lexical token(s) that represent the original numeric lexical token are generated, such as with a numeric exponent or by trigonometry. The log entry does not contain the substitute numeric lexical token. A novel sequence of lexical tokens that represents the log entry and contains the substitute numeric lexical token is generated. The novel sequence of lexical tokens does not contain the original numeric lexical token. The computer hosts and operates a machine learning model that generates, based on the novel sequence of lexical tokens that represents the log entry, an inference that characterizes the log entry with unprecedented accuracy.
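
Illustrative sketch (not part of the patent record): a short Python example of replacing numeric tokens in a log entry with substitute tokens derived from a numeric exponent, one of the options this abstract mentions. The token format (<NUM_E3>), whitespace tokenization and the function names are assumptions.

```python
import math
import re

# Matches a whole token that is a plain integer or decimal number.
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def substitute_token(text):
    """Return a substitute token encoding the base-10 exponent of a number."""
    value = abs(float(text))
    if value == 0:
        return "<NUM_ZERO>"
    exponent = math.floor(math.log10(value))
    return f"<NUM_E{exponent}>"


def tokenize(log_entry):
    """Split on whitespace and swap each numeric token for its substitute."""
    tokens = []
    for word in log_entry.split():
        if NUMBER.fullmatch(word):
            tokens.append(substitute_token(word))
        else:
            tokens.append(word)
    return tokens


tokens = tokenize("query took 1532 ms for 42 rows")
```

The resulting sequence contains the substitute tokens but not the original numbers, so a language model sees a small, fixed numeric vocabulary.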

7.

Invention application
GENERAL PURPOSE SQL REPRESENTATION MODEL (In force)

Publication (announcement) No.: US20240370429A1

Publication (announcement) date: 2024-11-07

Application No.: US18143776

Application date: 2023-05-05

Applicant: Oracle International Corporation

Inventor: Aneesh Dahiya, Matteo Casserini, Marija Nikolic, Milos Vasic, Samuele Meta, Nikola Milojkovic, Felix Schmidt

IPC: G06F16/2452, G06N3/0455, G06N3/08

Abstract: In an embodiment, a computer generates sentence fingerprints that represent respective pluralities of similar database statements. Based on the sentence fingerprints, an artificial neural network is trained. After training the artificial neural network on a large corpus of fingerprinted database statements, the artificial neural network is ready to be used for zero-shot transfer learning to a downstream task in training. Database statement fingerprinting also anonymizes literal values in raw SQL statements. The trained artificial neural network can be safely reused without risk of disclosing sensitive data in the artificial neural network's vocabulary. After training, the artificial neural network infers a fixed-size encoded database statement from a new database statement. Based on the fixed-size encoded database statement, the new database statement is detected as anomalous, which increases database security and preserves database throughput by not executing the anomalous database statement.
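
Illustrative sketch (not part of the patent record): a minimal Python example of statement fingerprinting that anonymizes literal values, so that similar statements share one fingerprint. The regular expressions, the "?" placeholder and the upper-casing step are assumptions; a production system would more likely use a real SQL parser.

```python
import re
from collections import defaultdict

# Single-quoted SQL string literal, with '' as an escaped quote.
STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")
# Standalone integer or decimal literal (not digits inside an identifier).
NUMBER_LITERAL = re.compile(r"\b\d+(?:\.\d+)?\b")


def fingerprint(sql):
    """Replace literals with '?', collapse whitespace, and normalize case."""
    text = STRING_LITERAL.sub("?", sql)
    text = NUMBER_LITERAL.sub("?", text)
    text = re.sub(r"\s+", " ", text).strip()
    return text.upper()


statements = [
    "SELECT name FROM users WHERE id = 42 AND city = 'Zurich'",
    "select name  from users where id = 7 and city = 'O''Brien'",
    "DELETE FROM t1 WHERE col_2 > 3.5",
]

# Group raw statements by fingerprint.
groups = defaultdict(list)
for statement in statements:
    groups[fingerprint(statement)].append(statement)
```

Here the first two statements fall into the same group, and no literal value such as a city name survives in the fingerprint.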
