BACKPROPAGATION-BASED EXPLAINABILITY METHOD FOR UNSUPERVISED ANOMALY DETECTION MODELS BASED ON AUTOENCODER ARCHITECTURES

    Publication No.: US20240037372A1

    Publication Date: 2024-02-01

    Application No.: US17873491

    Filing Date: 2022-07-26

    CPC classification number: G06N3/0454 G06N3/088 G06N3/084

    Abstract: The present invention relates to machine learning (ML) explainability (MLX). Herein are techniques for a novel relevance propagation rule in layer-wise relevance propagation (LRP) for feature attribution-based explanation (ABX) for a reconstructive autoencoder. In an embodiment, a reconstruction layer of a reconstructive neural network in a computer generates a reconstructed tuple that is based on an original tuple that contains many features. A reconstruction residual cost function calculates a reconstruction error that measures a difference between the original tuple and the reconstructed tuple. Applied to the reconstruction error is a novel reconstruction relevance propagation rule that assigns a respective reconstruction relevance to each reconstruction neuron in the reconstruction layer. Based on the reconstruction relevance of the reconstruction neurons, a respective feature relevance of each feature is determined, from which an ABX explanation may be automatically generated.
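
    The abstract names the reconstruction relevance propagation rule but does not spell it out here. Below is a minimal Python/NumPy sketch of the general idea: assign each reconstruction neuron a relevance equal to its share of the squared reconstruction error, then push that relevance back toward the inputs with a standard epsilon-LRP step. The per-neuron error share, the toy decoder, and all shapes are illustrative assumptions, not the patented rule.

```python
import numpy as np

def reconstruction_relevance(x, x_hat):
    # Assumed rule: each reconstruction neuron receives relevance equal to
    # its share of the total squared reconstruction error.
    residual = (x - x_hat) ** 2
    total = residual.sum()
    return residual / total if total > 0 else np.zeros_like(residual)

def lrp_linear(relevance_out, activations_in, weights, eps=1e-6):
    # Standard epsilon-LRP through one dense layer: redistribute each output
    # neuron's relevance to its inputs in proportion to their contributions.
    z = activations_in @ weights            # pre-activations, shape (out,)
    z = z + eps * np.sign(z)                # stabilizer avoids division by zero
    s = relevance_out / z
    return activations_in * (weights @ s)   # relevance of inputs, shape (in,)

# Toy decoder: hidden activations -> reconstruction (weights are random here).
rng = np.random.default_rng(0)
x = np.array([0.2, 0.9, 0.1, 0.4])          # original tuple with 4 features
h = np.array([0.5, 0.3])                    # hidden activations for x
W_dec = rng.random((2, 4))                  # decoder weights (hidden -> reconstruction)
x_hat = h @ W_dec                           # reconstructed tuple

r_reconstruction = reconstruction_relevance(x, x_hat)
r_hidden = lrp_linear(r_reconstruction, h, W_dec)  # one step back toward the features
print(r_reconstruction, r_hidden)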

    SUPER-FEATURES FOR EXPLAINABILITY WITH PERTURBATION-BASED APPROACHES

    Publication No.: US20230334343A1

    Publication Date: 2023-10-19

    Application No.: US17719617

    Filing Date: 2022-04-13

    CPC classification number: G06N5/04

    Abstract: In an embodiment, a computer hosts a machine learning (ML) model that infers a particular inference for a particular tuple that is based on many features. The features are grouped into predefined super-features that each contain a disjoint (i.e. nonintersecting, mutually exclusive) subset of features. For each super-feature, the computer: a) randomly selects many permuted values from original values of the super-feature in original tuples, b) generates permuted tuples that are based on the particular tuple and a respective permuted value, and c) causes the ML model to infer a respective permuted inference for each permuted tuple. A surrogate model is trained based on the permuted inferences. For each super-feature, a respective importance of the super-feature is calculated based on the surrogate model. Super-feature importances may be used to rank super-features by influence and/or generate a local ML explainability (MLX) explanation.
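
    As a rough illustration of that workflow, the Python sketch below uses a LIME-style grouped perturbation: each super-feature of the explained tuple is either kept or jointly replaced with the values of a randomly chosen background tuple, the hosted model scores the perturbed tuples, and a linear surrogate fitted on the keep/replace mask yields one importance per super-feature. The binary-mask sampling, the linear surrogate, and the scikit-learn interface are assumptions, not the patent's exact procedure.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def super_feature_importances(model, X_background, x, super_features,
                              n_samples=200, rng=None):
    # super_features: list of column-index lists, one disjoint group per entry
    rng = rng or np.random.default_rng(0)
    masks = rng.integers(0, 2, size=(n_samples, len(super_features)))  # 1 = keep original
    perturbed = np.tile(x, (n_samples, 1))
    for i, mask in enumerate(masks):
        donor = X_background[rng.integers(len(X_background))]  # source of permuted values
        for g, cols in enumerate(super_features):
            if mask[g] == 0:
                perturbed[i, cols] = donor[cols]                # replace the whole group jointly
    permuted_inferences = model.predict(perturbed)              # model scores every permuted tuple
    surrogate = LinearRegression().fit(masks, permuted_inferences)
    return dict(enumerate(surrogate.coef_))                     # importance per super-feature
```

    The returned importances can then be sorted to rank super-features by influence or to fill a local explanation.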

    VALIDATION METRIC FOR ATTRIBUTION-BASED EXPLANATION METHODS FOR ANOMALY DETECTION MODELS

    Publication No.: US20240037383A1

    Publication Date: 2024-02-01

    Application No.: US17873482

    Filing Date: 2022-07-26

    CPC classification number: G06N3/08

    Abstract: Herein are machine learning (ML) explainability (MLX) techniques for calculating and using a novel fidelity metric for assessing and comparing explainers that are based on feature attribution. In an embodiment, a computer generates many anomalous tuples from many non-anomalous tuples. Each anomalous tuple contains a perturbed value of a respective perturbed feature. For each anomalous tuple, a respective explanation is generated that identifies a respective identified feature as a cause of the anomalous tuple being anomalous. A fidelity metric is calculated by counting correct explanations for the anomalous tuples whose identified feature is the perturbed feature. Tuples may represent entries in an activity log such as structured query language (SQL) statements in a console output log of a database server. This approach herein may gauge the quality of a set of MLX explanations for why log entries or network packets are characterized as anomalous by an intrusion detector or other anomaly detector.
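
    The metric itself reduces to counting. A minimal Python sketch follows, assuming each synthetic anomaly is paired with the feature that was perturbed to create it and that the explainer returns the single feature it identifies as the cause:

```python
def explanation_fidelity(anomalies, explainer):
    # anomalies: iterable of (anomalous_tuple, perturbed_feature) pairs
    # explainer: callable returning the feature it identifies as the cause
    correct = 0
    for anomalous_tuple, perturbed_feature in anomalies:
        if explainer(anomalous_tuple) == perturbed_feature:
            correct += 1                     # explanation counted as correct
    return correct / len(anomalies)          # fidelity in [0, 1]
```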

    Machine learning-based DNS request string representation with hash replacement

    Publication No.: US11784964B2

    Publication Date: 2023-10-10

    Application No.: US17197375

    Filing Date: 2021-03-10

    CPC classification number: H04L61/4511 G06N20/00 H04L41/16 G06F40/30

    Abstract: Techniques are described herein for using machine learning to learn vector representations of DNS requests such that the resulting embeddings represent the semantics of the DNS requests as a whole. Techniques described herein perform pre-processing of tokenized DNS request strings in which hashes, which are long and relatively random strings of characters, are detected in DNS request strings and each detected hash token is replaced with a placeholder token. A vectorizing ML model is trained using the pre-processed training dataset in which hash tokens have been replaced. Embeddings for the DNS tokens are derived from an intermediate layer of the vectorizing ML model. The encoding application creates final vector representations for each DNS request string by generating a weighted summation of the embeddings of all of the tokens in the DNS request string. Because of hash replacement, the resulting DNS request embeddings reflect semantics of the hashes as a group.
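
    A small Python sketch of the pre-processing and the weighted summation follows. The hex regular expression for spotting hash-like tokens, the uniform default weights, and the toy embedding table are assumptions; in the described technique the token embeddings come from an intermediate layer of the trained vectorizing model.

```python
import re
import numpy as np

HASH_TOKEN = "<HASH>"
HASH_RE = re.compile(r"^[0-9a-f]{16,}$", re.IGNORECASE)  # assumed hash heuristic

def preprocess(dns_request):
    # Tokenize on dots and replace hash-like tokens with one shared placeholder
    # so every hash maps to the same learned embedding.
    tokens = dns_request.lower().split(".")
    return [HASH_TOKEN if HASH_RE.match(t) else t for t in tokens]

def request_embedding(tokens, token_embeddings, weights=None):
    # Weighted summation of per-token embeddings; uniform weights by default.
    weights = weights or {}
    return np.sum([weights.get(t, 1.0) * token_embeddings[t] for t in tokens], axis=0)

# Usage with made-up 3-dimensional embeddings.
emb = {HASH_TOKEN: np.array([0.1, 0.0, 0.2]),
       "example":  np.array([0.5, 0.1, 0.0]),
       "com":      np.array([0.0, 0.3, 0.1])}
tokens = preprocess("0a1b2c3d4e5f60718293a4b5.example.com")
print(tokens)                               # ['<HASH>', 'example', 'com']
print(request_embedding(tokens, emb))
```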

    Textual explanations for abstract syntax trees with scored nodes

    Publication No.: US12260306B2

    Publication Date: 2025-03-25

    Application No.: US17891350

    Filing Date: 2022-08-19

    Abstract: Herein is a machine learning (ML) explainability (MLX) approach in which a natural language explanation is generated based on analysis of a parse tree such as for a suspicious database query or web browser JavaScript. In an embodiment, a computer selects, based on a respective relevance score for each non-leaf node in a parse tree of a statement, a relevant subset of non-leaf nodes. The non-leaf nodes are grouped in the parse tree into groups that represent respective portions of the statement. Based on a relevant subset of the groups that contain at least one non-leaf node in the relevant subset of non-leaf nodes, a natural language explanation of why the statement is anomalous is generated.
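
    A minimal Python sketch of the selection and templating steps, assuming relevance scores are already attached to the parse-tree nodes and that each node knows which portion (group) of the statement it belongs to; the threshold and the sentence template are illustrative only:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                         # grammar rule, e.g. "where_clause"
    score: float                       # relevance score from an upstream attribution step
    group: str                         # portion of the statement this node belongs to
    children: list = field(default_factory=list)

def explain(nodes, threshold=0.5):
    # Keep non-leaf nodes whose relevance clears the threshold, collect the
    # statement portions they belong to, and emit one sentence per portion.
    relevant = [n for n in nodes if n.children and n.score >= threshold]
    groups = sorted({n.group for n in relevant})
    return " ".join(f"The {g} of the statement contributed strongly to the anomaly."
                    for g in groups)

nodes = [Node("where_clause", 0.9, "WHERE clause",
              children=[Node("comparison", 0.2, "WHERE clause")]),
         Node("select_list", 0.1, "SELECT list",
              children=[Node("column", 0.05, "SELECT list")])]
print(explain(nodes))  # The WHERE clause of the statement contributed strongly to the anomaly.
```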

    EFFICIENT DATA DISTRIBUTION PRESERVING TRAINING PARADIGM

    Publication No.: US20240419943A1

    Publication Date: 2024-12-19

    Application No.: US18209024

    Filing Date: 2023-06-13

    Abstract: A computer performs deduplication of an original training corpus for maintaining accuracy of accelerated training of a reconstructive or other machine learning (ML) model. Distinct multidimensional points are detected in the original training corpus that contains duplicates. Based on duplicates in the original training corpus, a respective observed frequency of each distinct multidimensional point is increased. In a reconstructive embodiment and based on a particular distinct multidimensional point as input, a reconstruction of the particular distinct multidimensional point is generated by a reconstructive ML model. Based on increasing the observed frequency of the particular distinct multidimensional point, a scaled error of the reconstruction of the particular distinct multidimensional point is increased. Based on the scaled error of the reconstruction of the particular distinct multidimensional point, accuracy of the reconstructive model is increased. In an embodiment, the reconstructive ML model is an artificial neural network that is a denoising autoencoder that detects anomalous database statements.
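
    The core bookkeeping is compact. A Python sketch follows, assuming the scaled error is a frequency-weighted mean squared reconstruction error so that training on the deduplicated corpus still reflects the original data distribution; the exact scaling in the application may differ:

```python
import collections
import numpy as np

def deduplicate(corpus):
    # Collapse duplicates into distinct multidimensional points plus their
    # observed frequencies in the original training corpus.
    counts = collections.Counter(map(tuple, corpus))
    points = np.array(list(counts.keys()), dtype=float)
    frequencies = np.array(list(counts.values()), dtype=float)
    return points, frequencies

def scaled_reconstruction_loss(x, x_hat, frequencies):
    # Weight each distinct point's reconstruction error by how often it was
    # observed, so frequent points still dominate the loss after deduplication.
    per_point = ((x - x_hat) ** 2).mean(axis=1)
    return (frequencies * per_point).sum() / frequencies.sum()
```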

    PROFILE-ENRICHED EXPLANATIONS OF DATA-DRIVEN MODELS

    Publication No.: US20240126798A1

    Publication Date: 2024-04-18

    Application No.: US18203195

    Filing Date: 2023-05-30

    CPC classification number: G06F16/345 G06F16/335 G06F40/186

    Abstract: In an embodiment, a computer stores, in memory or storage, many explanation profiles, many log entries, and definitions of many features that log entries contain. Some features may contain a logic statement such as a database query, and these are specially aggregated based on similarity. Based on the entity specified by an explanation profile, statistics are materialized for some or all features. Statistics calculation may be based on scheduled batches of log entries or a stream of live log entries. At runtime, an inference that is based on a new log entry is received. Based on an entity specified in the new log entry, a particular explanation profile is dynamically selected. Based on the new log entry and statistics of features for the selected explanation profile, a local explanation of the inference is generated. In an embodiment, an explanation text template is used to generate the local explanation.
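
    A simplified Python sketch of the runtime path, assuming profiles are keyed by entity, feature statistics are already materialized per profile, and the feature that deviates most from its profile statistics is the one surfaced in the template; all field names here are hypothetical:

```python
def local_explanation(log_entry, inference, profiles, statistics, template):
    profile = profiles[log_entry["entity"]]        # dynamic profile selection by entity
    stats = statistics[profile["id"]]              # pre-materialized feature statistics
    # Pick the feature that deviates most from its profile statistics.
    top_feature = max(profile["features"],
                      key=lambda f: abs(log_entry[f] - stats[f]["mean"]) / (stats[f]["std"] + 1e-9))
    return template.format(entity=log_entry["entity"], inference=inference,
                           feature=top_feature, observed=log_entry[top_feature],
                           typical=stats[top_feature]["mean"])

# Hypothetical explanation text template.
template = ("{entity} was flagged as {inference} because {feature} was {observed}, "
            "while its typical value is {typical}.")
```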

    SCORE PROPAGATION ON GRAPHS WITH DIFFERENT SUBGRAPH MAPPING STRATEGIES

    Publication No.: US20240070156A1

    Publication Date: 2024-02-29

    Application No.: US17893519

    Filing Date: 2022-08-23

    CPC classification number: G06F16/24575

    Abstract: Techniques for propagating scores in subgraphs are provided. In one technique, multiple path scores are stored, each path score associated with a path (or subgraph), of multiple paths, in a graph of nodes. The path scores may be generated by a machine-learned model. For each path score, a path that is associated with that path score is identified and nodes of that path are identified. For each identified node, a node score for that node is determined or computed based on the corresponding path score and the node score is stored in association with that node. Subsequently, for each node in a subset of the graph, multiple node scores that are associated with that node are identified and aggregated to generate a propagated score for that node. In a related technique, a propagated score of a node is used to compute a score for each leaf node of the node.
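
    The mapping and aggregation steps translate directly into a short Python sketch; aggregation by mean is an assumed strategy, since the abstract leaves the choice of aggregation open:

```python
import collections
import statistics

def propagate_scores(path_scores, paths):
    # Give every node on a path that path's score, then aggregate all the
    # scores a node received across paths into its propagated score.
    received = collections.defaultdict(list)
    for path_id, score in path_scores.items():
        for node in paths[path_id]:
            received[node].append(score)
    return {node: statistics.mean(scores) for node, scores in received.items()}

paths = {"p1": ["a", "b", "c"], "p2": ["b", "d"]}   # two paths sharing node "b"
path_scores = {"p1": 0.9, "p2": 0.3}                # e.g. produced by a machine-learned model
print(propagate_scores(path_scores, paths))         # node "b" averages 0.9 and 0.3
```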

    TEXTUAL EXPLANATIONS FOR ABSTRACT SYNTAX TREES WITH SCORED NODES

    Publication No.: US20240061997A1

    Publication Date: 2024-02-22

    Application No.: US17891350

    Filing Date: 2022-08-19

    CPC classification number: G06F40/205 G06N20/00

    Abstract: Herein is a machine learning (ML) explainability (MLX) approach in which a natural language explanation is generated based on analysis of a parse tree such as for a suspicious database query or web browser JavaScript. In an embodiment, a computer selects, based on a respective relevance score for each non-leaf node in a parse tree of a statement, a relevant subset of non-leaf nodes. The non-leaf nodes are grouped in the parse tree into groups that represent respective portions of the statement. Based on a relevant subset of the groups that contain at least one non-leaf node in the relevant subset of non-leaf nodes, a natural language explanation of why the statement is anomalous is generated.

    TRACE REPRESENTATION LEARNING

    Publication No.: US20230376743A1

    Publication Date: 2023-11-23

    Application No.: US17748226

    Filing Date: 2022-05-19

    CPC classification number: G06N3/08 G06N3/088 G06N20/00

    Abstract: The present invention avoids overfitting in deep neural network (DNN) training by using multitask learning (MTL) and self-supervised learning (SSL) techniques when training a multi-branch DNN to encode a sequence. In an embodiment, a computer first trains the DNN to perform a first task. The DNN contains: a first encoder in a first branch, a second encoder in a second branch, and an interpreter layer that combines data from the first branch and the second branch. The computer then trains the DNN to perform a second task. After the first and second trainings, production encoding and inferencing occur. The first encoder encodes a sparse feature vector into a dense feature vector from which an inference is inferred. In an embodiment, a sequence of log messages is encoded into an encoded trace. An anomaly detector infers whether the sequence is anomalous. In an embodiment, the log messages are database commands.
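
    A skeletal PyTorch sketch of the described shape: two encoder branches whose outputs an interpreter layer combines, with a task head that is swapped between the two training phases. Layer sizes, the concatenation, and the single-linear interpreter are assumptions for illustration, not the patented architecture.

```python
import torch
import torch.nn as nn

class TwoBranchEncoder(nn.Module):
    def __init__(self, sparse_dim=1000, dense_dim=32):
        super().__init__()
        self.encoder_a = nn.Sequential(nn.Linear(sparse_dim, dense_dim), nn.ReLU())
        self.encoder_b = nn.Sequential(nn.Linear(sparse_dim, dense_dim), nn.ReLU())
        self.interpreter = nn.Linear(2 * dense_dim, dense_dim)  # combines both branches
        self.task_head = nn.Linear(dense_dim, 1)                # replaced between tasks

    def forward(self, x):
        combined = torch.cat([self.encoder_a(x), self.encoder_b(x)], dim=-1)
        return self.task_head(self.interpreter(combined))

model = TwoBranchEncoder()
# Phase 1: train on the first (e.g. self-supervised) task; phase 2: swap
# model.task_head and train on the second task. In production, encoder_a alone
# turns a sparse trace vector into the dense vector fed to the anomaly detector.
dense = model.encoder_a(torch.zeros(1, 1000))
print(dense.shape)   # torch.Size([1, 32])
```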
