-
Publication No.: US12260306B2
Publication Date: 2025-03-25
Application No.: US17891350
Application Date: 2022-08-19
Applicant: Oracle International Corporation
Inventor: Kenyu Kobayashi , Arno Schneuwly , Renata Khasanova , Matteo Casserini , Felix Schmidt
Abstract: Herein is a machine learning (ML) explainability (MLX) approach in which a natural language explanation is generated based on analysis of a parse tree such as for a suspicious database query or web browser JavaScript. In an embodiment, a computer selects, based on a respective relevance score for each non-leaf node in a parse tree of a statement, a relevant subset of non-leaf nodes. The non-leaf nodes are grouped in the parse tree into groups that represent respective portions of the statement. Based on a relevant subset of the groups that contain at least one non-leaf node in the relevant subset of non-leaf nodes, a natural language explanation of why the statement is anomalous is generated.
-
Publication No.: US20240419943A1
Publication Date: 2024-12-19
Application No.: US18209024
Application Date: 2023-06-13
Applicant: Oracle International Corporation
Inventor: Renata Khasanova , Aneesh Dahiya , Felix Schmidt
IPC: G06N3/0455 , G06N3/084
Abstract: A computer performs deduplication of an original training corpus for maintaining accuracy of accelerated training of a reconstructive or other machine learning (ML) model. Distinct multidimensional points are detected in the original training corpus that contains duplicates. Based on duplicates in the original training corpus, a respective observed frequency of each distinct multidimensional point is increased. In a reconstructive embodiment and based on a particular distinct multidimensional point as input, a reconstruction of the particular distinct multidimensional point is generated by a reconstructive ML model. Based on increasing the observed frequency of the particular distinct multidimensional point, a scaled error of the reconstruction of the particular distinct multidimensional point is increased. Based on the scaled error of the reconstruction of the particular distinct multidimensional point, accuracy of the reconstructive model is increased. In an embodiment, the reconstructive ML model is an artificial neural network that is a denoising autoencoder that detects anomalous database statements.
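The deduplication and error-scaling steps in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the function names, the toy `reconstruct` callable, and the choice to multiply the squared error by the observed frequency are all assumptions made for the example.

```python
from collections import Counter

def deduplicate(corpus):
    """Return the distinct points (first-seen order) and their observed frequencies."""
    counts = Counter(corpus)          # Counter keeps insertion order
    distinct = list(counts)
    return distinct, [counts[p] for p in distinct]

def scaled_errors(distinct, freqs, reconstruct):
    """Squared reconstruction error per distinct point, scaled by its frequency.

    `reconstruct` is a hypothetical stand-in for the reconstructive model.
    """
    errors = []
    for point, freq in zip(distinct, freqs):
        recon = reconstruct(point)
        sq_err = sum((a - b) ** 2 for a, b in zip(point, recon))
        errors.append(freq * sq_err)  # assumed scaling: multiply by frequency
    return errors

corpus = [(1.0, 2.0), (3.0, 4.0), (1.0, 2.0), (1.0, 2.0)]
distinct, freqs = deduplicate(corpus)

# Toy "model" that reconstructs every coordinate 0.5 too low.
errs = scaled_errors(distinct, freqs, lambda p: tuple(x - 0.5 for x in p))
```

Training on two distinct points instead of four rows is what gives the speed-up, while the frequency weight keeps the duplicated point's influence on the loss.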
-
Publication No.: US12020131B2
Publication Date: 2024-06-25
Application No.: US17221212
Application Date: 2021-04-02
Applicant: Oracle International Corporation
Inventor: Saeid Allahdadian , Amin Suzani , Milos Vasic , Matteo Casserini , Andrew Brownsword , Felix Schmidt , Nipun Agarwal
IPC: G06N20/20 , G06N3/04 , G06N3/0442 , G06N3/045 , G06N3/0495 , G06N3/08 , G06N3/088 , G06N20/00
CPC classification number: G06N20/20 , G06N3/04 , G06N3/0495 , G06N3/08 , G06N3/088 , G06N3/0442 , G06N3/045 , G06N20/00
Abstract: Techniques are provided for sparse ensembling of unsupervised machine learning models. In an embodiment, the proposed architecture is composed of multiple unsupervised machine learning models that each produce a score as output and a gating network that analyzes the inputs and outputs of the unsupervised machine learning models to select an optimal ensemble of unsupervised machine learning models. The gating network is trained to choose a minimal number of the multiple unsupervised machine learning models whose scores are combined to create a final score that matches or closely resembles a final score that is computed using all the scores of the multiple unsupervised machine learning models.
-
Publication No.: US11947515B2
Publication Date: 2024-04-02
Application No.: US17752766
Application Date: 2022-05-24
Applicant: Oracle International Corporation
Inventor: Pit Fender , Felix Schmidt , Benjamin Schlegel
CPC classification number: G06F16/2282 , G06F7/08 , G06F16/212 , G06F16/221 , G06F16/258 , H03M7/3088
Abstract: Unsorted sparse dictionary encodings are transformed into unsorted-dense or sorted-dense dictionary encodings. Sparse domain codes have large gaps between codes that are adjacent in order. Unlike sparse codes, dense codes have smaller gaps between adjacent codes; consecutive codes are dense codes that have no gaps between adjacent codes. The techniques described herein are relational approaches that may be used to generate sparse composite codes and sorted codes.
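To make the sparse versus dense distinction concrete, here is a small sketch in plain Python. It is not the relational approach the patent describes. The function names are hypothetical, and reading "unsorted-dense" as "consecutive codes that keep the relative order of the sparse codes" is an assumption.

```python
def to_unsorted_dense(sparse_dict):
    """Consecutive codes that keep the relative order of the sparse codes."""
    ordered = sorted(sparse_dict.items(), key=lambda kv: kv[1])
    return {value: dense for dense, (value, _) in enumerate(ordered)}

def to_sorted_dense(sparse_dict):
    """Consecutive codes assigned in the sort order of the values themselves."""
    return {value: dense for dense, value in enumerate(sorted(sparse_dict))}

def remap(column, sparse_dict, dense_dict):
    """Re-encode a column of sparse codes with the dense dictionary."""
    sparse_to_dense = {sparse_dict[v]: dense_dict[v] for v in sparse_dict}
    return [sparse_to_dense[code] for code in column]

# Sparse dictionary: large gaps between codes, code order unrelated to value order.
sparse = {"pear": 17, "apple": 900, "fig": 42}
column = [900, 17, 17, 42]

unsorted_dense = to_unsorted_dense(sparse)  # pear=0, fig=1, apple=2
sorted_dense = to_sorted_dense(sparse)      # apple=0, fig=1, pear=2
recoded = remap(column, sparse, sorted_dense)
```

With the sorted-dense dictionary, comparing two codes gives the same answer as comparing the underlying values, which is the usual reason to want that form.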
-
Publication No.: US11704386B2
Publication Date: 2023-07-18
Application No.: US17199563
Application Date: 2021-03-12
Applicant: Oracle International Corporation
Inventor: Amin Suzani , Saeid Allahdadian , Milos Vasic , Matteo Casserini , Hamed Ahmadi , Felix Schmidt , Andrew Brownsword , Nipun Agarwal
IPC: G06F18/214 , G06N20/00 , G06V10/75 , G06F18/23
CPC classification number: G06F18/214 , G06F18/23 , G06N20/00 , G06V10/758
Abstract: Herein are feature extraction mechanisms that receive parsed log messages as inputs and transform them into numerical feature vectors for machine learning models (MLMs). In an embodiment, a computer extracts fields from a log message. Each field specifies a name, a text value, and a type. For each field, a field transformer for the field is dynamically selected based on the field's name and/or the field's type. The field transformer converts the field's text value into a value of the field's type. A feature encoder for the value of the field's type is dynamically selected based on the field's type and/or a range of the field's values that occur in a training corpus of an MLM. From the feature encoder, an encoding of the value of the field's type is stored into a feature vector. Based on the MLM and the feature vector, the log message is detected as anomalous.
-
Publication No.: US11620118B2
Publication Date: 2023-04-04
Application No.: US17175250
Application Date: 2021-02-12
Applicant: Oracle International Corporation
Inventor: Arno Schneuwly , Nikola Milojkovic , Felix Schmidt , Nipun Agarwal
Abstract: Herein are machine learning (ML) feature processing and analytic techniques to detect anomalies in parse trees of logic statements, database queries, logic scripts, compilation units of general-purpose programming language, extensible markup language (XML), JavaScript object notation (JSON), and document object models (DOM). In an embodiment, a computer identifies an operational trace that contains multiple parse trees. Values of explicit features are generated from a single respective parse tree of the multiple parse trees of the operational trace. Values of implicit features are generated from more than one respective parse tree of the multiple parse trees of the operational trace. The explicit and implicit features are stored into a same feature vector. With the feature vector as input, an ML model detects whether or not the operational trace is anomalous, based on the explicit features of each parse tree of the operational trace and the implicit features of multiple parse trees of the operational trace.
-
Publication No.: US11449517B2
Publication Date: 2022-09-20
Application No.: US17131299
Application Date: 2020-12-22
Applicant: Oracle International Corporation
Inventor: Arno Schneuwly , Nikola Milojkovic , Felix Schmidt , Nipun Agarwal
IPC: G06F16/2457 , G06N20/00 , G06F16/93
Abstract: Approaches herein relate to machine learning for detection of anomalous logic syntax. Herein is acceleration for comparison of parse trees such as suspicious database queries. In an embodiment, a computer identifies subtrees in each of many trees. A respective subset of participating subtrees is selected in each tree. A respective root node of each participating subtree should directly have a child node that is a leaf and/or should have a degree that exceeds a branching threshold such as one. For each pairing of a respective first tree with a respective second tree, based on a count of subtree matches between the participating subset of subtrees in the first tree and the participating subset of subtrees in the second tree, a respective tree similarity score is calculated. A machine learning model inferences based on the tree similarity scores of the many trees. In an embodiment, each tree similarity score is a convolution kernel.
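The subtree selection rule and match count in this abstract can be illustrated with a short sketch. It assumes trees are nested tuples built by a hypothetical `node` helper, and it counts matches as the number of matching pairs of participating subtrees, which is one possible reading of "count of subtree matches" rather than the patent's exact definition.

```python
from collections import Counter

def node(label, *children):
    """Build a hashable tree node: (label, tuple_of_children)."""
    return (label, children)

def subtrees(tree):
    """Yield the tree itself and every subtree below it."""
    yield tree
    for child in tree[1]:
        yield from subtrees(child)

def participates(tree, branching_threshold=1):
    """True if the root has a leaf child or its degree exceeds the threshold."""
    _, children = tree
    has_leaf_child = any(not grandchildren for _, grandchildren in children)
    return has_leaf_child or len(children) > branching_threshold

def similarity(first, second):
    """Count matching pairs of participating subtrees between two trees."""
    a = Counter(t for t in subtrees(first) if participates(t))
    b = Counter(t for t in subtrees(second) if participates(t))
    return sum(count * b[t] for t, count in a.items())

# Two toy query parse trees that differ only in the table name.
q1 = node("select", node("cols", node("a"), node("b")), node("from", node("t")))
q2 = node("select", node("cols", node("a"), node("b")), node("from", node("u")))

same = similarity(q1, q1)   # root, cols(a, b) and from(t) all match
cross = similarity(q1, q2)  # only cols(a, b) matches
```

Leaves never participate on their own, so only three subtrees per tree are compared here instead of six, which is where the acceleration comes from.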
-
Publication No.: US20220261228A1
Publication Date: 2022-08-18
Application No.: US17175250
Application Date: 2021-02-12
Applicant: Oracle International Corporation
Inventor: Arno Schneuwly , Nikola Milojkovic , Felix Schmidt , Nipun Agarwal
Abstract: Herein are machine learning (ML) feature processing and analytic techniques to detect anomalies in parse trees of logic statements, database queries, logic scripts, compilation units of general-purpose programming language, extensible markup language (XML), JavaScript object notation (JSON), and document object models (DOM). In an embodiment, a computer identifies an operational trace that contains multiple parse trees. Values of explicit features are generated from a single respective parse tree of the multiple parse trees of the operational trace. Values of implicit features are generated from more than one respective parse tree of the multiple parse trees of the operational trace. The explicit and implicit features are stored into a same feature vector. With the feature vector as input, an ML model detects whether or not the operational trace is anomalous, based on the explicit features of each parse tree of the operational trace and the implicit features of multiple parse trees of the operational trace.
-
Publication No.: US20220188694A1
Publication Date: 2022-06-16
Application No.: US17122401
Application Date: 2020-12-15
Applicant: Oracle International Corporation
Inventor: Amin Suzani , Matteo Casserini , Milos Vasic , Saeid Allahdadian , Andrew Brownsword , Hamed Ahmadi , Felix Schmidt , Nipun Agarwal
Abstract: Approaches herein relate to model decay of an anomaly detector due to concept drift. Herein are machine learning techniques for dynamically self-tuning an anomaly score threshold. In an embodiment in a production environment, a computer receives an item in a stream of items. A machine learning (ML) model hosted by the computer infers by calculation an anomaly score for the item. Whether the item is anomalous or not is decided based on the anomaly score and an adaptive anomaly threshold that dynamically fluctuates. A moving standard deviation of anomaly scores is adjusted based on a moving average of anomaly scores. The moving average of anomaly scores is then adjusted based on the anomaly score. The adaptive anomaly threshold is then adjusted based on the moving average of anomaly scores and the moving standard deviation of anomaly scores.
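The update order described in this abstract (deviation first, then average, then threshold) can be sketched as below. The class name, the exponential smoothing factor `alpha`, the multiplier `k`, and the `mean + k * std` threshold form are assumptions for illustration; the abstract does not specify them.

```python
import math

class AdaptiveThreshold:
    """Self-tuning anomaly threshold based on moving statistics of scores."""

    def __init__(self, alpha=0.1, k=3.0, mean=0.0, var=1.0):
        self.alpha = alpha  # assumed exponential smoothing factor
        self.k = k          # assumed number of deviations above the mean
        self.mean = mean
        self.var = var
        self.threshold = mean + k * math.sqrt(var)

    def observe(self, score):
        """Classify one score, then update the moving statistics."""
        is_anomalous = score > self.threshold
        # 1. Adjust the moving deviation using the current moving average.
        deviation = score - self.mean
        self.var = (1 - self.alpha) * self.var + self.alpha * deviation ** 2
        # 2. Then adjust the moving average using the new score.
        self.mean = (1 - self.alpha) * self.mean + self.alpha * score
        # 3. Then adjust the threshold from both statistics.
        self.threshold = self.mean + self.k * math.sqrt(self.var)
        return is_anomalous

detector = AdaptiveThreshold()
first = detector.observe(1.0)    # well under the initial threshold of 3.0
second = detector.observe(10.0)  # far above the threshold at that point
```

Because the statistics keep moving with the stream, a gradual shift in typical scores raises or lowers the threshold instead of producing a flood of false alarms.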
-
Publication No.: US11036561B2
Publication Date: 2021-06-15
Application No.: US16044230
Application Date: 2018-07-24
Applicant: Oracle International Corporation
Inventor: Stuart Wray , Felix Schmidt , Craig Robert Schelp , Manel Fernandez Gomez , Nipun Agarwal
IPC: G06F9/50 , H04L12/803 , H04L12/26
Abstract: Embodiments monitor statistics from groups of devices and generate an alarm upon detecting a utilization imbalance that is beyond a threshold. Particular balance statistics are periodically sampled, over a timeframe, for a group of devices configured to have balanced utilization. The devices are ranked at every data collection timestamp based on the gathered device statistics. The numbers of times each device appears within each rank over the timeframe are tallied. The device/rank summations are collectively used as a probability distribution representing the probability of each device being ranked at each of the rankings in the future. Based on this probability distribution, an entropy value that represents a summary of the imbalance of the group of devices over the timeframe is derived. An imbalance alert is generated when one or more entropy values for a group of devices shows an imbalanced utilization of the devices going beyond an identified imbalance threshold.
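The ranking, tallying, and entropy steps in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, entropy is computed per device over its rank distribution and averaged, and it is normalized by `log(n_devices)` so that 1.0 means balanced and 0.0 means a fixed ranking. The patent may define the entropy value differently.

```python
import math
from collections import defaultdict

def rank_tallies(samples):
    """Count how often each device lands at each rank.

    `samples` is a list of {device: utilization} dicts, one per timestamp.
    """
    tallies = defaultdict(lambda: defaultdict(int))
    for sample in samples:
        ranked = sorted(sample, key=sample.get, reverse=True)
        for rank, device in enumerate(ranked):
            tallies[device][rank] += 1
    return tallies

def balance_entropy(samples):
    """Mean normalized entropy of each device's rank distribution.

    Requires at least two devices, since log(1) would be a zero divisor.
    """
    n_devices = len(samples[0])
    total = len(samples)
    entropies = []
    for ranks in rank_tallies(samples).values():
        h = -sum((c / total) * math.log(c / total) for c in ranks.values())
        entropies.append(h / math.log(n_devices))
    return sum(entropies) / len(entropies)

# Devices that swap places are balanced; a fixed ordering is not.
balanced = [{"a": 0.9, "b": 0.1}, {"a": 0.1, "b": 0.9}]
skewed = [{"a": 0.9, "b": 0.1}, {"a": 0.8, "b": 0.2}]

balanced_score = balance_entropy(balanced)
skewed_score = balance_entropy(skewed)
```

In this sketch an alert would fire when the score drops below a chosen imbalance threshold, for example `balance_entropy(samples) < 0.5`; that cutoff is illustrative only.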