-
1.
Publication No.: US20240345811A1
Publication Date: 2024-10-17
Application No.: US18202756
Application Date: 2023-05-26
Applicant: Oracle International Corporation
Inventor: Arno Schneuwly , Saeid Allahdadian , Pritam Dash , Matteo Casserini , Felix Schmidt , Eric Sedlar
IPC: G06F8/36 , G06F16/955 , G06F40/40
CPC classification number: G06F8/36 , G06F16/955 , G06F40/40
Abstract: Herein for each source logic in a corpus, a computer stores an identifier of the source logic and operates a logic encoder that infers a distinct fixed-size encoded logic that represents the variable-size source logic. At build time, a multidimensional index is generated and populated based on the encoded logics that represent the source logics in the corpus. At runtime, a user may edit and select a new source logic such as in a text editor or an integrated development environment (IDE). The logic encoder infers a new encoded logic that represents the new source logic. The multidimensional index accepts the new encoded logic as a lookup key and automatically selects and returns a result subset of encoded logics that represent similar source logics in the corpus. For display, the multidimensional index may select and return only encoded logics that are the few nearest neighbors to the new encoded logic.
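A minimal sketch of the flow in this abstract, assuming a hashing bag-of-tokens encoder as a stand-in for the trained logic encoder and scikit-learn's NearestNeighbors as the multidimensional index; encode_logic, DIM, and the toy corpus are illustrative names, not the patented components.

```python
import hashlib

import numpy as np
from sklearn.neighbors import NearestNeighbors

DIM = 64  # assumed fixed size of an encoded logic

def encode_logic(source: str) -> np.ndarray:
    """Stand-in encoder: maps variable-size source text to a fixed-size vector."""
    vec = np.zeros(DIM)
    for token in source.split():
        digest = int(hashlib.md5(token.encode()).hexdigest(), 16)
        vec[digest % DIM] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# Build time: store an identifier per source logic and index its encoding.
corpus = {
    "sum.py": "def total(xs): return sum(xs)",
    "mean.py": "def mean(xs): return sum(xs) / len(xs)",
    "greet.py": "def greet(name): return 'hello ' + name",
}
identifiers = list(corpus)
index = NearestNeighbors(n_neighbors=2).fit(
    np.stack([encode_logic(corpus[i]) for i in identifiers]))

# Runtime: the logic being edited becomes the lookup key; the index returns
# the few nearest neighbors, i.e. the most similar source logics.
query = encode_logic("def average(v): return sum(v) / len(v)")
_, neighbors = index.kneighbors(query.reshape(1, -1))
print([identifiers[j] for j in neighbors[0]])
```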
-
2.
Publication No.: US20240345815A1
Publication Date: 2024-10-17
Application No.: US18202564
Application Date: 2023-05-26
Applicant: Oracle International Corporation
Inventor: Pritam Dash , Arno Schneuwly , Saeid Allahdadian , Matteo Casserini , Felix Schmidt
IPC: G06F8/41
CPC classification number: G06F8/427
Abstract: In an embodiment, a computer stores and operates a logic encoder that is an artificial neural network that infers a fixed-size encoded logic from textual or tokenized source logic. Without machine learning, a special parser generates a parse tree that represents the source logic and a fixed-size correctly encoded tree that represents the parse tree. For finetuning the logic encoder, an encoded tree generator is an artificial neural network that accepts the fixed-size encoded logic as input and responsively infers a fixed-size incorrectly encoded tree that represents the parse tree. The neural weights of the logic encoder (and optionally of the encoded tree generator) are adjusted based on backpropagation of error (i.e. loss) as a numerically measured difference between the fixed-size incorrectly encoded tree and the fixed-size correctly encoded tree.
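A hedged sketch of that finetuning loop, assuming small feed-forward stand-ins for the logic encoder and the encoded tree generator, and random tensors in place of tokenized source logics and the parser's correctly encoded trees; sizes and names are illustrative, not the claimed architecture.

```python
import torch
import torch.nn as nn

TOKENS, ENC, TREE = 128, 32, 32  # assumed fixed sizes

logic_encoder = nn.Sequential(nn.Linear(TOKENS, ENC), nn.ReLU(), nn.Linear(ENC, ENC))
tree_generator = nn.Sequential(nn.Linear(ENC, TREE), nn.ReLU(), nn.Linear(TREE, TREE))
# The logic encoder (and, optionally, the tree generator) is adjusted.
optimizer = torch.optim.Adam(
    list(logic_encoder.parameters()) + list(tree_generator.parameters()), lr=1e-3)
loss_fn = nn.MSELoss()

# Toy batch: tokenized source logics and the non-ML parser's correct tree encodings.
tokenized_logic = torch.rand(16, TOKENS)
correctly_encoded_tree = torch.rand(16, TREE)

for _ in range(5):
    encoded_logic = logic_encoder(tokenized_logic)            # fixed-size encoded logic
    incorrectly_encoded_tree = tree_generator(encoded_logic)  # inferred tree encoding
    # Loss: numerically measured difference between the two tree encodings.
    loss = loss_fn(incorrectly_encoded_tree, correctly_encoded_tree)
    optimizer.zero_grad()
    loss.backward()                                           # backpropagation of error
    optimizer.step()
```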
-
3.
Publication No.: US20250165852A1
Publication Date: 2025-05-22
Application No.: US18514391
Application Date: 2023-11-20
Applicant: Oracle International Corporation
Inventor: Tomas Feith , Arno Schneuwly , Saeid Allahdadian , Matteo Casserini , Felix Schmidt
IPC: G06N20/00
Abstract: During pretraining, a computer generates three untrained machine learning models that are a token sequence encoder, a token predictor, and a decoder that infers a frequency distribution of graph traversal paths. A sequence of lexical tokens is generated that represents a lexical text in a training corpus. A graph is generated that represents the lexical text. In the graph, multiple traversal paths are selected that collectively represent a sliding subsequence of the sequence of lexical tokens. From the subsequence, the token sequence encoder infers an encoded sequence that represents the subsequence of the sequence of lexical tokens. The decoder and token predictor accept the encoded sequence as input for respective inferencing for which respective training losses are measured. Both training losses are combined into a combined loss that is used to increase the accuracy of the three machine learning models by, for example, backpropagation of the combined loss.
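A rough illustration of one combined-loss pretraining step, assuming toy module sizes and randomly generated targets for the token predictor and the path-frequency decoder; the graph construction and traversal-path selection from the abstract are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, SEQ, EMB, ENC, PATHS = 100, 8, 16, 32, 20  # illustrative sizes

embed = nn.Embedding(VOCAB, EMB)
encoder = nn.Sequential(nn.Flatten(), nn.Linear(SEQ * EMB, ENC))  # token sequence encoder
token_predictor = nn.Linear(ENC, VOCAB)    # predicts a lexical token
path_decoder = nn.Linear(ENC, PATHS)       # infers a path-frequency distribution

params = (list(embed.parameters()) + list(encoder.parameters())
          + list(token_predictor.parameters()) + list(path_decoder.parameters()))
optimizer = torch.optim.Adam(params, lr=1e-3)

# Toy example: a sliding subsequence of lexical tokens, a target token, and the
# frequency distribution of the traversal paths selected in the graph.
subsequence = torch.randint(0, VOCAB, (1, SEQ))
target_token = torch.randint(0, VOCAB, (1,))
path_frequencies = torch.softmax(torch.rand(1, PATHS), dim=-1)

encoded_sequence = encoder(embed(subsequence))
token_loss = F.cross_entropy(token_predictor(encoded_sequence), target_token)
path_loss = F.kl_div(F.log_softmax(path_decoder(encoded_sequence), dim=-1),
                     path_frequencies, reduction="batchmean")
combined_loss = token_loss + path_loss     # both training losses combined
optimizer.zero_grad()
combined_loss.backward()                   # backpropagation of the combined loss
optimizer.step()
```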
-
4.
Publication No.: US20250110961A1
Publication Date: 2025-04-03
Application No.: US18374209
Application Date: 2023-09-28
Applicant: Oracle International Corporation
Inventor: Tomas Feith , Arno Schneuwly , Saeid Allahdadian , Matteo Casserini , Kristopher Leland Rice , Felix Schmidt
IPC: G06F16/2457 , G06F16/248
Abstract: Herein is dynamic and contextual ranking of reference documentation based on an interactively selected position in new source logic. A computer receives a vocabulary of lexical tokens, a sequence of references that contains a first reference to a first reference document before a second reference to a second reference document, respective subsets of the vocabulary that occur in the first and second reference documents, a new source logic that contains a sequence of lexical tokens, respective measurements of semantic distance between the new source logic and the first and second reference documents, and a selected position in the sequence of lexical tokens. Based on the selected position, the measurements of semantic distance are selectively increased. Based on that increase in the measurements of semantic distance, the relative ordering of the first and second references is reversed to generate and display a reordered sequence of references.
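A minimal sketch of the position-dependent re-ranking, assuming made-up distances, document token subsets, and a simple penalty for reference documents that share no tokens with the cursor context; rerank, window, and penalty are illustrative, not the claimed method.

```python
def rerank(references, distances, doc_tokens, source_tokens, position,
           window=3, penalty=1.0):
    """Selectively increase distances based on the selected position, then reorder."""
    context = set(source_tokens[max(0, position - window): position + window + 1])
    adjusted = {}
    for ref in references:
        distance = distances[ref]
        if not context & doc_tokens[ref]:   # no overlap with tokens near the cursor
            distance += penalty             # selectively increase semantic distance
        adjusted[ref] = distance
    return sorted(references, key=lambda ref: adjusted[ref])

references = ["doc_sorting", "doc_http"]                   # first before second
distances = {"doc_sorting": 0.2, "doc_http": 0.4}          # doc_sorting ranks first
doc_tokens = {"doc_sorting": {"sort", "list"}, "doc_http": {"request", "url"}}
source_tokens = ["resp", "=", "request", "url", "timeout"]

print(rerank(references, distances, doc_tokens, source_tokens, position=2))
# ['doc_http', 'doc_sorting'] -- the relative ordering of the references is reversed
```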
-
5.
Publication No.: US12020131B2
Publication Date: 2024-06-25
Application No.: US17221212
Application Date: 2021-04-02
Applicant: Oracle International Corporation
Inventor: Saeid Allahdadian , Amin Suzani , Milos Vasic , Matteo Casserini , Andrew Brownsword , Felix Schmidt , Nipun Agarwal
IPC: G06N20/20 , G06N3/04 , G06N3/0442 , G06N3/045 , G06N3/0495 , G06N3/08 , G06N3/088 , G06N20/00
CPC classification number: G06N20/20 , G06N3/04 , G06N3/0495 , G06N3/08 , G06N3/088 , G06N3/0442 , G06N3/045 , G06N20/00
Abstract: Techniques are provided for sparse ensembling of unsupervised machine learning models. In an embodiment, the proposed architecture is composed of multiple unsupervised machine learning models that each produce a score as output and a gating network that analyzes the inputs and outputs of the unsupervised machine learning models to select an optimal ensemble of unsupervised machine learning models. The gating network is trained to choose a minimal number of the multiple unsupervised machine learning models whose scores are combined to create a final score that matches or closely resembles a final score that is computed using all the scores of the multiple unsupervised machine learning models.
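An illustrative sketch only, assuming linear stand-ins for the unsupervised detectors and a top-k softmax gate; it shows the idea of training a gating network so that a sparse subset of model scores matches the full-ensemble score, not the patented gating design.

```python
import torch
import torch.nn as nn

N_MODELS, N_FEATURES, K = 5, 10, 2   # K = minimal number of models to keep

detectors = [nn.Linear(N_FEATURES, 1) for _ in range(N_MODELS)]  # stand-in scorers
gate = nn.Linear(N_FEATURES + N_MODELS, N_MODELS)                # gating network
optimizer = torch.optim.Adam(gate.parameters(), lr=1e-3)

x = torch.rand(32, N_FEATURES)
with torch.no_grad():
    scores = torch.cat([d(x) for d in detectors], dim=1)     # all model scores
full_score = scores.mean(dim=1, keepdim=True)                 # target: full ensemble

for _ in range(10):
    # The gate analyzes the inputs and outputs of the unsupervised models.
    weights = torch.softmax(gate(torch.cat([x, scores], dim=1)), dim=1)
    top_vals, top_idx = weights.topk(K, dim=1)                # keep a sparse subset
    sparse = torch.zeros_like(weights).scatter(1, top_idx, top_vals)
    sparse = sparse / sparse.sum(dim=1, keepdim=True)
    sparse_score = (sparse * scores).sum(dim=1, keepdim=True)
    loss = nn.functional.mse_loss(sparse_score, full_score)   # match the full score
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```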
-
6.
Publication No.: US11704386B2
Publication Date: 2023-07-18
Application No.: US17199563
Application Date: 2021-03-12
Applicant: Oracle International Corporation
Inventor: Amin Suzani , Saeid Allahdadian , Milos Vasic , Matteo Casserini , Hamed Ahmadi , Felix Schmidt , Andrew Brownsword , Nipun Agarwal
IPC: G06F18/214 , G06N20/00 , G06V10/75 , G06F18/23
CPC classification number: G06F18/214 , G06F18/23 , G06N20/00 , G06V10/758
Abstract: Herein are feature extraction mechanisms that receive parsed log messages as inputs and transform them into numerical feature vectors for machine learning models (MLMs). In an embodiment, a computer extracts fields from a log message. Each field specifies a name, a text value, and a type. For each field, a field transformer for the field is dynamically selected based on the field's name and/or the field's type. The field transformer converts the field's text value into a value of the field's type. A feature encoder for the value of the field's type is dynamically selected based on the field's type and/or a range of the field's values that occur in a training corpus of an MLM. From the feature encoder, an encoding of the value of the field's type is stored into a feature vector. Based on the MLM and the feature vector, the log message is detected as anomalous.
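A loose sketch of the dynamic selection described above, assuming toy transformer and encoder tables keyed by field name and type, and a trivial status-code check in place of the trained MLM; field names and values are invented.

```python
from datetime import datetime

TRANSFORMERS = {                      # selected by field name and/or field type
    "timestamp": lambda v: datetime.fromisoformat(v).timestamp(),
    "int": int,
    "str": str,
}
ENCODERS = {                          # selected by field type
    "timestamp": lambda v: [v / 1e9],                       # rescaled seconds
    "int": lambda v: [float(v)],
    "str": lambda v: [sum(map(ord, v)) % 1000 / 1000],      # toy categorical encoding
}

def to_feature_vector(fields):
    vector = []
    for name, text_value, field_type in fields:
        transform = TRANSFORMERS.get(name, TRANSFORMERS[field_type])
        typed_value = transform(text_value)                 # text value -> typed value
        vector.extend(ENCODERS[field_type](typed_value))    # encoding appended
    return vector

parsed_log = [("timestamp", "2024-01-01T00:00:00", "timestamp"),
              ("status", "500", "int"),
              ("service", "payments", "str")]
features = to_feature_vector(parsed_log)
is_anomalous = features[1] >= 500     # stand-in for the trained anomaly-detection MLM
print(features, is_anomalous)
```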
-
7.
Publication No.: US20220188694A1
Publication Date: 2022-06-16
Application No.: US17122401
Application Date: 2020-12-15
Applicant: Oracle International Corporation
Inventor: Amin Suzani , Matteo Casserini , Milos Vasic , Saeid Allahdadian , Andrew Brownsword , Hamed Ahmadi , Felix Schmidt , Nipun Agarwal
Abstract: Approaches herein relate to model decay of an anomaly detector due to concept drift. Herein are machine learning techniques for dynamically self-tuning an anomaly score threshold. In an embodiment in a production environment, a computer receives an item in a stream of items. A machine learning (ML) model hosted by the computer infers by calculation an anomaly score for the item. Whether the item is anomalous or not is decided based on the anomaly score and an adaptive anomaly threshold that dynamically fluctuates. A moving standard deviation of anomaly scores is adjusted based on a moving average of anomaly scores. The moving average of anomaly scores is then adjusted based on the anomaly score. The adaptive anomaly threshold is then adjusted based on the moving average of anomaly scores and the moving standard deviation of anomaly scores.
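A minimal sketch of the self-tuning threshold, assuming exponential moving statistics and illustrative constants alpha, k, and warmup; as in the abstract, the decision uses the current threshold, after which the moving standard deviation, the moving average, and the threshold are updated.

```python
class AdaptiveAnomalyThreshold:
    def __init__(self, alpha=0.1, k=3.0, warmup=5):
        self.alpha, self.k, self.warmup = alpha, k, warmup
        self.mean, self.std, self.count = 0.0, 0.0, 0
        self.threshold = float("inf")

    def is_anomalous(self, score: float) -> bool:
        anomalous = score > self.threshold     # decide with the current threshold
        # Moving std adjusted from the moving mean, then the mean from the score.
        self.std = (1 - self.alpha) * self.std + self.alpha * abs(score - self.mean)
        self.mean = (1 - self.alpha) * self.mean + self.alpha * score
        self.count += 1
        if self.count >= self.warmup:          # threshold follows the moving statistics
            self.threshold = self.mean + self.k * self.std
        return anomalous

detector = AdaptiveAnomalyThreshold()
stream_scores = [0.10, 0.12, 0.11, 0.13, 0.12, 0.11, 0.95, 0.12]  # from the ML model
print([detector.is_anomalous(s) for s in stream_scores])
# Only the 0.95 score exceeds the dynamically fluctuating threshold.
```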
-
8.
Publication No.: US20250060951A1
Publication Date: 2025-02-20
Application No.: US18235461
Application Date: 2023-08-18
Applicant: Oracle International Corporation
Inventor: Tomas Feith , Arno Schneuwly , Saeid Allahdadian , Matteo Casserini , Felix Schmidt
IPC: G06F8/41 , G06F16/901
Abstract: In an embodiment providing natural language processing (NLP), a computer generates a histogram that correctly represents a graph that represents a lexical text, and generates a token sequence encoder that is trainable and untrained. During training such as pretraining, the token sequence encoder infers an encoded sequence that incorrectly represents the lexical text, and the encoded sequence is dense and saves space. To increase the accuracy of the sequence encoder by learning, the token sequence encoder is adjusted based on, as discussed herein, an indirectly measured numeric difference between the encoded sequence that incorrectly represents the lexical text and the histogram that correctly represents the graph.
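A hedged sketch assuming a toy bigram-count histogram as the correctly generated graph representation and a linear projection that makes the comparison between the dense encoded sequence and the histogram indirect; the sizes and the histogram construction are arbitrary stand-ins.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, SEQ, EMB, ENC, BINS = 50, 6, 8, 16, 10   # illustrative sizes

embed = nn.Embedding(VOCAB, EMB)
encoder = nn.Sequential(nn.Flatten(), nn.Linear(SEQ * EMB, ENC))  # dense, space-saving
to_histogram = nn.Linear(ENC, BINS)             # used only for the indirect comparison
optimizer = torch.optim.Adam(list(embed.parameters()) + list(encoder.parameters())
                             + list(to_histogram.parameters()), lr=1e-3)

tokens = torch.randint(0, VOCAB, (1, SEQ))      # the lexical text
# Toy "graph" histogram: bucketed counts of consecutive-token edges.
edges = (tokens[0, :-1] * VOCAB + tokens[0, 1:]) % BINS
correct_histogram = torch.bincount(edges, minlength=BINS).float().unsqueeze(0)
correct_histogram = correct_histogram / correct_histogram.sum()

for _ in range(5):
    encoded_sequence = encoder(embed(tokens))   # incorrect representation at first
    predicted = torch.softmax(to_histogram(encoded_sequence), dim=-1)
    loss = F.mse_loss(predicted, correct_histogram)   # indirectly measured difference
    optimizer.zero_grad()
    loss.backward()                             # adjust the token sequence encoder
    optimizer.step()
```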
-
9.
Publication No.: US20250036934A1
Publication Date: 2025-01-30
Application No.: US18227758
Application Date: 2023-07-28
Applicant: Oracle International Corporation
Inventor: Tomas Feith , Arno Schneuwly , Saeid Allahdadian , Matteo Casserini , Felix Schmidt
IPC: G06N3/08
Abstract: Herein is validation of a trained classifier based on novel and accelerated estimation of a confusion matrix. In an embodiment, a computer hosts a trained classifier that infers, from many objects, an inferred frequency of each class. An upscaled magnitude of each class is generated from the inferred frequency of the class. An integer of each class is generated from the upscaled magnitude of the class. Based on those integers of the classes and a target integer for each class, counts are generated of the objects that are true positives, false positives, and false negatives of the class. Based on those counts, an estimated total of true positives, false positives, and false negatives is generated that characterizes the fitness of the trained classifier. In an embodiment, those counts and totals are downscaled to be fractions from zero to one.
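A loose sketch with an assumed upscaling factor: inferred class frequencies are upscaled to integers and compared with per-class target integers to obtain TP/FP/FN counts, whose totals are downscaled back to fractions between zero and one; estimate_confusion and the example frequencies are invented.

```python
def estimate_confusion(inferred_freq, target_freq, scale=1000):
    totals = {"tp": 0, "fp": 0, "fn": 0}
    per_class = {}
    for cls in target_freq:
        inferred = round(inferred_freq.get(cls, 0.0) * scale)   # upscaled integer
        target = round(target_freq[cls] * scale)                # target integer
        tp = min(inferred, target)                              # per-class counts
        fp = max(inferred - target, 0)
        fn = max(target - inferred, 0)
        per_class[cls] = {"tp": tp / scale, "fp": fp / scale, "fn": fn / scale}
        for key, value in (("tp", tp), ("fp", fp), ("fn", fn)):
            totals[key] += value
    return per_class, {key: value / scale for key, value in totals.items()}

inferred = {"error": 0.32, "warning": 0.50, "info": 0.18}   # classifier's inferred frequencies
target = {"error": 0.30, "warning": 0.45, "info": 0.25}     # expected frequencies
per_class, totals = estimate_confusion(inferred, target)
print(per_class)
print(totals)   # estimated totals characterize fitness of the trained classifier
```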
-
10.
Publication No.: US12143408B2
Publication Date: 2024-11-12
Application No.: US17739968
Application Date: 2022-05-09
Applicant: Oracle International Corporation
Inventor: Milos Vasic , Saeid Allahdadian , Matteo Casserini , Felix Schmidt , Andrew Brownsword
Abstract: Techniques for implementing a semi-supervised framework for purpose-oriented anomaly detection are provided. In one technique, a data item is inputted into an unsupervised anomaly detection model, which generates first output. Based on the first output, it is determined whether the data item represents an anomaly. In response to determining that the data item represents an anomaly, the data item is inputted into a supervised classification model, which generates second output that indicates whether the data item is unknown. In response to determining that the data item is unknown, a training instance is generated based on the data item. The supervised classification model is updated based on the training instance.
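A conceptual sketch with stand-in models and invented fields (response_ms, labels): an unsupervised score flags an item, a supervised classifier then labels it or reports it as unknown, and unknown items become new training instances for the classifier.

```python
UNKNOWN = "unknown"

def unsupervised_score(item):
    return item["response_ms"] / 1000.0                  # stand-in anomaly score

def supervised_classify(item, training_set):
    for labelled in training_set:                        # toy nearest-label lookup
        if abs(labelled["response_ms"] - item["response_ms"]) < 100:
            return labelled["label"]
    return UNKNOWN

training_set = [{"response_ms": 2500, "label": "slow_disk"}]
stream = [{"response_ms": 120}, {"response_ms": 2450}, {"response_ms": 9000}]

for item in stream:
    if unsupervised_score(item) <= 0.5:                  # first output: not an anomaly
        continue
    label = supervised_classify(item, training_set)      # second output
    if label == UNKNOWN:
        # Generate a training instance and update the supervised model.
        training_set.append({**item, "label": "needs_review"})
        print("new training instance:", item)
    else:
        print("known anomaly:", label, item)
```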