-
Publication number: US10902203B2
Publication date: 2021-01-26
Application number: US16392386
Filing date: 2019-04-23
Applicant: Oracle International Corporation
Inventor: Rhicheek Patra, Davide Bartolini, Sungpack Hong, Hassan Chafi, Alberto Parravicini
IPC: G06F40/295, G06N5/02
Abstract: Techniques are described herein for performing named entity disambiguation. According to an embodiment, a method includes receiving input text, extracting a first mention and a second mention from the input text, and selecting, from a knowledge graph, a plurality of first candidate vertices for the first mention and a plurality of second candidate vertices for the second mention. The present method also includes evaluating a score function that analyzes vertex embedding similarity between the plurality of first candidate vertices and the plurality of second candidate vertices. In response to evaluating and seeking to optimize the score function, the method performs selecting a first selected candidate vertex from the plurality of first candidate vertices and a second selected candidate vertex from the plurality of second candidate vertices. Further, the present method includes mapping a first entry from the knowledge graph to the first mention and mapping a second entry from the knowledge graph to the second mention. In this embodiment, the first entry corresponds to the first selected candidate vertex and the second entry corresponds to the second selected candidate vertex.
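Illustrative sketch (not part of the patent record): the joint selection described in this abstract can be pictured as choosing one candidate vertex per mention so that the chosen vertices are as similar to each other as possible. The code below assumes cosine similarity as the score, exhaustive search as the optimizer, and toy three-dimensional embeddings; the names `cosine`, `select_candidates`, and `EMBEDDINGS`, and all vertex labels, are hypothetical and do not come from the patent.

```python
from itertools import product
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_candidates(candidates_per_mention, embeddings):
    """Pick one candidate per mention, maximizing summed pairwise similarity.

    Assumption: exhaustive search over all combinations. This is only
    practical for a few mentions; it is exponential in the number of mentions.
    """
    best_choice, best_score = None, float("-inf")
    for choice in product(*candidates_per_mention):
        score = sum(
            cosine(embeddings[a], embeddings[b])
            for i, a in enumerate(choice)
            for b in choice[i + 1:]
        )
        if score > best_score:
            best_choice, best_score = choice, score
    return best_choice, best_score


# Hypothetical toy embeddings for four knowledge-graph vertices.
EMBEDDINGS = {
    "Paris_(city)": [0.9, 0.1, 0.0],
    "Paris_Hilton": [0.0, 0.2, 0.9],
    "France_(country)": [0.8, 0.3, 0.1],
    "France_(band)": [0.1, 0.9, 0.2],
}

# Candidate vertices for the mentions "Paris" and "France".
choice, score = select_candidates(
    [["Paris_(city)", "Paris_Hilton"], ["France_(country)", "France_(band)"]],
    EMBEDDINGS,
)
```

With these toy vectors the city and the country are selected together, because their embeddings point in nearly the same direction. A real system would likely need a cheaper optimizer than brute force once the number of mentions grows.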
-
Publication number: US20250119453A1
Publication date: 2025-04-10
Application number: US18954031
Filing date: 2024-11-20
Applicant: Oracle International Corporation
Inventor: Valentin Venzin, Rhicheek Patra, Sungpack Hong, Hassan Chafi
Abstract: Herein are graph machine learning explainability (MLX) techniques for invalid traffic detection. In an embodiment, a computer generates a graph that contains: a) domain vertices that represent network domains that received requests and b) address vertices that respectively represent network addresses from which the requests originated. Based on the graph, domain embeddings are generated that respectively encode the domain vertices. Based on the domain embeddings, multidomain embeddings are generated that respectively encode the network addresses. The multidomain embeddings are organized into multiple clusters of multidomain embeddings. A particular cluster is detected as suspicious. In an embodiment, an unsupervised trained graph model generates the multidomain embeddings. Based on the clusters of multidomain embeddings, feature importances are unsupervised trained. Based on the feature importances, an explanation is automatically generated for why an object is or is not suspicious. The explained object may be a cluster or other batch of network addresses or a single network address.
-
Publication number: US12079282B2
Publication date: 2024-09-03
Application number: US16989306
Filing date: 2020-08-10
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Aras Mumcuyan, Iraklis Psaroudakis, Miroslav Cepek, Rhicheek Patra
IPC: G06F40/00, G06F16/903, G06F18/2113, G06F18/214, G06F18/22, G06F40/30, G06N3/045, G06N3/08, G06N5/04
CPC classification number: G06F16/90344, G06F18/2113, G06F18/214, G06F18/22, G06F40/30, G06N3/045, G06N3/08, G06N5/04
Abstract: Techniques are described herein for a Name Matching Engine that integrates two Machine Learning (ML) module options. The first ML module is a feature-engineered classifier that boosts text-based name matching techniques with a binary classifier ML model. The feature-engineered classifier comprises a first stage of text-based candidate finding, and a second stage in which a binary classifier model predicts whether each string, of the candidate match list, is a match or not. The binary classifier model is based on features from two or more of: a name feature level, a word feature level, a character feature level, and an initial feature level. The second ML module of the Name Matching Engine comprises an end-to-end Recurrent Neural Network (RNN) model that directly accepts name strings as a sequence of n-grams and generates learned text embeddings. The text embeddings of matching name strings are close to each other in the feature space.
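Illustrative sketch (not part of the patent record): the abstract mentions representing name strings as sequences of n-grams and a first stage of text-based candidate finding. The code below shows one common way such a stage could look, assuming character trigrams with `#` padding and Jaccard overlap with a fixed threshold. These choices, and the names `char_ngrams`, `jaccard`, and `find_candidates`, are assumptions for illustration; the patent's classifier and RNN stages are not reproduced here.

```python
def char_ngrams(name, n=3):
    """Split a lower-cased name into overlapping character n-grams.

    The string is padded with '#' so that prefixes and suffixes get
    their own n-grams.
    """
    padded = "#" * (n - 1) + name.lower() + "#" * (n - 1)
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def jaccard(a, b):
    """Jaccard overlap of two n-gram lists, treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def find_candidates(query, names, threshold=0.3):
    """Return (name, score) pairs at or above the threshold, best first."""
    query_grams = char_ngrams(query)
    scored = [(name, jaccard(query_grams, char_ngrams(name))) for name in names]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


candidates = find_candidates("Jon Smith", ["John Smith", "Jane Doe"])
```

In a two-stage design like the one the abstract describes, a list such as `candidates` would then be passed to a second-stage model that decides match or no match for each entry.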
-
Publication number: US12050522B2
Publication date: 2024-07-30
Application number: US17577711
Filing date: 2022-01-18
Applicant: Oracle International Corporation
Inventor: Miroslav Cepek, Iraklis Psaroudakis, Rhicheek Patra, Timothy Trovatelli
CPC classification number: G06F11/1476, G06N3/04, G06V30/18181
Abstract: Herein is machine learning for anomalous graph detection based on graph embedding, shuffling, comparison, and unsupervised training techniques that can characterize an unfamiliar graph. In an embodiment, a computer obtains many known vectors that respectively represent known graphs. A new vector is generated that represents a new graph that contains multiple vertices. The new vector may contain an arithmetic aggregation of vertex vectors that respectively represent multiple vertices and/or a vector that represents a virtual vertex that is connected to the multiple vertices by respective virtual edges. In the many known vectors, some similar vectors that are similar to the new vector are identified. The new graph is automatically characterized based on a subset of the known graphs that the similar vectors represent.
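Illustrative sketch (not part of the patent record): the abstract describes building a graph vector from an arithmetic aggregation of vertex vectors, finding similar known vectors, and characterizing the new graph from the graphs they represent. The code below assumes an element-wise mean as the aggregation, Euclidean distance as the similarity measure, and a majority vote over labeled known graphs as the characterization. The labels, the data, and the names `graph_vector`, `characterize`, and `KNOWN` are hypothetical.

```python
from collections import Counter
from math import dist  # Python 3.8+


def graph_vector(vertex_vectors):
    """Aggregate vertex vectors into one graph vector by element-wise mean."""
    count = len(vertex_vectors)
    return [sum(component) / count for component in zip(*vertex_vectors)]


def characterize(new_vector, known, k=3):
    """Label a new graph by majority vote over its k nearest known graphs.

    `known` is a list of (vector, label) pairs. Euclidean distance and
    majority voting are assumptions of this sketch.
    """
    nearest = sorted(known, key=lambda item: dist(new_vector, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical known graph vectors with labels.
KNOWN = [
    ([0.1, 0.1], "normal"),
    ([0.2, 0.0], "normal"),
    ([0.0, 0.2], "normal"),
    ([0.9, 1.0], "anomalous"),
    ([1.0, 0.9], "anomalous"),
]

# A new graph with three vertices, each already embedded as a 2-D vector.
new_graph = graph_vector([[0.0, 0.2], [0.2, 0.2], [0.1, -0.1]])
label = characterize(new_graph, KNOWN)
```

The mean is only one possible aggregation. The abstract also mentions a virtual vertex connected to every vertex, which would require a graph embedding model and is not sketched here.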
-
Publication number: US20230229570A1
Publication date: 2023-07-20
Application number: US17577711
Filing date: 2022-01-18
Applicant: Oracle International Corporation
Inventor: Miroslav Cepek, Iraklis Psaroudakis, Rhicheek Patra, Timothy Trovatelli
CPC classification number: G06F11/1476, G06V30/18181, G06N3/04
Abstract: Herein is machine learning for anomalous graph detection based on graph embedding, shuffling, comparison, and unsupervised training techniques that can characterize an unfamiliar graph. In an embodiment, a computer obtains many known vectors that respectively represent known graphs. A new vector is generated that represents a new graph that contains multiple vertices. The new vector may contain an arithmetic aggregation of vertex vectors that respectively represent multiple vertices and/or a vector that represents a virtual vertex that is connected to the multiple vertices by respective virtual edges. In the many known vectors, some similar vectors that are similar to the new vector are identified. The new graph is automatically characterized based on a subset of the known graphs that the similar vectors represent.
-
Publication number: US11526673B2
Publication date: 2022-12-13
Application number: US17153078
Filing date: 2021-01-20
Applicant: Oracle International Corporation
Inventor: Rhicheek Patra, Davide Bartolini, Sungpack Hong, Hassan Chafi, Alberto Parravicini
IPC: G06F40/295, G06N5/02
Abstract: According to an embodiment, a method includes converting a knowledge base into a graph. In this embodiment, the knowledge base contains a plurality of entities and specifies a plurality of relationships among the plurality of entities, and entities in the knowledge base correspond to vertices in the graph, and relationships between entities in the knowledge base correspond to edges between vertices in the graph. The method may also include extracting a plurality of vertex embeddings from the graph. An example vertex embedding of the plurality of vertex embeddings represents, for a particular vertex, a proximity of the particular vertex to other vertices of the graph. Further, the method may include performing, based at least in part on the plurality of vertex embeddings, entity linking between input text and the knowledge base.