-
Publication No.: US11829419B1
Publication date: 2023-11-28
Application No.: US17744653
Application date: 2022-05-14
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Iraklis Psaroudakis, Mhd Yamen Haddad, Martin Sevenich
IPC: G06F16/23, G06F16/901, G06F16/903
CPC classification number: G06F16/9024, G06F16/23, G06F16/90335
Abstract: A system for loading graph data from an external store in response to a graph query is disclosed. In some embodiments, given a graph database where all vertices are stored in memory and some but not all edges are stored in the external store, the system performs one of two methods. In the first method, the system iteratively expands a set of vertices that is initially specified in the graph query and collects all edges connected to the set of vertices, including edges stored in the external store, that satisfy a vertex constraint also specified in the query. In the second method, the system finds a set of vertices that satisfy the vertex constraint and collects all edges connected to the set of vertices, including edges stored in the external store.
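Illustrative sketch (not taken from the patent): the first method in the abstract above can be pictured as a breadth-first expansion that consults both edge sources at each hop. The function name `expand_and_collect`, the `max_hops` bound, and the use of a plain list to stand in for the external store are all assumptions made for this example.

```python
from collections import defaultdict


def expand_and_collect(start, in_memory_edges, external_edges, vertex_ok, max_hops):
    """Expand from `start` hop by hop, collecting edges whose far endpoint
    satisfies the vertex constraint `vertex_ok`."""

    def index(edges):
        # Build an undirected adjacency index: vertex -> list of (src, dst) edges.
        adj = defaultdict(list)
        for src, dst in edges:
            adj[src].append((src, dst))
            adj[dst].append((src, dst))
        return adj

    memory_adj = index(in_memory_edges)
    # Assumption: a real system would query the external store here instead.
    external_adj = index(external_edges)

    visited = set(start)
    frontier = set(start)
    collected = set()
    for _ in range(max_hops):
        next_frontier = set()
        for v in frontier:
            for edge in memory_adj[v] + external_adj[v]:
                other = edge[1] if edge[0] == v else edge[0]
                if vertex_ok(other):
                    collected.add(edge)
                    if other not in visited:
                        visited.add(other)
                        next_frontier.add(other)
        if not next_frontier:
            break
        frontier = next_frontier
    return visited, collected
```

The second method would differ mainly in its starting point: filter all in-memory vertices with `vertex_ok` first, then collect their edges from both sources in a single pass.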
-
Publication No.: US20240330130A1
Publication date: 2024-10-03
Application No.: US18740689
Application date: 2024-06-12
Applicant: Oracle International Corporation
Inventor: Miroslav Cepek, Iraklis Psaroudakis, Rhicheek Patra, Timothy Trovatelli
CPC classification number: G06F11/1476, G06N3/04, G06V30/18181
Abstract: Herein is machine learning for anomalous graph detection based on graph embedding, shuffling, comparison, and unsupervised training techniques that can characterize an unfamiliar graph. In an embodiment, a computer obtains many known vectors that respectively represent known graphs. A new vector is generated that represents a new graph that contains multiple vertices. The new vector may contain an arithmetic aggregation of vertex vectors that respectively represent multiple vertices and/or a vector that represents a virtual vertex that is connected to the multiple vertices by respective virtual edges. In the many known vectors, some similar vectors that are similar to the new vector are identified. The new graph is automatically characterized based on a subset of the known graphs that the similar vectors represent.
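Illustrative sketch (not taken from the patent): one reading of the abstract above is to average vertex vectors into a graph vector, rank known graph vectors by similarity, and characterize the new graph from its nearest neighbours. Cosine similarity, the majority vote, the parameter `k`, and the labels attached to known graphs are assumptions chosen for this example.

```python
import math
from collections import Counter


def graph_vector(vertex_vectors):
    """Aggregate vertex vectors into one graph vector by element-wise mean."""
    n = len(vertex_vectors)
    return [sum(column) / n for column in zip(*vertex_vectors)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def characterize(new_vec, known, k=3):
    """`known` is a list of (vector, label) pairs.
    Return the majority label among the k most similar known vectors."""
    ranked = sorted(known, key=lambda item: cosine(new_vec, item[0]), reverse=True)
    labels = Counter(label for _, label in ranked[:k])
    return labels.most_common(1)[0][0]
```

The virtual-vertex variant mentioned in the abstract would replace `graph_vector` with the learned embedding of an extra vertex connected to every real vertex; that requires a trained embedding model and is not shown here.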
-
Publication No.: US12079282B2
Publication date: 2024-09-03
Application No.: US16989306
Application date: 2020-08-10
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Aras Mumcuyan, Iraklis Psaroudakis, Miroslav Cepek, Rhicheek Patra
IPC: G06F40/00, G06F16/903, G06F18/2113, G06F18/214, G06F18/22, G06F40/30, G06N3/045, G06N3/08, G06N5/04
CPC classification number: G06F16/90344, G06F18/2113, G06F18/214, G06F18/22, G06F40/30, G06N3/045, G06N3/08, G06N5/04
Abstract: Techniques are described herein for a Name Matching Engine that integrates two Machine Learning (ML) module options. The first ML module is a feature-engineered classifier that boosts text-based name matching techniques with a binary classifier ML model. The feature-engineered classifier comprises a first stage of text-based candidate finding, and a second stage in which a binary classifier model predicts whether each string, of the candidate match list, is a match or not. The binary classifier model is based on features from two or more of: a name feature level, a word feature level, a character feature level, and an initial feature level. The second ML module of the Name Matching Engine comprises an end-to-end Recurrent Neural Network (RNN) model that directly accepts name strings as a sequence of n-grams and generates learned text embeddings. The text embeddings of matching name strings are close to each other in the feature space.
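Illustrative sketch (not taken from the patent): the two-stage feature-engineered module described above could be approximated by a text-based candidate filter followed by feature extraction for a binary classifier. The character-bigram Jaccard filter, the `threshold` value, and the four specific features (one per level named in the abstract) are assumptions for this example; the trained classifier itself is omitted.

```python
def ngrams(text, n=2):
    """Set of lowercase character n-grams of `text`."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a, b):
    """Jaccard overlap of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_candidates(query, names, threshold=0.3):
    """Stage 1: keep names whose bigram overlap with the query meets the threshold."""
    q = ngrams(query)
    return [name for name in names if jaccard(q, ngrams(name)) >= threshold]


def features(query, candidate):
    """Stage 2 input: one hypothetical feature per level
    (name, word, character, initial)."""
    q_words = query.lower().split()
    c_words = candidate.lower().split()
    return {
        "name_exact": float(query.lower() == candidate.lower()),
        "word_overlap": jaccard(set(q_words), set(c_words)),
        "char_overlap": jaccard(ngrams(query), ngrams(candidate)),
        "initials_match": float(
            [w[0] for w in q_words] == [w[0] for w in c_words]
        ),
    }
```

A binary classifier trained on labelled name pairs would consume the `features` dictionary for each surviving candidate and predict match or no match.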
-
Publication No.: US12050522B2
Publication date: 2024-07-30
Application No.: US17577711
Application date: 2022-01-18
Applicant: Oracle International Corporation
Inventor: Miroslav Cepek, Iraklis Psaroudakis, Rhicheek Patra, Timothy Trovatelli
CPC classification number: G06F11/1476, G06N3/04, G06V30/18181
Abstract: Herein is machine learning for anomalous graph detection based on graph embedding, shuffling, comparison, and unsupervised training techniques that can characterize an unfamiliar graph. In an embodiment, a computer obtains many known vectors that respectively represent known graphs. A new vector is generated that represents a new graph that contains multiple vertices. The new vector may contain an arithmetic aggregation of vertex vectors that respectively represent multiple vertices and/or a vector that represents a virtual vertex that is connected to the multiple vertices by respective virtual edges. In the many known vectors, some similar vectors that are similar to the new vector are identified. The new graph is automatically characterized based on a subset of the known graphs that the similar vectors represent.
-
Publication No.: US20230229570A1
Publication date: 2023-07-20
Application No.: US17577711
Application date: 2022-01-18
Applicant: Oracle International Corporation
Inventor: Miroslav Cepek, Iraklis Psaroudakis, Rhicheek Patra, Timothy Trovatelli
CPC classification number: G06F11/1476, G06V30/18181, G06N3/04
Abstract: Herein is machine learning for anomalous graph detection based on graph embedding, shuffling, comparison, and unsupervised training techniques that can characterize an unfamiliar graph. In an embodiment, a computer obtains many known vectors that respectively represent known graphs. A new vector is generated that represents a new graph that contains multiple vertices. The new vector may contain an arithmetic aggregation of vertex vectors that respectively represent multiple vertices and/or a vector that represents a virtual vertex that is connected to the multiple vertices by respective virtual edges. In the many known vectors, some similar vectors that are similar to the new vector are identified. The new graph is automatically characterized based on a subset of the known graphs that the similar vectors represent.
-
Publication No.: US11593398B2
Publication date: 2023-02-28
Application No.: US17067479
Application date: 2020-10-09
Applicant: Oracle International Corporation
Inventor: Iraklis Psaroudakis, Stefan Kaestle, Daniel J. Goodman, Jean-Pierre Lozi, Matthias Grimmer, Timothy L. Harris
Abstract: Adaptive data collections may include various types of data arrays, sets, bags, maps, and other data structures. A simple interface for each adaptive collection may provide access via a unified API to adaptive implementations of the collection. A single adaptive data collection may include multiple, different adaptive implementations. A system configured to implement adaptive data collections may include the ability to adaptively select between various implementations, either manually or automatically, and to map a given workload to differing hardware configurations. Additionally, hardware resource needs of different configurations may be predicted from a small number of workload measurements. Adaptive data collections may provide language interoperability, such as by leveraging runtime compilation to build adaptive data collections and to compile and optimize implementation code and user code together. Adaptive data collections may also provide language independence, such that implementation code may be written once and subsequently used from multiple programming languages.
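Illustrative sketch (not taken from the patent): the idea of a unified API over interchangeable implementations can be shown with a set that starts as a compact list and switches to a hash set as it grows. The class name `AdaptiveSet`, the size-based trigger, and the `switch_at` threshold are assumptions for this example; the abstract describes selection driven by workload and hardware configuration, which this toy does not model.

```python
class AdaptiveSet:
    """One interface, two backing implementations:
    a list while small, a hash set once the collection grows."""

    def __init__(self, switch_at=8):
        self._switch_at = switch_at  # hypothetical switching threshold
        self._items = []             # start with the compact list implementation

    @property
    def implementation(self):
        """Name of the backing implementation currently in use."""
        return type(self._items).__name__

    def add(self, value):
        if isinstance(self._items, list):
            if value not in self._items:
                self._items.append(value)
            if len(self._items) > self._switch_at:
                # Adapt: callers keep the same API, only the backing store changes.
                self._items = set(self._items)
        else:
            self._items.add(value)

    def __contains__(self, value):
        return value in self._items

    def __len__(self):
        return len(self._items)
```

Callers use `add`, `in`, and `len` throughout and never observe the switch except through the `implementation` property.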
-
Publication No.: US10853137B2
Publication date: 2020-12-01
Application No.: US16351377
Application date: 2019-03-12
Applicant: Oracle International Corporation
Inventor: Vlad Ioan Haprian, Iraklis Psaroudakis, Alexander Weld, Oskar Van Rest, Sungpack Hong, Hassan Chafi
Abstract: Techniques are described herein for allocating and rebalancing computing resources for executing graph workloads in a manner that increases system throughput. According to one embodiment, a method includes receiving a request to execute a graph processing workload on a dataset, identifying a plurality of graph operators that constitute the graph processing workload, and determining whether execution of each graph operator is processor intensive or memory intensive. The method also includes assigning a task weight for each graph operator of the plurality of graph operators, and performing, based on the assigned task weights, a first allocation of computing resources to execute the plurality of graph operators. Further, the method includes causing, according to the first allocation, execution of the plurality of graph operators by the computing resources, and monitoring computing resource usage of graph operators executed by the computing resources according to the first allocation. In addition, the method includes performing, responsive to monitoring computing resource usage, a second allocation of computing resources to execute the plurality of graph operators, and causing, according to the second allocation instead of according to the first allocation, execution of the plurality of graph operators by the computing resources.
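Illustrative sketch (not taken from the patent): the weight, allocate, monitor, reallocate loop in the abstract above can be pictured as a proportional split of threads that is recomputed after scaling each weight by observed utilisation. The weight table `KIND_WEIGHT`, the proportional rule, and the utilisation-scaling step are assumptions for this example.

```python
# Hypothetical task weights per operator kind.
KIND_WEIGHT = {"processor": 3.0, "memory": 1.0}


def initial_weights(operator_kinds):
    """Assign a task weight to each operator from its classification."""
    return {op: KIND_WEIGHT[kind] for op, kind in operator_kinds.items()}


def allocate(weights, total_threads):
    """Split `total_threads` across operators in proportion to their weights.
    Threads lost to rounding go to the heaviest operators first."""
    total = sum(weights.values())
    alloc = {op: int(total_threads * w / total) for op, w in weights.items()}
    leftover = total_threads - sum(alloc.values())
    for op in sorted(weights, key=weights.get, reverse=True)[:leftover]:
        alloc[op] += 1
    return alloc


def rebalance(weights, observed_usage):
    """Scale each weight by the observed utilisation (0.0 to 1.0) of its share,
    so operators that left resources idle get a smaller weight next time."""
    return {op: w * observed_usage[op] for op, w in weights.items()}
```

A first allocation comes from `allocate(initial_weights(...), n)`; after monitoring, `allocate(rebalance(...), n)` yields the second allocation that replaces it.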
-
Publication No.: US20200293372A1
Publication date: 2020-09-17
Application No.: US16351377
Application date: 2019-03-12
Applicant: Oracle International Corporation
Inventor: Vlad Ioan Haprian, Iraklis Psaroudakis, Alexander Weld, Oskar Van Rest, Sungpack Hong, Hassan Chafi
IPC: G06F9/50, G06F9/38, G06F16/901, G06K9/62, G06F11/30
Abstract: Techniques are described herein for allocating and rebalancing computing resources for executing graph workloads in a manner that increases system throughput. According to one embodiment, a method includes receiving a request to execute a graph processing workload on a dataset, identifying a plurality of graph operators that constitute the graph processing workload, and determining whether execution of each graph operator is processor intensive or memory intensive. The method also includes assigning a task weight for each graph operator of the plurality of graph operators, and performing, based on the assigned task weights, a first allocation of computing resources to execute the plurality of graph operators. Further, the method includes causing, according to the first allocation, execution of the plurality of graph operators by the computing resources, and monitoring computing resource usage of graph operators executed by the computing resources according to the first allocation. In addition, the method includes performing, responsive to monitoring computing resource usage, a second allocation of computing resources to execute the plurality of graph operators, and causing, according to the second allocation instead of according to the first allocation, execution of the plurality of graph operators by the computing resources.
-
Publication No.: US12282486B2
Publication date: 2025-04-22
Application No.: US17733011
Application date: 2022-04-29
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Iraklis Psaroudakis, Giulia Carocari, Andrea Ziani, Miroslav Cepek
IPC: G06F16/2457, G06F16/2458, G06F16/29, G06N3/08, G06N3/10
Abstract: Techniques are described herein for address matching from a single address string to an address matching score. In an embodiment, an address string is received and parsed into parsed address data. Once an address string is parsed into parsed address data, the parsed address data is standardized by converting the parsed address data into a standard format and replacing abbreviations and colloquial names with formal names. Once an address string has been standardized into a standardized street locale, candidate addresses that are identical to or similar to the standardized street locale are identified and are assigned a score. Each score comprises a probability that the respective candidate address and the standardized street locale represent a same place or location.
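Illustrative sketch (not taken from the patent): the parse, standardize, score pipeline above can be mimicked with a regular-expression tokenizer, a small abbreviation table, and a string-similarity ratio. The `ABBREVIATIONS` table is a hypothetical stand-in, and `difflib.SequenceMatcher.ratio()` is a plain similarity measure substituted for the probability score the abstract describes.

```python
import difflib
import re

# Hypothetical, deliberately tiny abbreviation table.
ABBREVIATIONS = {"st": "street", "ave": "avenue", "rd": "road"}


def standardize(address):
    """Lowercase, strip punctuation, and expand known abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", address.lower())
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def score_candidates(address, candidates):
    """Return (candidate, similarity) pairs, most similar first.
    The similarity is a string ratio in [0, 1], not a calibrated probability."""
    target = standardize(address)
    scored = [
        (c, difflib.SequenceMatcher(None, target, standardize(c)).ratio())
        for c in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Standardizing both sides before comparison is what lets "123 Main St." and "123 Main Street" receive a perfect score in this sketch.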
-
Publication No.: US20240370500A1
Publication date: 2024-11-07
Application No.: US18773452
Application date: 2024-07-15
Applicant: Oracle International Corporation
Inventor: Aras Mumcuyan, Iraklis Psaroudakis, Miroslav Cepek, Rhicheek Patra
IPC: G06F16/903, G06F18/2113, G06F18/214, G06F18/22, G06F40/30, G06N3/045, G06N3/08, G06N5/04
Abstract: Techniques are described herein for a Name Matching Engine that integrates two Machine Learning (ML) module options. The first ML module is a feature-engineered classifier that boosts text-based name matching techniques with a binary classifier ML model. The feature-engineered classifier comprises a first stage of text-based candidate finding, and a second stage in which a binary classifier model predicts whether each string, of the candidate match list, is a match or not. The binary classifier model is based on features from two or more of: a name feature level, a word feature level, a character feature level, and an initial feature level. The second ML module of the Name Matching Engine comprises an end-to-end Recurrent Neural Network (RNN) model that directly accepts name strings as a sequence of n-grams and generates learned text embeddings. The text embeddings of matching name strings are close to each other in the feature space.