-
Publication No.: US12079282B2
Publication date: 2024-09-03
Application No.: US16989306
Application date: 2020-08-10
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Aras Mumcuyan, Iraklis Psaroudakis, Miroslav Cepek, Rhicheek Patra
IPC: G06F40/00, G06F16/903, G06F18/2113, G06F18/214, G06F18/22, G06F40/30, G06N3/045, G06N3/08, G06N5/04
CPC classification number: G06F16/90344, G06F18/2113, G06F18/214, G06F18/22, G06F40/30, G06N3/045, G06N3/08, G06N5/04
Abstract: Techniques are described herein for a Name Matching Engine that integrates two Machine Learning (ML) module options. The first ML module is a feature-engineered classifier that boosts text-based name matching techniques with a binary classifier ML model. The feature-engineered classifier comprises a first stage of text-based candidate finding, and a second stage in which a binary classifier model predicts whether each string, of the candidate match list, is a match or not. The binary classifier model is based on features from two or more of: a name feature level, a word feature level, a character feature level, and an initial feature level. The second ML module of the Name Matching Engine comprises an end-to-end Recurrent Neural Network (RNN) model that directly accepts name strings as a sequence of n-grams and generates learned text embeddings. The text embeddings of matching name strings are close to each other in the feature space.
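Illustrative sketch (not taken from the patent text): the two-stage flow of the feature-engineered classifier described in the abstract could look roughly like the Python below. All function names, the trigram Jaccard filter, the 0.2 threshold, and the specific feature formulas are assumptions chosen for illustration; the abstract only states that a text-based candidate-finding stage is followed by a binary classifier using features from the name, word, character, and initial levels.

```python
from difflib import SequenceMatcher


def char_ngrams(name, n=3):
    """Return the set of character n-grams of a lowercased, space-padded name."""
    padded = f" {name.lower().strip()} "
    return {padded[i:i + n] for i in range(max(len(padded) - n + 1, 1))}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_candidates(query, names, threshold=0.2):
    """Stage 1 (assumed form): cheap text-based filter over known names."""
    q = char_ngrams(query)
    return [name for name in names if jaccard(q, char_ngrams(name)) >= threshold]


def pair_features(a, b):
    """Stage 2 input (assumed form): one feature per level named in the abstract."""
    a_low, b_low = a.lower(), b.lower()
    words_a, words_b = set(a_low.split()), set(b_low.split())
    initials_a = {w[0] for w in words_a}
    initials_b = {w[0] for w in words_b}
    return {
        "name_level": SequenceMatcher(None, a_low, b_low).ratio(),
        "word_level": jaccard(words_a, words_b),
        "char_level": jaccard(char_ngrams(a, 2), char_ngrams(b, 2)),
        "initial_level": jaccard(initials_a, initials_b),
    }


if __name__ == "__main__":
    known = ["John Smith", "Jane Doe", "Acme Corp"]
    for candidate in find_candidates("Jon Smith", known):
        print(candidate, pair_features("Jon Smith", candidate))
```

In a full pipeline, the feature dictionaries produced for each candidate would be passed to a trained binary classifier that labels each candidate as a match or a non-match; that model is omitted here because the abstract does not specify it.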
-
Publication No.: US20240370500A1
Publication date: 2024-11-07
Application No.: US18773452
Application date: 2024-07-15
Applicant: Oracle International Corporation
Inventor: Aras Mumcuyan, Iraklis Psaroudakis, Miroslav Cepek, Rhicheek Patra
IPC: G06F16/903, G06F18/2113, G06F18/214, G06F18/22, G06F40/30, G06N3/045, G06N3/08, G06N5/04
Abstract: Techniques are described herein for a Name Matching Engine that integrates two Machine Learning (ML) module options. The first ML module is a feature-engineered classifier that boosts text-based name matching techniques with a binary classifier ML model. The feature-engineered classifier comprises a first stage of text-based candidate finding, and a second stage in which a binary classifier model predicts whether each string, of the candidate match list, is a match or not. The binary classifier model is based on features from two or more of: a name feature level, a word feature level, a character feature level, and an initial feature level. The second ML module of the Name Matching Engine comprises an end-to-end Recurrent Neural Network (RNN) model that directly accepts name strings as a sequence of n-grams and generates learned text embeddings. The text embeddings of matching name strings are close to each other in the feature space.
-
Publication No.: US20210287069A1
Publication date: 2021-09-16
Application No.: US16989306
Application date: 2020-08-10
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Aras Mumcuyan, Iraklis Psaroudakis, Miroslav Cepek, Rhicheek Patra
IPC: G06N3/04, G06F16/903, G06K9/62, G06N5/04, G06F40/30
Abstract: Techniques are described herein for a Name Matching Engine that integrates two Machine Learning (ML) module options. The first ML module is a feature-engineered classifier that boosts text-based name matching techniques with a binary classifier ML model. The feature-engineered classifier comprises a first stage of text-based candidate finding, and a second stage in which a binary classifier model predicts whether each string, of the candidate match list, is a match or not. The binary classifier model is based on features from two or more of: a name feature level, a word feature level, a character feature level, and an initial feature level. The second ML module of the Name Matching Engine comprises an end-to-end Recurrent Neural Network (RNN) model that directly accepts name strings as a sequence of n-grams and generates learned text embeddings. The text embeddings of matching name strings are close to each other in the feature space.
-