NUMERIC EMBEDDINGS FOR ENTITY-MATCHING

    Publication No.: US20220391414A1

    Publication date: 2022-12-08

    Application No.: US17375720

    Application date: 2021-07-14

    Applicant: SAP SE

    Abstract: Pairwise entity matching systems and methods are disclosed herein. A deep learning model may be used to match entities from separate data tables. Entities may be preprocessed to fuse textual and numeric data early in the neural network architecture. Numeric data may be represented as a vector of a geometrically progressing function. By fusing textual and numeric data, including dates, early in the neural network architecture the neural network may better learn the relationships between the numeric and textual data. Once preprocessed, the paired entities may be scored and matched using a neural network.
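
One way to picture a "vector of a geometrically progressing function" is a set of sine/cosine features whose scales grow by a constant factor. The sketch below is an illustration only: the abstract does not say which function is used, so the sinusoidal form, the function name `numeric_embedding`, and the parameters `dim` and `base` are all assumptions.

```python
import math

def numeric_embedding(value, dim=8, base=10.0):
    """Embed a scalar as sin/cos features over geometrically spaced scales.

    Assumption: the sinusoidal form is illustrative; the abstract only says
    the vector comes from a geometrically progressing function.
    """
    features = []
    for k in range(dim // 2):
        scale = base ** k  # 1, base, base**2, ...: a geometric progression
        features.append(math.sin(value / scale))
        features.append(math.cos(value / scale))
    return features

# Example: embed an invoice amount and a date encoded as a day count
# (both inputs are hypothetical).
amount_vector = numeric_embedding(1234.5)
date_vector = numeric_embedding(18822)
```

A vector like this could be concatenated with token embeddings before the first network layer, which is one possible reading of fusing numeric and textual data "early" in the architecture.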

    Numeric embeddings for entity-matching

    Publication No.: US11615120B2

    Publication date: 2023-03-28

    Application No.: US17375720

    Application date: 2021-07-14

    Applicant: SAP SE

    Abstract: Pairwise entity matching systems and methods are disclosed herein. A deep learning model may be used to match entities from separate data tables. Entities may be preprocessed to fuse textual and numeric data early in the neural network architecture. Numeric data may be represented as a vector of a geometrically progressing function. By fusing textual and numeric data, including dates, early in the neural network architecture the neural network may better learn the relationships between the numeric and textual data. Once preprocessed, the paired entities may be scored and matched using a neural network.

    EFFICIENT SEARCH FOR COMBINATIONS OF MATCHING ENTITIES GIVEN CONSTRAINTS

    Publication No.: US20230222147A1

    Publication date: 2023-07-13

    Application No.: US17647477

    Application date: 2022-01-10

    Applicant: SAP SE

    CPC classification number: G06F16/3347 G06F16/3346 G06F16/325 G06F16/35

    Abstract: Methods, systems, and computer-readable storage media for receiving a set of inference results generated by a ML model, the inference results including a set of query entities and a set of target entities, each query entity having one or more target entities matched thereto by the ML model, processing the set of inference results to generate a set of matched sub-sets of target entities by executing a search over target entities in the set of target entities based on constraints, for each problem in a set of problems, providing the problem as a tuple including an index value representative of a target entity in the set of target entities and a value associated with the query entity, the value including a constraint relative to the query entity, and executing at least one task in response to one or more matched sub-sets in the set of matched sub-sets.
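
The search described in this abstract can be pictured as a subset search in which each pending problem is a tuple of a target-entity index and a remaining value. The sketch below is a simplified illustration, not the claimed method: it assumes the constraint is that the chosen target amounts sum exactly to the query amount, that amounts are positive integers (for example, cents), and it uses the hypothetical name `find_matching_subsets`.

```python
def find_matching_subsets(amounts, target):
    """Return every list of indices into `amounts` whose values sum to `target`.

    Each pending problem is a tuple (index, remaining, chosen): the next
    candidate index, the amount still to cover, and the indices picked so far.
    Assumption: all amounts are positive integers, so a negative remainder
    can be pruned.
    """
    matches = []
    stack = [(0, target, ())]
    while stack:
        index, remaining, chosen = stack.pop()
        if remaining == 0 and chosen:
            matches.append(list(chosen))  # constraint satisfied
            continue
        if index == len(amounts) or remaining < 0:
            continue  # dead end: no candidates left, or overshoot
        # Branch 1: skip the candidate at `index`.
        stack.append((index + 1, remaining, chosen))
        # Branch 2: take the candidate at `index`.
        stack.append((index + 1, remaining - amounts[index], chosen + (index,)))
    return matches

# Hypothetical data: one payment of 1000 matched against four open items.
subsets = find_matching_subsets([500, 300, 200, 700], 1000)
```

Here the index pairs with the remaining value in each tuple, so the search state stays small and the loop needs no recursion.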

    ROBUST ENTITY MATCHING USING MACHINE LEARNING MODELS TRAINED ON HISTORICAL CUSTOMER DATA

    Publication No.: US20240185091A1

    Publication date: 2024-06-06

    Application No.: US18074574

    Application date: 2022-12-05

    Applicant: SAP SE

    CPC classification number: G06N5/022 G06Q30/0201

    Abstract: Disclosed herein are system, method, and computer program product embodiments for dropping or replacing data from datasets and training ML models to avoid overfitting in training data. An embodiment operates by generating a first set of data, wherein the first set of data may include a first plurality of entities. The first set of data may be modified by processing the first set of data, which results in a second set of data. The second set of data may include a second plurality of entities. The second set of data may be extracted to be used in a machine learning (ML) process based at least in part on at least one ML model. The second set of data may be trained on at least one ML model. A third set of data may be predicted based on the at least one ML model. The third set of data may include a third plurality of entities. The first, second, and third plurality of entities may be classified by a class.
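
Dropping or replacing data before training can be sketched as a field-level augmentation step. The example below is only an illustration under stated assumptions: the abstract does not say which fields are modified or how, so the random per-field replacement, the name `augment_record`, and the parameters `drop_prob` and `placeholder` are hypothetical.

```python
import random

def augment_record(record, drop_prob=0.2, placeholder="", rng=None):
    """Return a copy of `record` with each field replaced by `placeholder`
    with probability `drop_prob`.

    Assumption: independent per-field replacement is illustrative; the
    abstract does not specify the modification strategy.
    """
    rng = rng or random.Random()
    return {
        field: (placeholder if rng.random() < drop_prob else value)
        for field, value in record.items()
    }

# Hypothetical entity with three fields; a fixed seed keeps runs repeatable.
entity = {"customer": "ACME Corp", "reference": "INV-001", "amount": "1000.00"}
augmented = augment_record(entity, drop_prob=0.5, rng=random.Random(0))
```

Training on such modified copies alongside the originals is one common way to keep a model from relying too heavily on any single field.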

    Efficient search for combinations of matching entities given constraints

    Publication No.: US11687575B1

    Publication date: 2023-06-27

    Application No.: US17647477

    Application date: 2022-01-10

    Applicant: SAP SE

    CPC classification number: G06F16/3347 G06F16/325 G06F16/3346 G06F16/35

    Abstract: Methods, systems, and computer-readable storage media for receiving a set of inference results generated by a ML model, the inference results including a set of query entities and a set of target entities, each query entity having one or more target entities matched thereto by the ML model, processing the set of inference results to generate a set of matched sub-sets of target entities by executing a search over target entities in the set of target entities based on constraints, for each problem in a set of problems, providing the problem as a tuple including an index value representative of a target entity in the set of target entities and a value associated with the query entity, the value including a constraint relative to the query entity, and executing at least one task in response to one or more matched sub-sets in the set of matched sub-sets.
