-
1.
Publication No.: US20240185091A1
Publication date: 2024-06-06
Application No.: US18074574
Application date: 2022-12-05
Applicant: SAP SE
Inventor: Stefan Klaus Baur, Matthias Frank, Hoang-Vu Nguyen
IPC: G06N5/022, G06Q30/0201
CPC classification number: G06N5/022, G06Q30/0201
Abstract: Disclosed herein are system, method, and computer program product embodiments for dropping or replacing data from datasets and training ML models to avoid overfitting in training data. An embodiment operates by generating a first set of data, wherein the first set of data may include a first plurality of entities. The first set of data may be modified by processing the first set of data, which results in a second set of data. The second set of data may include a second plurality of entities. The second set of data may be extracted to be used in a machine learning (ML) process based at least in part on at least one ML model. The second set of data may be trained on at least one ML model. A third set of data may be predicted based on the at least one ML model. The third set of data may include a third plurality of entities. The first, second, and third plurality of entities may be classified by a class.
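Illustrative sketch (not taken from the patent): the abstract does not say how the first set of data is modified, so the following assumes the simplest reading, in which one field of each record is randomly dropped or replaced with a value drawn from another record before training. The function name `drop_or_replace` and the parameters `p_drop`, `p_replace`, and `seed` are hypothetical.

```python
import random


def drop_or_replace(records, field, p_drop=0.2, p_replace=0.2, seed=0):
    """Return a modified copy of `records` (a list of dicts).

    For each record, `field` is removed with probability `p_drop`, or
    replaced by the value of that field from a randomly chosen record with
    probability `p_replace`. The input list is left untouched.
    """
    rng = random.Random(seed)
    # Pool of observed values used as replacements (assumption).
    pool = [r[field] for r in records if field in r]
    modified = []
    for record in records:
        record = dict(record)  # shallow copy so the original is not mutated
        if field in record:
            u = rng.random()
            if u < p_drop:
                del record[field]
            elif u < p_drop + p_replace:
                record[field] = rng.choice(pool)
        modified.append(record)
    return modified


first_set = [
    {"id": 1, "reference": "INV-001", "amount": 100},
    {"id": 2, "reference": "INV-002", "amount": 250},
    {"id": 3, "reference": "INV-003", "amount": 75},
]
second_set = drop_or_replace(first_set, "reference", p_drop=0.3, p_replace=0.3)
```

The intent under this assumption is that a model trained on `second_set` cannot rely on the modified field alone, which is one common way to discourage overfitting to a single highly predictive column.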
-
2.
Publication No.: US20240045890A1
Publication date: 2024-02-08
Application No.: US17817388
Application date: 2022-08-04
Applicant: SAP SE
Inventor: Hoang-Vu Nguyen, Li Rong Wang, Matthias Frank, Rajesh Vellore Arumugam, Stefan Klaus Baur, Sundeep Gullapudi
IPC: G06F16/28, G06N20/00, G06F16/2457, G06K9/62
CPC classification number: G06F16/288, G06N20/00, G06F16/24578, G06K9/6215
Abstract: Methods, systems, and computer-readable storage media for a machine learning (ML) system for matching a query entity to one or more target entities, the ML system reducing a number of query-target entity pairs from consideration as potential matches during inference.
-
3.
Publication No.: US11687575B1
Publication date: 2023-06-27
Application No.: US17647477
Application date: 2022-01-10
Applicant: SAP SE
Inventor: Hoang-Vu Nguyen, Rajesh Vellore Arumugam, Matthias Frank, Stefan Klaus Baur
CPC classification number: G06F16/3347, G06F16/325, G06F16/3346, G06F16/35
Abstract: Methods, systems, and computer-readable storage media for receiving a set of inference results generated by a ML model, the inference results including a set of query entities and a set of target entities, each query entity having one or more target entities matched thereto by the ML model, processing the set of inference results to generate a set of matched sub-sets of target entities by executing a search over target entities in the set of target entities based on constraints, for each problem in a set of problems, providing the problem as a tuple including an index value representative of a target entity in the set of target entities and a value associated with the query entity, the value including a constraint relative to the query entity, and executing at least one task in response to one or more matched sub-sets in the set of matched sub-sets.
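Illustrative sketch (not taken from the patent): the abstract describes a constrained search in which each sub-problem is a tuple of a target index and a value tied to the query entity. The code below assumes one plausible instance, where the constraint is that the amounts of the selected target entities must sum to the query amount. The function name `matched_subsets`, the integer amounts, and the extra `chosen` element carried in each tuple are assumptions made for the example.

```python
def matched_subsets(query_amount, target_amounts):
    """Return every tuple of target indices whose amounts sum to query_amount.

    Amounts are integers (for example, cents) so equality is exact.
    Assumes positive target amounts.
    """
    results = []
    # Each problem: (index of next target to consider,
    #                amount still to be covered for the query,
    #                indices chosen so far).
    stack = [(0, query_amount, ())]
    while stack:
        index, remaining, chosen = stack.pop()
        if chosen and remaining == 0:
            results.append(chosen)  # constraint satisfied
            continue
        if index == len(target_amounts):
            continue  # no targets left to try
        # Branch 1: skip the target at `index`.
        stack.append((index + 1, remaining, chosen))
        # Branch 2: take the target at `index`.
        stack.append(
            (index + 1, remaining - target_amounts[index], chosen + (index,))
        )
    return results


# One payment of 100 against four open items.
subsets = matched_subsets(100, [60, 40, 100, 30])
```

In this example the search finds two candidate sub-sets: items 0 and 1 together (60 + 40), and item 2 alone. An exhaustive search like this is exponential in the number of targets, so a practical system would need to prune or cap the candidates first.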
-
4.
Publication No.: US20250117663A1
Publication date: 2025-04-10
Application No.: US18480635
Application date: 2023-10-04
Applicant: SAP SE
Inventor: Rajesh Vellore Arumugam, Donglin Ruan, Matthias Frank, Yi Quan Zhou
IPC: G06N3/0895
Abstract: Methods, systems, and computer-readable storage media for training a global matching ML model using a set of enterprise data associated with a set of enterprises, receiving a subset of enterprise data associated with an enterprise that is absent from the set of enterprises, fine tuning the global matching ML model using the subset of enterprise data to provide a fine-tuned matching ML model, deploying the fine-tuned matching ML model for inference, receiving feedback to one or more inference results generated by the fine-tuned matching ML model, receiving synthetic data from a LLM system in response to at least a portion of the feedback, and fine tuning one or more of the global matching ML model and the fine-tuned ML model using the synthetic data.
-
5.
Publication No.: US20250068965A1
Publication date: 2025-02-27
Application No.: US18455775
Application date: 2023-08-25
Applicant: SAP SE
Inventor: Matthias Frank, Sundeep Gullapudi, Rajesh Vellore Arumugam, Anantharaman Ravi, Prawira Putra Fadjar, Yi Quan Zhou
Abstract: Methods, systems, and computer-readable storage media for receiving a real data table, providing a synthetic structured table based on the real data table, providing a sampled data table comprising a sub-set of real data of the real data table, transmitting a prompt to a LLM system, the prompt being generated based on the real data table and the synthetic structured data table, receiving synthetic unstructured data from the LLM system, providing an aggregate synthetic table that includes at least a portion of the synthetic unstructured data, and training a ML model using the aggregate synthetic table.
-
6.
Publication No.: US20230325708A1
Publication date: 2023-10-12
Application No.: US17718850
Application date: 2022-04-12
Applicant: SAP SE
Inventor: Stefan Klaus Baur, Matthias Frank, Hoang-Vu Nguyen, Kannan Presanna Kumar
IPC: G06N20/00, G06V10/74, G06V10/766
CPC classification number: G06N20/00, G06V10/761, G06V10/766
Abstract: Computer-readable media, methods, and systems are disclosed for feature attribution in a machine learning model. Samples may be generated for a machine learning model based on a normalized probability distribution. The samples may be used to determine a weight for features and feature pairs for the machine learning model. The weights of the features and feature pairs may be used to determine which features are significant for predictions within the machine learning model.
-
7.
Publication No.: US20230222147A1
Publication date: 2023-07-13
Application No.: US17647477
Application date: 2022-01-10
Applicant: SAP SE
Inventor: Hoang-Vu Nguyen, Rajesh Vellore Arumugam, Matthias Frank, Stefan Klaus Baur
CPC classification number: G06F16/3347, G06F16/3346, G06F16/325, G06F16/35
Abstract: Methods, systems, and computer-readable storage media for receiving a set of inference results generated by a ML model, the inference results including a set of query entities and a set of target entities, each query entity having one or more target entities matched thereto by the ML model, processing the set of inference results to generate a set of matched sub-sets of target entities by executing a search over target entities in the set of target entities based on constraints, for each problem in a set of problems, providing the problem as a tuple including an index value representative of a target entity in the set of target entities and a value associated with the query entity, the value including a constraint relative to the query entity, and executing at least one task in response to one or more matched sub-sets in the set of matched sub-sets.
-
8.
Publication No.: US20250077773A1
Publication date: 2025-03-06
Application No.: US18358225
Application date: 2023-07-25
Applicant: SAP SE
Inventor: Rajesh Vellore Arumugam, Anantharaman Ravi, Matthias Frank, Sundeep Gullapudi, Yi Quan Zhou
IPC: G06F40/284, G06F16/248, G06F40/40
Abstract: Methods, systems, and computer-readable storage media for receiving, by an entity matching ML model, a query and target pair including a query entity and a target entity, providing, by the entity matching ML model, a query-target prediction by processing the query entity and the target entity, the query-target prediction indicating a match type between the query entity and the target entity, generating a prompt by populating a prompt template with at least a portion of the query-target prediction, inputting the prompt into a large language model (LLM), and receiving, from the LLM, an explanation that is responsive to the prompt and that describes one or more reasons for the query-target prediction output by the entity matching ML model.
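Illustrative sketch (not taken from the patent): the prompt-generation step in the abstract can be pictured as filling a fixed template with the model's prediction. The template wording, the field names, and the function `build_prompt` are assumptions made for the example. The call to the LLM itself is left out because the abstract names no specific API.

```python
from string import Template

# Hypothetical prompt template; the patent abstract does not give its wording.
PROMPT_TEMPLATE = Template(
    "An entity matching model compared the following records.\n"
    "Query entity: $query\n"
    "Target entity: $target\n"
    "Predicted match type: $match_type\n"
    "Explain in plain language the likely reasons for this prediction."
)


def build_prompt(query, target, match_type):
    """Populate the template with a query-target prediction."""
    return PROMPT_TEMPLATE.substitute(
        query=query, target=target, match_type=match_type
    )


prompt = build_prompt(
    query={"payer": "ACME GmbH", "amount": 100},
    target={"customer": "ACME", "open_amount": 100},
    match_type="single match",
)
```

The resulting `prompt` string would then be sent to the LLM, and the returned text treated as the explanation for the prediction.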
-
9.
Publication No.: US20220391414A1
Publication date: 2022-12-08
Application No.: US17375720
Application date: 2021-07-14
Applicant: SAP SE
Inventor: Stefan Klaus Baur, Matthias Frank, Hoang-Vu Nguyen
Abstract: Pairwise entity matching systems and methods are disclosed herein. A deep learning model may be used to match entities from separate data tables. Entities may be preprocessed to fuse textual and numeric data early in the neural network architecture. Numeric data may be represented as a vector of a geometrically progressing function. By fusing textual and numeric data, including dates, early in the neural network architecture the neural network may better learn the relationships between the numeric and textual data. Once preprocessed, the paired entities may be scored and matched using a neural network.
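Illustrative sketch (not taken from the patent): the abstract says numeric data may be represented as a vector of a geometrically progressing function but does not name the function. The code below assumes one possible reading, in which a number is squashed by `tanh` at scales that grow geometrically (1, 10, 100, ...), so that both small and large magnitudes remain distinguishable. The function name `encode_numeric` and the parameters `dim` and `base` are hypothetical.

```python
import math


def encode_numeric(x, dim=8, base=10.0):
    """Encode a number as `dim` features, one per geometric scale.

    Feature k is tanh(x / base**k), so each successive feature saturates
    at a magnitude `base` times larger than the previous one.
    """
    return [math.tanh(x / base ** k) for k in range(dim)]


small = encode_numeric(5.0)
large = encode_numeric(50_000.0)
```

Under this assumption, the resulting vector could be concatenated with text embeddings before the first network layers, which is one way to realize the early fusion of numeric and textual data that the abstract describes.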
-
10.
Publication No.: US12277148B2
Publication date: 2025-04-15
Application No.: US17723586
Application date: 2022-04-19
Applicant: SAP SE
Inventor: Sundeep Gullapudi, Rajesh Vellore Arumugam, Matthias Frank, Wei Xia
IPC: G06F16/31, G06F16/33, G06F16/332, G06F16/3332, G06F40/284
Abstract: Methods, systems, and computer-readable storage media for a ML system that reduces a number of target items from consideration as potential matches to a query item using token embeddings and a search tree.