-
1.
Publication No.: US20240045890A1
Publication date: 2024-02-08
Application No.: US17817388
Filing date: 2022-08-04
Applicant: SAP SE
Inventors: Hoang-Vu Nguyen, Li Rong Wang, Matthias Frank, Rajesh Vellore Arumugam, Stefan Klaus Baur, Sundeep Gullapudi
IPC classification: G06F16/28, G06N20/00, G06F16/2457, G06K9/62
CPC classification: G06F16/288, G06N20/00, G06F16/24578, G06K9/6215
Abstract: Methods, systems, and computer-readable storage media for a machine learning (ML) system for matching a query entity to one or more target entities, the ML system reducing the number of query-target entity pairs considered as potential matches during inference.
-
Publication No.: US11687575B1
Publication date: 2023-06-27
Application No.: US17647477
Filing date: 2022-01-10
Applicant: SAP SE
CPC classification: G06F16/3347, G06F16/325, G06F16/3346, G06F16/35
Abstract: Methods, systems, and computer-readable storage media for receiving a set of inference results generated by an ML model, the inference results including a set of query entities and a set of target entities, each query entity having one or more target entities matched thereto by the ML model, processing the set of inference results to generate a set of matched sub-sets of target entities by executing a search over target entities in the set of target entities based on constraints, for each problem in a set of problems, providing the problem as a tuple including an index value representative of a target entity in the set of target entities and a value associated with the query entity, the value including a constraint relative to the query entity, and executing at least one task in response to one or more matched sub-sets in the set of matched sub-sets.
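The constrained search described in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the constraint is "the values of the selected target entities must sum to the query entity's value" and that all values are positive integers (for example, amounts in cents); the abstract does not specify the constraint. The function name `matched_subsets` is hypothetical. Each sub-problem is kept as a tuple of a target index and the value still to be covered, loosely mirroring the (index value, value) tuple the abstract mentions.

```python
def matched_subsets(target_values, query_value):
    """Return every tuple of target indices whose values sum to query_value.

    Assumes positive integer values (an illustrative assumption).
    Each sub-problem is a tuple (index, remaining): the next target index
    to consider and the part of the query value not yet covered.
    """
    results = []
    stack = [((0, query_value), ())]  # (sub-problem, indices chosen so far)
    while stack:
        (index, remaining), chosen = stack.pop()
        if remaining == 0 and chosen:
            results.append(chosen)  # constraint satisfied
            continue
        if index == len(target_values) or remaining < 0:
            continue  # dead end: no targets left, or sum overshot
        # Branch 1: skip the target at this index.
        stack.append(((index + 1, remaining), chosen))
        # Branch 2: take the target at this index.
        stack.append(
            ((index + 1, remaining - target_values[index]), chosen + (index,))
        )
    return results


if __name__ == "__main__":
    print(sorted(matched_subsets([50, 30, 20, 70], 100)))
```

The search is exhaustive and therefore exponential in the worst case; a production system would presumably bound the depth or prune more aggressively.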
-
Publication No.: US12093300B1
Publication date: 2024-09-17
Application No.: US18463519
Filing date: 2023-09-08
Applicant: SAP SE
IPC classification: G06F7/00, G06F16/33, G06F16/332, G06F16/35, G06F40/174, G06F40/186
CPC classification: G06F16/35, G06F16/3329, G06F16/3344, G06F40/174, G06F40/186
Abstract: Methods, systems, and computer-readable storage media for receiving a first document including structured data and unstructured data, providing a first sub-document and a second sub-document, the first sub-document including the structured data of the first document, the second sub-document including the unstructured data of the first document, generating a prompt using the second sub-document and a second document, inputting the prompt to an LLM, receiving a response from the LLM, providing a calibrated first document by merging the response into the first sub-document, and processing the calibrated first document and the second document using an ML model to provide a prediction, the prediction indicating a matching class between the first document and the second document.
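The split, prompt, and merge steps in this abstract can be sketched as follows. Everything here is an assumption made for illustration: documents are modeled as flat dictionaries, the LLM is any callable that takes a prompt string and returns a JSON object as text, and the prompt wording and the function names (`split_document`, `build_prompt`, `calibrate`) are hypothetical. The downstream ML matching model is not shown.

```python
import json


def split_document(doc, structured_keys):
    """Split a document into its structured and unstructured sub-documents."""
    structured = {k: v for k, v in doc.items() if k in structured_keys}
    unstructured = {k: v for k, v in doc.items() if k not in structured_keys}
    return structured, unstructured


def build_prompt(unstructured, second_doc):
    """Build a prompt from the unstructured sub-document and the second document."""
    return (
        "Extract from the free text any fields that help match it to the "
        "reference document. Answer with a JSON object.\n"
        f"Free text: {json.dumps(unstructured)}\n"
        f"Reference document: {json.dumps(second_doc)}"
    )


def calibrate(doc, second_doc, structured_keys, llm):
    """Return the calibrated first document.

    `llm` is a hypothetical callable: prompt string in, JSON text out.
    """
    structured, unstructured = split_document(doc, structured_keys)
    extracted = json.loads(llm(build_prompt(unstructured, second_doc)))
    # Design assumption: existing structured values win over extracted ones.
    return {**extracted, **structured}


if __name__ == "__main__":
    fake_llm = lambda prompt: '{"invoice_ref": "INV-42"}'  # stand-in for a real LLM
    first = {"amount": 100, "memo": "payment for INV-42"}
    second = {"invoice_ref": "INV-42", "amount": 100}
    print(calibrate(first, second, {"amount"}, fake_llm))
```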
-
Publication No.: US20230222147A1
Publication date: 2023-07-13
Application No.: US17647477
Filing date: 2022-01-10
Applicant: SAP SE
CPC classification: G06F16/3347, G06F16/3346, G06F16/325, G06F16/35
Abstract: Methods, systems, and computer-readable storage media for receiving a set of inference results generated by an ML model, the inference results including a set of query entities and a set of target entities, each query entity having one or more target entities matched thereto by the ML model, processing the set of inference results to generate a set of matched sub-sets of target entities by executing a search over target entities in the set of target entities based on constraints, for each problem in a set of problems, providing the problem as a tuple including an index value representative of a target entity in the set of target entities and a value associated with the query entity, the value including a constraint relative to the query entity, and executing at least one task in response to one or more matched sub-sets in the set of matched sub-sets.
-
Publication No.: US20240177053A1
Publication date: 2024-05-30
Application No.: US18070598
Filing date: 2022-11-29
Applicant: SAP SE
IPC classification: G06N20/00
CPC classification: G06N20/00
Abstract: Methods, systems, and computer-readable storage media for receiving query data representative of query entities and target data representative of target entities, determining, by an attention ML model, a set of character-level embeddings, providing, by a sub-word-level tokenizer, a set of sub-word-level tokens, each sub-word-level token including a string of multiple characters, generating, by the attention ML model, a set of sub-word-level embeddings based on the set of sub-word-level tokens, providing, by the attention ML model, at least one attention matrix including attention scores, each attention score representative of a relative importance of a respective sub-word-level token in a predicted match, the predicted match including a match between a query entity and a target entity, and outputting an explanation based on the at least one attention matrix.
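The last step of this abstract, turning an attention matrix into an explanation, can be sketched in a few lines. The sketch assumes a square attention matrix over one token sequence, where `attention[i][j]` is how strongly token `i` attends to token `j`, and scores each token by the average attention it receives. That aggregation rule and the function names are illustrative assumptions; the abstract does not say how the explanation is derived from the matrix.

```python
def token_importance(tokens, attention):
    """Average attention received by each token (column mean of the matrix)."""
    n = len(tokens)
    if len(attention) != n or any(len(row) != n for row in attention):
        raise ValueError("attention must be an n x n matrix over the tokens")
    return [sum(attention[i][j] for i in range(n)) / n for j in range(n)]


def explain(tokens, attention, k=2):
    """Return the k tokens with the highest importance, most important first."""
    scores = token_importance(tokens, attention)
    ranked = sorted(zip(tokens, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    tokens = ["inv", "##oice", "123"]  # hypothetical sub-word tokens
    attention = [
        [0.2, 0.1, 0.7],
        [0.1, 0.2, 0.7],
        [0.3, 0.3, 0.4],
    ]
    for token, score in explain(tokens, attention, k=1):
        print(f"{token}: {score:.2f}")
```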
-
6.
Publication No.: US20230334070A1
Publication date: 2023-10-19
Application No.: US17723586
Filing date: 2022-04-19
Applicant: SAP SE
IPC classification: G06F16/31, G06F16/332, G06F16/33, G06F40/284
CPC classification: G06F16/322, G06F16/332, G06F16/3334, G06F40/284
Abstract: Methods, systems, and computer-readable storage media for an ML system that reduces the number of target items considered as potential matches to a query item using token embeddings and a search tree.
-
Publication No.: US20230214456A1
Publication date: 2023-07-06
Application No.: US17646886
Filing date: 2022-01-04
Applicant: SAP SE
Inventors: Sundeep Gullapudi, Rajesh Vellore Arumugam, Anantharaman Ravi, Prawira Putra Fadjar, Wei Xia
IPC classification: G06K9/62
CPC classification: G06K9/6265
Abstract: Methods, systems, and computer-readable storage media for receiving a first set of predictions generated by an ML model during execution of a training pipeline to train the ML model, each prediction in the first set of predictions being associated with a confidence, determining a set of confidence bins based on confidences of the first set of predictions, for each confidence bin in the set of confidence bins, providing an accuracy, processing the set of confidence bins and accuracies through a regression model to provide one or more regressions, each regression representing a confidence-to-accuracy relationship, defining a set of confidence thresholds based on at least one regression of the one or more regressions, and during an inference phase, applying the set of confidence thresholds to selectively filter predictions from a second set of predictions generated by the ML model.
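The pipeline in this abstract (bin by confidence, measure per-bin accuracy, regress accuracy on confidence, derive a threshold, filter at inference) can be sketched with the standard library alone. The sketch assumes equal-width bins, a single ordinary least-squares line as the regression model, and confidences in [0, 1]; none of these choices is stated in the abstract, and all function names are hypothetical.

```python
from statistics import mean


def bin_accuracies(confidences, correct, n_bins=5):
    """Group predictions into equal-width confidence bins.

    Returns (bin_center, accuracy) pairs for the non-empty bins.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append(1.0 if ok else 0.0)
    return [((i + 0.5) / n_bins, mean(m)) for i, m in enumerate(bins) if m]


def fit_line(points):
    """Least-squares fit of accuracy = slope * confidence + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct confidence bins")
    slope = sum((x - x_bar) * (y - y_bar) for x, y in points) / sxx
    return slope, y_bar - slope * x_bar


def threshold_for(target_accuracy, slope, intercept):
    """Smallest confidence whose regressed accuracy reaches the target, clamped to [0, 1]."""
    if slope <= 0:
        return 1.0  # confidence carries no signal: accept only full confidence
    return min(1.0, max(0.0, (target_accuracy - intercept) / slope))


def filter_predictions(predictions, threshold):
    """Keep (label, confidence) predictions at or above the threshold."""
    return [p for p in predictions if p[1] >= threshold]


if __name__ == "__main__":
    points = bin_accuracies([0.1, 0.15, 0.9, 0.95], [False, True, True, True])
    slope, intercept = fit_line(points)
    cutoff = threshold_for(0.9, slope, intercept)
    print(filter_predictions([("a", 0.9), ("b", 0.5)], cutoff))
```

The abstract speaks of a set of thresholds; calling `threshold_for` once per target accuracy would yield such a set under the same assumptions.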
-
Publication No.: US20230128485A1
Publication date: 2023-04-27
Application No.: US17452441
Filing date: 2021-10-27
Applicant: SAP SE
IPC classification: G06N20/00
Abstract: Methods, systems, and computer-readable storage media for receiving IRF data sets, the IRF data sets including a set of records including inference results determined by the ML model during production use of the ML model and at least one correction to an inference result, executing incremental training of the ML model to provide an updated ML model by selectively filtering one or more records of the set of records to adjust a negative sample to positive sample proportion of a sub-set of records based on a negative sample to positive sample proportion of initial training of the ML model, for each record in the sub-set of records, determining a weight, and during incremental training, applying the weight of a respective record in a loss function in determining an accuracy of the ML model based on the respective record, and deploying the updated ML model for production use.
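The record filtering and weighting in this abstract can be sketched as follows. The sketch assumes binary labels, random down-sampling of surplus negatives, a simple rule that user-corrected records weigh more, and a weighted binary cross-entropy loss. These are illustrative choices only: the abstract does not state how records are filtered, how weights are determined, or which loss function is used, and the record fields `label` and `corrected` are hypothetical.

```python
import math
import random


def rebalance(records, initial_neg_per_pos, seed=0):
    """Drop surplus negatives so that negatives per positive does not
    exceed the proportion used in initial training."""
    positives = [r for r in records if r["label"] == 1]
    negatives = [r for r in records if r["label"] == 0]
    keep = min(len(negatives), round(initial_neg_per_pos * len(positives)))
    rng = random.Random(seed)  # seeded for reproducibility
    return positives + rng.sample(negatives, keep)


def record_weight(record, correction_weight=2.0):
    """Hypothetical weighting rule: corrected records count more."""
    return correction_weight if record["corrected"] else 1.0


def weighted_log_loss(records, predict):
    """Weighted binary cross-entropy; `predict` maps a record to P(label == 1)."""
    total, weight_sum = 0.0, 0.0
    for r in records:
        p = min(max(predict(r), 1e-12), 1 - 1e-12)  # avoid log(0)
        w = record_weight(r)
        loss = -(r["label"] * math.log(p) + (1 - r["label"]) * math.log(1 - p))
        total += w * loss
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0


if __name__ == "__main__":
    feedback = [{"label": 1, "corrected": True}] * 2 + [
        {"label": 0, "corrected": False}
    ] * 10
    subset = rebalance(feedback, initial_neg_per_pos=3.0)
    print(len(subset), weighted_log_loss(subset, lambda r: 0.5))
```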