-
1.
Publication No.: US20240045890A1
Publication Date: 2024-02-08
Application No.: US17817388
Filing Date: 2022-08-04
Applicant: SAP SE
Inventors: Hoang-Vu Nguyen, Li Rong Wang, Matthias Frank, Rajesh Vellore Arumugam, Stefan Klaus Baur, Sundeep Gullapudi
IPC Classification: G06F16/28, G06N20/00, G06F16/2457, G06K9/62
CPC Classification: G06F16/288, G06N20/00, G06F16/24578, G06K9/6215
Abstract: Methods, systems, and computer-readable storage media for a machine learning (ML) system for matching a query entity to one or more target entities, the ML system reducing the number of query-target entity pairs considered as potential matches during inference.
-
2.
Publication No.: US20230229961A1
Publication Date: 2023-07-20
Application No.: US17646889
Filing Date: 2022-01-04
Applicant: SAP SE
IPC Classification: G06N20/00
CPC Classification: G06N20/00
Abstract: Methods, systems, and computer-readable storage media for providing a set of heuristics representative of training data that is to be used to process an ML model through a training pipeline, the training pipeline including multiple phases, determining a set of time estimates by providing the set of heuristics as input to a training heuristics model that provides the set of time estimates as output, each time estimate in the set of time estimates indicating an estimated duration of a respective phase of the training pipeline, receiving, during processing of the ML model through the training pipeline, progress data representative of a progress of processing of the ML model, determining a set of status estimates including a status estimate for each phase of the training pipeline based on the progress data, and transmitting the set of time estimates and the set of status estimates for display.
-
3.
Publication No.: US20230153382A1
Publication Date: 2023-05-18
Application No.: US17455046
Filing Date: 2021-11-16
Applicant: SAP SE
Inventor: Sundeep Gullapudi
CPC Classification: G06K9/6256, G06N7/005, G06K9/6262
Abstract: Methods, systems, and computer-readable storage media for determining a set of potential probability thresholds based on a set of inference results provided by processing testing data through the ML model, for each potential probability threshold in the set of potential probability thresholds, determining an accuracy, selecting a probability threshold from the set of potential probability thresholds, processing an inference job including sets of entity pairs through the ML model to assign a label to each entity pair in the sets of entity pairs, each label being associated with a probability and including a type of multiple types, and for each entity pair having a label of one or more specified types, selectively removing an entity of the entity pair from further processing of the inference job by the ML model based on whether the probability associated with the label meets or exceeds the probability threshold.
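Illustrative sketch (not taken from the patent): the threshold selection and entity removal described in this abstract could be approximated in Python as shown below. The function names, the tuple layouts, and the selection rule (lowest candidate threshold that reaches a target accuracy) are assumptions made for this example; the abstract does not state how the threshold is chosen.

```python
from typing import Iterable, List, Optional, Set, Tuple

# Hypothetical layout: (probability, prediction_was_correct) per test result.
Result = Tuple[float, bool]
# Hypothetical layout: (query_id, target_id, label_type, probability).
LabelledPair = Tuple[str, str, str, float]


def accuracy_at(threshold: float, results: List[Result]) -> Optional[float]:
    """Accuracy over test predictions whose probability meets the threshold."""
    kept = [ok for prob, ok in results if prob >= threshold]
    if not kept:
        return None  # no predictions survive this threshold
    return sum(kept) / len(kept)


def select_threshold(results: List[Result],
                     candidates: Iterable[float],
                     target_accuracy: float) -> Optional[float]:
    """Assumed rule: pick the lowest candidate that reaches the target accuracy."""
    for t in sorted(candidates):
        acc = accuracy_at(t, results)
        if acc is not None and acc >= target_accuracy:
            return t
    return None


def entities_to_remove(pairs: List[LabelledPair],
                       threshold: float,
                       removable_types: Set[str]) -> Set[str]:
    """Query entities whose label type is removable and whose probability
    meets or exceeds the threshold; these skip further inference."""
    return {q for q, _t, label, prob in pairs
            if label in removable_types and prob >= threshold}
```

With test results `[(0.95, True), (0.9, True), (0.8, False), (0.7, True), (0.6, False)]`, candidates `[0.5, 0.75, 0.85]`, and a target accuracy of 0.9, this sketch selects 0.85, because only at that threshold are all surviving predictions correct.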
-
4.
Publication No.: US20240177053A1
Publication Date: 2024-05-30
Application No.: US18070598
Filing Date: 2022-11-29
Applicant: SAP SE
IPC Classification: G06N20/00
CPC Classification: G06N20/00
Abstract: Methods, systems, and computer-readable storage media for receiving query data representative of query entities and target data representative of target entities, determining, by an attention ML model, a set of character-level embeddings, providing, by a sub-word-level tokenizer, a set of sub-word-level tokens, each sub-word-level token including a string of multiple characters, generating, by the attention ML model, a set of sub-word-level embeddings based on the set of sub-word-level tokens, providing, by the attention ML model, at least one attention matrix including attention scores, each attention score representative of a relative importance of a respective sub-word-level token in a predicted match, the predicted match including a match between a query entity and a target entity, and outputting an explanation based on the at least one attention matrix.
-
5.
Publication No.: US20230334070A1
Publication Date: 2023-10-19
Application No.: US17723586
Filing Date: 2022-04-19
Applicant: SAP SE
IPC Classification: G06F16/31, G06F16/332, G06F16/33, G06F40/284
CPC Classification: G06F16/322, G06F16/332, G06F16/3334, G06F40/284
Abstract: Methods, systems, and computer-readable storage media for an ML system that reduces a number of target items from consideration as potential matches to a query item using token embeddings and a search tree.
-
6.
Publication No.: US20230214456A1
Publication Date: 2023-07-06
Application No.: US17646886
Filing Date: 2022-01-04
Applicant: SAP SE
Inventors: Sundeep Gullapudi, Rajesh Vellore Arumugam, Anantharaman Ravi, Prawira Putra Fadjar, Wei Xia
IPC Classification: G06K9/62
CPC Classification: G06K9/6265
Abstract: Methods, systems, and computer-readable storage media for receiving a first set of predictions generated by an ML model during execution of a training pipeline to train the ML model, each prediction in the first set of predictions being associated with a confidence, determining a set of confidence bins based on confidences of the first set of predictions, for each confidence bin in the set of confidence bins, providing an accuracy, processing the set of confidence bins and accuracies through a regression model to provide one or more regressions, each regression representing a confidence-to-accuracy relationship, defining a set of confidence thresholds based on at least one regression of the one or more regressions, and during an inference phase, applying the set of confidence thresholds to selectively filter predictions from a second set of predictions generated by the ML model.
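Illustrative sketch (not taken from the patent): the binning, regression, and threshold steps named in this abstract could be approximated in Python as shown below. Equal-width bins, an ordinary least-squares line as the regression model, and all function names are assumptions made for this example; the abstract does not specify any of them.

```python
from typing import List, Tuple


def bin_accuracies(preds: List[Tuple[float, bool]],
                   n_bins: int = 5) -> List[Tuple[float, float]]:
    """Group (confidence, is_correct) predictions into equal-width bins and
    return (bin_midpoint, accuracy) for each non-empty bin."""
    bins: List[List[bool]] = [[] for _ in range(n_bins)]
    for conf, ok in preds:
        idx = min(int(conf * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[idx].append(ok)
    return [((i + 0.5) / n_bins, sum(b) / len(b))
            for i, b in enumerate(bins) if b]


def fit_line(points: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Least-squares fit of accuracy = slope * confidence + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("need at least two distinct confidence bins")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def threshold_for(target_accuracy: float, slope: float, intercept: float) -> float:
    """Confidence at which the fitted line reaches the target accuracy,
    clamped to [0, 1]. A non-positive slope yields the strictest threshold."""
    if slope <= 0:
        return 1.0
    return min(1.0, max(0.0, (target_accuracy - intercept) / slope))


def filter_predictions(preds: List[Tuple[str, float]],
                       threshold: float) -> List[Tuple[str, float]]:
    """Inference-time filter: keep (label, confidence) pairs at or above the threshold."""
    return [(label, conf) for label, conf in preds if conf >= threshold]
```

Calling `threshold_for` once per target accuracy (for example 0.9 and 0.99) would give a set of confidence thresholds, which is one possible reading of the "set of confidence thresholds" in the abstract.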
-