-
1.
Publication number: US20240045890A1
Publication date: 2024-02-08
Application number: US17817388
Filing date: 2022-08-04
Applicant: SAP SE
Inventor: Hoang-Vu Nguyen, Li Rong Wang, Matthias Frank, Rajesh Vellore Arumugam, Stefan Klaus Baur, Sundeep Gullapudi
IPC: G06F16/28, G06N20/00, G06F16/2457, G06K9/62
CPC classification number: G06F16/288, G06N20/00, G06F16/24578, G06K9/6215
Abstract: Methods, systems, and computer-readable storage media for a machine learning (ML) system for matching a query entity to one or more target entities, the ML system reducing the number of query-target entity pairs under consideration as potential matches during inference.
-
Publication number: US20250068965A1
Publication date: 2025-02-27
Application number: US18455775
Filing date: 2023-08-25
Applicant: SAP SE
Inventor: Matthias Frank, Sundeep Gullapudi, Rajesh Vellore Arumugam, Anantharaman Ravi, Prawira Putra Fadjar, Yi Quan Zhou
Abstract: Methods, systems, and computer-readable storage media for receiving a real data table, providing a synthetic structured table based on the real data table, providing a sampled data table comprising a sub-set of real data of the real data table, transmitting a prompt to a LLM system, the prompt being generated based on the real data table and the synthetic structured data table, receiving synthetic unstructured data from the LLM system, providing an aggregate synthetic table that includes at least a portion of the synthetic unstructured data, and training a ML model using the aggregate synthetic table.
-
3.
Publication number: US20250077773A1
Publication date: 2025-03-06
Application number: US18358225
Filing date: 2023-07-25
Applicant: SAP SE
Inventor: Rajesh Vellore Arumugam, Anantharaman Ravi, Matthias Frank, Sundeep Gullapudi, Yi Quan Zhou
IPC: G06F40/284, G06F16/248, G06F40/40
Abstract: Methods, systems, and computer-readable storage media for receiving, by an entity matching ML model, a query and target pair including a query entity and a target entity, providing, by the entity matching ML model, a query-target prediction by processing the query entity and the target entity, the query-target prediction indicating a match type between the query entity and the target entity, generating a prompt by populating a prompt template with at least a portion of the query-target prediction, inputting the prompt into a large language model (LLM), and receiving, from the LLM, an explanation that is responsive to the prompt and that describes one or more reasons for the query-target prediction output by the entity matching ML model.
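The prompt-generation step in this abstract (populating a prompt template with part of the query-target prediction) might look roughly like the following. This is a minimal sketch, not the method claimed in the application: the template wording, the function name `build_prompt`, and the field names are all hypothetical, and the call to the LLM itself is left out.

```python
from string import Template

# Hypothetical template; the application does not disclose its wording here.
PROMPT_TEMPLATE = Template(
    "An entity matching model compared a query entity with a target entity.\n"
    "Query entity: $query\n"
    "Target entity: $target\n"
    "Predicted match type: $match_type\n"
    "Describe one or more reasons the model may have made this prediction."
)


def build_prompt(query: str, target: str, match_type: str) -> str:
    """Populate the template with a portion of the query-target prediction."""
    return PROMPT_TEMPLATE.substitute(
        query=query, target=target, match_type=match_type
    )


if __name__ == "__main__":
    # Illustrative values only.
    print(build_prompt("Invoice 4711, ACME Corp", "Payment ACME 4711", "single match"))
```

The resulting string would then be sent to whichever LLM is in use, and the response treated as the explanation.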
-
Publication number: US20230229961A1
Publication date: 2023-07-20
Application number: US17646889
Filing date: 2022-01-04
Applicant: SAP SE
Inventor: Sundeep Gullapudi, Anantharaman Ravi, Denny Jee King Gee, Yi Qing Isaac New
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: Methods, systems, and computer-readable storage media for providing a set of heuristics representative of training data that is to be used to process a ML model through a training pipeline, the training pipeline including multiple phases, determining a set of time estimates by providing the set of heuristics as input to a training heuristics model that provides the set of time estimates as output, each time estimate in the set of time estimates indicating an estimated duration of a respective phase of the training pipeline, receiving, during processing of the ML model through the training pipeline, progress data representative of a progress of processing of the ML model, determining a set of status estimates including a status estimate for each phase of the training pipeline based on the progress data, and transmitting the set of time estimates and the set of status estimates for display.
-
Publication number: US20230153382A1
Publication date: 2023-05-18
Application number: US17455046
Filing date: 2021-11-16
Applicant: SAP SE
Inventor: Sundeep Gullapudi
CPC classification number: G06K9/6256, G06N7/005, G06K9/6262
Abstract: Methods, systems, and computer-readable storage media for determining a set of potential probability thresholds based on a set of inference results provided by processing testing data through the ML model, for each potential probability threshold in the set of potential probability thresholds, determining an accuracy, selecting a probability threshold from the set of potential probability thresholds, processing an inference job including sets of entity pairs through the ML model to assign a label to each entity pair in the sets of entity pairs, each label being associated with a probability and including a type of multiple types, and for each entity pair having a label of one or more specified types, selectively removing an entity of the entity pair from further processing of the inference job by the ML model based on whether the probability associated with the label meets or exceeds the probability threshold.
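The threshold selection and selective removal described in this abstract can be illustrated with a small sketch. It rests on several assumptions that are not in the source: accuracy is measured only over predictions at or above the candidate threshold, the lowest candidate that reaches a target accuracy is selected, and each labeled pair is a dict with hypothetical keys `query_id`, `label`, and `prob`.

```python
from typing import Iterable, Optional, Sequence


def accuracy_at(threshold: float, probs: Sequence[float], correct: Sequence[bool]) -> float:
    """Accuracy over test predictions whose probability meets or exceeds the threshold."""
    kept = [ok for p, ok in zip(probs, correct) if p >= threshold]
    return sum(kept) / len(kept) if kept else 0.0


def select_threshold(
    candidates: Iterable[float],
    probs: Sequence[float],
    correct: Sequence[bool],
    target_accuracy: float,
) -> Optional[float]:
    """Assumed selection rule: lowest candidate whose accuracy reaches the target."""
    for t in sorted(candidates):
        if accuracy_at(t, probs, correct) >= target_accuracy:
            return t
    return None  # no candidate is accurate enough


def settled_queries(labeled_pairs, threshold: float, final_types: set) -> set:
    """Query ids that could be removed from further processing of the inference job."""
    return {
        pair["query_id"]
        for pair in labeled_pairs
        if pair["label"] in final_types and pair["prob"] >= threshold
    }
```

A lower threshold removes more entities early and saves inference work, at the cost of accepting less certain labels; the target accuracy is the knob that trades one against the other in this sketch.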
-
6.
Publication number: US12277148B2
Publication date: 2025-04-15
Application number: US17723586
Filing date: 2022-04-19
Applicant: SAP SE
Inventor: Sundeep Gullapudi, Rajesh Vellore Arumugam, Matthias Frank, Wei Xia
IPC: G06F16/31, G06F16/33, G06F16/332, G06F16/3332, G06F40/284
Abstract: Methods, systems, and computer-readable storage media for a ML system that reduces a number of target items from consideration as potential matches to a query item using token embeddings and a search tree.
-
Publication number: US20250036974A1
Publication date: 2025-01-30
Application number: US18358245
Filing date: 2023-07-25
Applicant: SAP SE
Inventor: Rajesh Vellore Arumugam, Anantharaman Ravi, Isaac New Yi Qing, Sundeep Gullapudi, Yi Quan Zhou
Abstract: Methods, systems, and computer-readable storage media for providing, for a set of ML models, a set of training metrics determined using test data during a training phase, providing, for a production-use ML model, a set of inference metrics based on predictions generated by the production-use ML model, generating, by a prompt generator, a set of few-shot examples using the set of training metrics and the set of inference metrics, inputting, by the prompt generator, the set of few-shot examples to a LLM as prompts, transmitting, to the LLM, a query, displaying, to a user, a recommendation that is received from the LLM and responsive to the query, receiving input from a user indicating a user-selected ML model responsive to the recommendation, and deploying a user-selected ML model to an inference runtime for production use.
-
Publication number: US20240177053A1
Publication date: 2024-05-30
Application number: US18070598
Filing date: 2022-11-29
Applicant: SAP SE
Inventor: Sundeep Gullapudi, Rajesh Vellore Arumugam, Abhinandan Padhi
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: Methods, systems, and computer-readable storage media for receiving query data representative of query entities and target data representative of target entities, determining, by an attention ML model, a set of character-level embeddings, providing, by a sub-word-level tokenizer, a set of sub-word-level tokens, each sub-word-level token including a string of multiple characters, generating, by the attention ML model, a set of sub-word-level embeddings based on the set of sub-word-level tokens, providing, by the attention ML model, at least one attention matrix including attention scores, each attention score representative of a relative importance of a respective sub-word-level token in a predicted match, the predicted match including a match between a query entity and a target entity, and outputting an explanation based on the at least one attention matrix.
-
9.
Publication number: US20230334070A1
Publication date: 2023-10-19
Application number: US17723586
Filing date: 2022-04-19
Applicant: SAP SE
Inventor: Sundeep Gullapudi, Rajesh Vellore Arumugam, Matthias Frank, Wei Xia
IPC: G06F16/31, G06F16/332, G06F16/33, G06F40/284
CPC classification number: G06F16/322, G06F16/332, G06F16/3334, G06F40/284
Abstract: Methods, systems, and computer-readable storage media for a ML system that reduces a number of target items from consideration as potential matches to a query item using token embeddings and a search tree.
-
Publication number: US20230214456A1
Publication date: 2023-07-06
Application number: US17646886
Filing date: 2022-01-04
Applicant: SAP SE
Inventor: Sundeep Gullapudi, Rajesh Vellore Arumugam, Anantharaman Ravi, Prawira Putra Fadjar, Wei Xia
IPC: G06K9/62
CPC classification number: G06K9/6265
Abstract: Methods, systems, and computer-readable storage media for receiving a first set of predictions generated by a ML model during execution of a training pipeline to train the ML model, each prediction in the first set of predictions being associated with a confidence, determining a set of confidence bins based on confidences of the first set of predictions, for each confidence bin in the set of confidence bins, providing an accuracy, processing the set of confidence bins and accuracies through a regression model to provide one or more regressions, each regression representing a confidence-to-accuracy relationship, defining a set of confidence thresholds based on at least one regression of the one or more regressions, and during an inference phase, applying the set of confidence thresholds to selectively filter predictions from a second set of predictions generated by the ML model.
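The confidence-to-accuracy pipeline in this abstract (bin the confidences, compute an accuracy per bin, fit a regression, derive a threshold) can be sketched as follows. The abstract does not say how bins are formed or which regression is used, so equal-width bins, an ordinary least-squares line, and all function names here are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def bin_accuracies(
    confidences: Sequence[float], correct: Sequence[bool], n_bins: int = 5
) -> List[Tuple[float, float]]:
    """Return (bin midpoint, accuracy) for every non-empty equal-width bin on [0, 1]."""
    totals = [0] * n_bins
    hits = [0] * n_bins
    for conf, ok in zip(confidences, correct):
        i = min(int(conf * n_bins), n_bins - 1)  # a confidence of 1.0 falls in the last bin
        totals[i] += 1
        hits[i] += int(ok)
    return [((i + 0.5) / n_bins, hits[i] / totals[i]) for i in range(n_bins) if totals[i]]


def fit_line(points: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Least-squares fit accuracy = slope * confidence + intercept (needs 2+ distinct bins)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def threshold_for(target_accuracy: float, slope: float, intercept: float) -> float:
    """Confidence at which the fitted line reaches the target accuracy (slope must be non-zero)."""
    return (target_accuracy - intercept) / slope
```

During inference, predictions whose confidence falls below the value returned by `threshold_for` would be the ones filtered out in this sketch.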