DYNAMIC CALIBRATION OF CONFIDENCE-ACCURACY MAPPINGS IN ENTITY MATCHING MODELS

    Publication number: US20230214456A1

    Publication date: 2023-07-06

    Application number: US17646886

    Application date: 2022-01-04

    Applicant: SAP SE

    CPC classification number: G06K9/6265

    Abstract: Methods, systems, and computer-readable storage media for receiving a first set of predictions generated by a ML model during execution of a training pipeline to train the ML model, each prediction in the first set of predictions being associated with a confidence, determining a set of confidence bins based on confidences of the first set of predictions, for each confidence bin in the set of confidence bins, providing an accuracy, processing the set of confidence bins and accuracies through a regression model to provide one or more regressions, each regression representing a confidence-to-accuracy relationship, defining a set of confidence thresholds based on at least one regression of the one or more regressions, and during an inference phase, applying the set of confidence thresholds to selectively filter predictions from a second set of predictions generated by the ML model.
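The steps in this abstract (bin the training-time confidences, compute an accuracy per bin, regress accuracy on confidence, derive thresholds, filter at inference) can be illustrated with a minimal Python sketch. It is not the claimed implementation. The equal-width bins, the single least-squares line, the target accuracy, and all function names are assumptions made here for illustration; the abstract does not specify the binning scheme or the regression model.

```python
from statistics import mean


def bin_accuracies(confidences, correct, n_bins=5):
    """Group predictions into equal-width confidence bins (an assumption).

    Returns (bin_center, accuracy) pairs for every non-empty bin.
    """
    width = 1.0 / n_bins
    buckets = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf / width), n_bins - 1)  # conf == 1.0 goes in the last bin
        buckets[idx].append(1.0 if ok else 0.0)
    return [((i + 0.5) * width, mean(b)) for i, b in enumerate(buckets) if b]


def fit_line(points):
    """Ordinary least squares: accuracy ~ slope * confidence + intercept."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct confidence bins")
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return slope, y_bar - slope * x_bar


def threshold_for(target_accuracy, slope, intercept):
    """Invert the fitted line: lowest confidence expected to reach the target."""
    if slope <= 0:
        raise ValueError("accuracy does not increase with confidence")
    return min(1.0, max(0.0, (target_accuracy - intercept) / slope))


def filter_predictions(predictions, threshold):
    """Keep (label, confidence) predictions at or above the threshold."""
    return [p for p in predictions if p[1] >= threshold]
```

A set of thresholds, as the abstract describes, could be obtained by calling the hypothetical `threshold_for` once per target accuracy; a nonlinear regression would need a different inversion step.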

    INCREMENTAL TRAINING FOR REAL-TIME MODEL PERFORMANCE ENHANCEMENT

    Publication number: US20230128485A1

    Publication date: 2023-04-27

    Application number: US17452441

    Application date: 2021-10-27

    Applicant: SAP SE

    Abstract: Methods, systems, and computer-readable storage media for receiving IRF data sets, the IRF data sets including a set of records including inference results determined by an ML model during production use of the ML model and at least one correction to an inference result, executing incremental training of the ML model to provide an updated ML model by selectively filtering one or more records of the set of records to adjust a negative sample to positive sample proportion of a sub-set of records based on a negative sample to positive sample proportion of initial training of the ML model, for each record in the sub-set of records, determining a weight, and during incremental training, applying the weight of a respective record in a loss function in determining an accuracy of the ML model based on the respective record, and deploying the updated ML model for production use.
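The two mechanisms named in this abstract, filtering records to match the negative-to-positive proportion of initial training and weighting each record in the loss, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the claimed method. The record layout (`label`, `corrected`), the random downsampling of negatives, the rule that corrected records get a higher weight, and the use of binary cross-entropy are all hypothetical choices; the abstract does not say how weights are determined or which loss function is used.

```python
import math
import random


def rebalance(records, target_neg_per_pos, seed=0):
    """Downsample negatives so negatives per positive do not exceed the target.

    target_neg_per_pos stands in for the proportion seen in initial training.
    """
    pos = [r for r in records if r["label"] == 1]
    neg = [r for r in records if r["label"] == 0]
    keep = min(len(neg), int(round(target_neg_per_pos * len(pos))))
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    return pos + rng.sample(neg, keep)


def record_weight(record, corrected_weight=2.0):
    """Hypothetical rule: records carrying a user correction count more."""
    return corrected_weight if record["corrected"] else 1.0


def weighted_log_loss(records, predict):
    """Weighted binary cross-entropy over the filtered records.

    predict is any callable mapping a record to a probability of label 1.
    """
    if not records:
        raise ValueError("no records to evaluate")
    eps = 1e-12
    total, weight_sum = 0.0, 0.0
    for r in records:
        p = min(1.0 - eps, max(eps, predict(r)))  # clamp to avoid log(0)
        w = record_weight(r)
        y = r["label"]
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
        weight_sum += w
    return total / weight_sum
```

In a real training loop the per-record weight would typically be passed to the framework's loss function rather than computed by hand as above.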
