Label shift detection and adjustment in predictive modeling

    Publication No.: US11599746B2

    Publication date: 2023-03-07

    Application No.: US16916706

    Application date: 2020-06-30

    Abstract: Techniques for detecting label shift and adjusting training data of predictive models in response are provided. In an embodiment, a first machine-learned model is used to generate a predicted label for each of multiple scoring instances. The first machine-learned model is trained using one or more machine learning techniques based on a plurality of training instances, each of which includes an observed label. In response to detecting a shift in observed labels, for each segment of one or more segments in multiple segments, a portion of training data that corresponds to the segment is identified. For each training instance in a subset of the portion of training data, the training instance is adjusted. The adjusted training instance is added to a final set of training data. The machine learning technique(s) are used to train a second machine-learned model based on the final set of training data.
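The flow the abstract describes (score with a first model, detect a shift in observed labels, adjust training instances segment by segment, retrain) can be sketched roughly as follows. This is a minimal illustration, not the patented method: the abstract does not say how shift is detected or how an instance is adjusted, so the ratio test, the 20% threshold, the label rescaling, the per-segment-mean "model", and every function and field name below are assumptions chosen for the example.

```python
from collections import defaultdict
from statistics import mean


def train(instances):
    """Stand-in for the 'machine learning technique(s)' (assumption):
    the model is simply the mean observed label per segment."""
    by_segment = defaultdict(list)
    for inst in instances:
        by_segment[inst["segment"]].append(inst["label"])
    return {seg: mean(labels) for seg, labels in by_segment.items()}


def predict(model, instance):
    """Predicted label for one scoring instance."""
    return model[instance["segment"]]


def shift_ratios(model, scored, threshold=0.2):
    """Return {segment: observed_mean / predicted_mean} for segments whose
    ratio deviates from 1 by more than `threshold` (assumed shift test).
    Assumes the mean predicted label of each segment is non-zero."""
    observed, predicted = defaultdict(list), defaultdict(list)
    for inst in scored:
        observed[inst["segment"]].append(inst["label"])
        predicted[inst["segment"]].append(predict(model, inst))
    ratios = {}
    for seg in observed:
        ratio = mean(observed[seg]) / mean(predicted[seg])
        if abs(ratio - 1.0) > threshold:
            ratios[seg] = ratio
    return ratios


def adjust_training_data(training, ratios):
    """Build the final training set: instances in shifted segments get their
    label rescaled (assumed adjustment); all others are copied unchanged.
    The abstract adjusts only a subset of each segment's instances; this
    sketch adjusts all of them for brevity."""
    final = []
    for inst in training:
        ratio = ratios.get(inst["segment"])
        if ratio is None:
            final.append(inst)
        else:
            final.append({**inst, "label": inst["label"] * ratio})
    return final


# Toy data: segment "A" labels drop after training, segment "B" is stable.
training = [
    {"segment": "A", "label": 0.10},
    {"segment": "A", "label": 0.30},
    {"segment": "B", "label": 0.50},
    {"segment": "B", "label": 0.70},
]
scored = [
    {"segment": "A", "label": 0.10},
    {"segment": "A", "label": 0.10},
    {"segment": "B", "label": 0.60},
]

first_model = train(training)
ratios = shift_ratios(first_model, scored)
final_training = adjust_training_data(training, ratios)
second_model = train(final_training)
```

In this toy run only segment "A" is flagged, so only its training labels are rescaled before the second model is trained; segment "B" passes through untouched. A real system would replace `train` with an actual learner and would likely use a statistically grounded shift test rather than a fixed ratio threshold.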

    LABEL SHIFT DETECTION AND ADJUSTMENT IN PREDICTIVE MODELING

    Publication No.: US20210406598A1

    Publication date: 2021-12-30

    Application No.: US16916706

    Application date: 2020-06-30

