OPTIMIZED LATENT MISSING FEATURE DETECTION FOR MACHINE LEARNING MODELS

    Publication No.: US20240265304A1

    Publication Date: 2024-08-08

    Application No.: US18336538

    Filing Date: 2023-06-16

    Applicant: Optum, Inc.

    IPC Classification: G06N20/00

    CPC Classification: G06N20/00

    Abstract: Various embodiments of the present disclosure provide techniques for optimally augmenting a training dataset for a machine learning model based on multiple model-focused predictions. The techniques may include generating a datapoint priority matrix that corresponds to a plurality of entity-feature value pairs of a training dataset for a machine learning model, generating a plurality of impact predictions and feature sensitivity predictions for the plurality of entity-feature value pairs, generating a refined datapoint priority matrix by updating the datapoint priority matrix based on the plurality of impact predictions and feature sensitivity predictions, and providing a datapoint collection output for the training dataset based on the refined datapoint priority matrix and a data augmentation threshold.
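
The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the claimed method: the abstract does not say how the priority matrix is initialized, how it is updated, or how the threshold is applied. The sketch assumes that the initial priority is 1 for a missing entity-feature value and 0 for an observed one, that the update is an elementwise product with the impact and feature sensitivity predictions, and that the data augmentation threshold is a minimum refined priority. The function names `refine_priority_matrix` and `datapoint_collection_output` and all numeric values are hypothetical.

```python
import numpy as np


def refine_priority_matrix(priority, impact, sensitivity):
    """Update the priority matrix with the two prediction matrices.

    Assumption: an elementwise product. The abstract only says the matrix
    is updated "based on" the impact and sensitivity predictions.
    """
    return priority * impact * sensitivity


def datapoint_collection_output(refined, threshold):
    """List (entity, feature) index pairs to collect, highest priority first.

    Assumption: the data augmentation threshold is a minimum refined priority.
    """
    rows, cols = np.nonzero(refined >= threshold)
    order = np.argsort(-refined[rows, cols], kind="stable")
    return [(int(rows[k]), int(cols[k])) for k in order]


# Rows are entities, columns are features. All values are made up.
missing = np.array([[True, False, True],
                    [False, True, False]])
priority = missing.astype(float)  # assumed initialization: 1 where missing

impact = np.array([[0.9, 0.5, 0.2],
                   [0.4, 0.8, 0.1]])
sensitivity = np.array([[0.5, 0.5, 1.0],
                        [1.0, 0.75, 0.5]])

refined = refine_priority_matrix(priority, impact, sensitivity)
to_collect = datapoint_collection_output(refined, threshold=0.3)
print(to_collect)
```

Under these assumptions, observed values get a refined priority of 0 and are never selected, so the output lists only missing entity-feature pairs whose predicted impact and sensitivity are jointly high enough to justify collecting them.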