OPTIMIZED LATENT MISSING FEATURE DETECTION FOR MACHINE LEARNING MODELS

    Publication No.: US20240265304A1

    Publication date: 2024-08-08

    Application No.: US18336538

    Filing date: 2023-06-16

    Applicant: Optum, Inc.

    IPC classification: G06N20/00

    CPC classification: G06N20/00

    Abstract: Various embodiments of the present disclosure provide techniques for optimally augmenting a training dataset for a machine learning model based on multiple model-focused predictions. The techniques may include generating a datapoint priority matrix that corresponds to a plurality of entity-feature value pairs of a training dataset for a machine learning model, generating a plurality of impact predictions and feature sensitivity predictions for the plurality of entity-feature value pairs, generating a refined datapoint priority matrix by updating the datapoint priority matrix based on the plurality of impact predictions and feature sensitivity predictions, and providing a datapoint collection output for the training dataset based on the refined datapoint priority matrix and a data augmentation threshold.
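
The flow in this abstract (priority matrix, impact and sensitivity predictions, refinement, threshold) can be illustrated with a minimal sketch. The abstract does not say how the predictions are combined or how the threshold is applied, so everything specific here is an assumption made for illustration: the initial priority is taken to be 1 for a missing entity-feature value and 0 otherwise, the refinement is an elementwise product, and the threshold is a minimum refined score. The names `refine_priority` and `collection_output` are hypothetical and do not come from the patent.

```python
import numpy as np

def refine_priority(missing_mask, impact, sensitivity):
    """Return a refined priority matrix (entities x features).

    Assumption: initial priority is 1.0 where a value is missing, 0.0 where
    it is present, and it is updated by multiplying in both predictions.
    """
    priority = np.asarray(missing_mask, dtype=float)
    return priority * np.asarray(impact, dtype=float) * np.asarray(sensitivity, dtype=float)

def collection_output(refined, threshold):
    """List (entity, feature) index pairs whose refined priority meets the
    threshold, highest priority first (assumed meaning of the threshold)."""
    rows, cols = np.nonzero(refined >= threshold)
    order = np.argsort(-refined[rows, cols], kind="stable")
    return [(int(rows[i]), int(cols[i])) for i in order]

# Toy data: 2 entities x 2 features; True marks a missing value.
missing = np.array([[True, False], [True, True]])
impact = np.array([[0.9, 0.5], [0.2, 0.8]])        # illustrative impact predictions
sensitivity = np.array([[1.0, 1.0], [0.5, 0.5]])   # illustrative sensitivity predictions

refined = refine_priority(missing, impact, sensitivity)
to_collect = collection_output(refined, threshold=0.3)
print(to_collect)
```

With these toy numbers, only missing entries whose combined score reaches 0.3 are selected for collection; entries that are already present always score zero under the assumed initial priority.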

DATA IMPUTATION USING AN INTERCONNECTED VARIATIONAL AUTOENCODER MODEL

    Publication No.: US20240169185A1

    Publication date: 2024-05-23

    Application No.: US18446971

    Filing date: 2023-08-09

    Applicant: Optum, Inc.

    IPC classification: G06N3/0455

    CPC classification: G06N3/0455

    Abstract: Embodiments of the present disclosure provide for improved data processing using interconnected variational autoencoder models, which may be used for any of a myriad of purposes. Some embodiments specially train the interconnected variational autoencoder models by utilizing different training scenarios corresponding to the presence and/or absence of particular data in a training data set. Particular encoder(s) and/or decoder(s) from the specially trained interconnected variational autoencoder models may then be utilized to improve the accuracy of the desired data processing tasks, for example, to generate particular output data.