DETERMINING CONFIDENT DATA SAMPLES FOR MACHINE LEARNING MODELS ON UNSEEN DATA

    Publication (announcement) No.: US20200349434A1

    Publication (announcement) date: 2020-11-05

    Application No.: US16934650

    Filing date: 2020-07-21

    Abstract: Techniques are provided for determining confident data samples for machine learning (ML) models on unseen data. In one embodiment, a method is provided that comprises extracting, by a system comprising a processor, a feature vector for a data sample based on projection of the data sample onto a standard feature space. The method further comprises processing, by the system, the feature vector using an outlier detection model to determine whether the data sample is within a scope of a training dataset used to train a machine learning model, wherein the outlier detection model was trained using features extracted from the training dataset based on projection of data samples included in the training dataset onto the standard feature space.

    Determining confident data samples for machine learning models on unseen data

    Publication (announcement) No.: US11593650B2

    Publication (announcement) date: 2023-02-28

    Application No.: US16934650

    Filing date: 2020-07-21

    Abstract: Techniques are provided for determining confident data samples for machine learning (ML) models on unseen data. In one embodiment, a method is provided that comprises extracting, by a system comprising a processor, a feature vector for a data sample based on projection of the data sample onto a standard feature space. The method further comprises processing, by the system, the feature vector using an outlier detection model to determine whether the data sample is within a scope of a training dataset used to train a machine learning model, wherein the outlier detection model was trained using features extracted from the training dataset based on projection of data samples included in the training dataset onto the standard feature space.
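
The two steps named in the abstract (projecting a sample onto a shared feature space, then asking an outlier detector whether the resulting feature vector falls within the scope of the training data) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the fixed `BASIS` matrix stands in for the "standard feature space", a standardized distance to the training centroid stands in for the "outlier detection model", and the function names and toy data are hypothetical.

```python
import math
from statistics import mean, pstdev

# Hypothetical fixed basis standing in for the "standard feature space".
# The abstract does not say how this space is built.
BASIS = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0],
]


def extract_features(sample, basis=BASIS):
    """Project a raw sample onto each basis vector (one dot product per row)."""
    return [sum(s * b for s, b in zip(sample, row)) for row in basis]


def _distance(feat, mu, sigma):
    """Euclidean distance to the centroid after per-dimension standardization."""
    return math.sqrt(sum(((x - m) / s) ** 2 for x, m, s in zip(feat, mu, sigma)))


def fit_outlier_model(train_samples, basis=BASIS):
    """Fit a simple stand-in outlier detector on projected training features."""
    feats = [extract_features(s, basis) for s in train_samples]
    cols = list(zip(*feats))
    mu = [mean(c) for c in cols]
    # Guard against zero variance in a feature dimension.
    sigma = [pstdev(c) or 1.0 for c in cols]
    # Assumed rule: the farthest training sample defines the scope boundary.
    threshold = max(_distance(f, mu, sigma) for f in feats)
    return {"mu": mu, "sigma": sigma, "threshold": threshold}


def is_in_scope(sample, model, basis=BASIS):
    """Return True if the sample's features lie within the training scope."""
    feat = extract_features(sample, basis)
    return _distance(feat, model["mu"], model["sigma"]) <= model["threshold"]


# Toy training set (hypothetical values).
train = [[1.0, 2.0, 0.5], [1.2, 1.8, 0.7], [0.9, 2.1, 0.4], [1.1, 2.0, 0.6]]
model = fit_outlier_model(train)

print(is_in_scope([1.05, 1.9, 0.6], model))  # True: close to the training data
print(is_in_scope([9.0, -4.0, 3.0], model))  # False: far outside the training scope
```

The key point the sketch tries to capture is that the training samples and the unseen sample pass through the same projection before the scope check, so the detector compares like with like. A real system would likely use a learned feature extractor and a stronger detector than this distance threshold.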
