METHOD AND SYSTEM FOR DETECTING ANOMALIES IN DATA LABELS

    Publication number: US20190205794A1

    Publication date: 2019-07-04

    Application number: US15858001

    Filing date: 2017-12-29

    Applicant: Oath Inc.

    CPC classification number: G06N20/00 G06F16/285 G06N5/022

    Abstract: The present teaching relates to a method and system for validating labels of training data. A first group of data records associated with the training data is received, wherein each of the first group of data records includes a vector having at least one feature and a first label. For each of the first group of data records, a second label is determined based on the at least one feature in accordance with a first model. Thereafter, a loss based on the first label associated with the data record and the second label is obtained, and the data record is classified as having an incorrect first label when the loss meets a pre-determined criterion. Upon classifying the data records, a sub-group of the first group of data records is generated, wherein each of the data records included in the sub-group has the incorrect first label.
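The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the binary labels, the use of log loss, the threshold test standing in for the "pre-determined criterion", and the names `log_loss`, `flag_suspect_labels`, and `toy_model` are all assumptions made for illustration. The "second label" is represented here as the model's predicted probability of class 1.

```python
import math
from typing import Callable, List, Sequence, Tuple

# A data record: (feature vector, first label). Binary labels are an assumption.
Record = Tuple[Sequence[float], int]


def log_loss(first_label: int, predicted_prob: float, eps: float = 1e-12) -> float:
    """Log loss between the given first label and the model's predicted probability."""
    p = min(max(predicted_prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(first_label * math.log(p) + (1 - first_label) * math.log(1.0 - p))


def flag_suspect_labels(
    records: List[Record],
    model: Callable[[Sequence[float]], float],
    threshold: float,
) -> List[Record]:
    """Return the sub-group of records whose loss exceeds the threshold.

    The threshold comparison is a hypothetical stand-in for the
    pre-determined criterion mentioned in the abstract.
    """
    suspects = []
    for features, first_label in records:
        second_label = model(features)             # label derived from the features
        loss = log_loss(first_label, second_label)  # disagreement between the labels
        if loss > threshold:
            suspects.append((features, first_label))
    return suspects


def toy_model(features: Sequence[float]) -> float:
    """Hypothetical first model: a logistic function of the first feature."""
    return 1.0 / (1.0 + math.exp(-features[0]))


records = [([3.0], 1), ([-3.0], 0), ([3.0], 0), ([-2.5], 1)]
suspects = flag_suspect_labels(records, toy_model, threshold=1.0)
print(suspects)
```

With this toy data, the first two records agree with the model and have a low loss, while the last two carry labels that contradict the model's prediction and end up in the flagged sub-group.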

    Method and system for detecting anomalies in data labels

    Publication number: US11238365B2

    Publication date: 2022-02-01

    Application number: US15858001

    Filing date: 2017-12-29

    Applicant: Oath Inc.

    Abstract: The present teaching relates to a method and system for validating labels of training data. A first group of data records associated with the training data is received, wherein each of the first group of data records includes a vector having at least one feature and a first label. For each of the first group of data records, a second label is determined based on the at least one feature in accordance with a first model. Thereafter, a loss based on the first label associated with the data record and the second label is obtained, and the data record is classified as having an incorrect first label when the loss meets a pre-determined criterion. Upon classifying the data records, a sub-group of the first group of data records is generated, wherein each of the data records included in the sub-group has the incorrect first label.
