-
Publication No.: US20190205794A1
Publication Date: 2019-07-04
Application No.: US15858001
Filing Date: 2017-12-29
Applicant: Oath Inc.
Inventor: Francis Hsu, Mridul Jain, Saurabh Tewari
CPC classification number: G06N20/00, G06F16/285, G06N5/022
Abstract: The present teaching relates to a method and system for validating labels of training data. A first group of data records associated with the training data is received, wherein each of the first group of data records includes a vector having at least one feature and a first label. For each of the first group of data records, a second label is determined based on the at least one feature in accordance with a first model. Thereafter, a loss is obtained based on the first label associated with the data record and the second label, and the data record is classified as having an incorrect first label when the loss meets a pre-determined criterion. Upon classifying the data records, a sub-group of the first group of data records is generated, wherein each of the data records included in the sub-group has the incorrect first label.
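The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not name the loss function or the criterion, so the cross-entropy loss, the fixed `threshold`, and the names `toy_model` and `find_suspect_records` are all assumptions made purely for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

# A data record: (feature vector, first label). The first label is the
# label supplied with the training data.
Record = Tuple[Sequence[float], int]


def cross_entropy(label: int, probs: Sequence[float], eps: float = 1e-12) -> float:
    """Loss of the model's class probabilities against a given label.

    Assumption: the abstract does not say which loss is used.
    """
    return -math.log(max(probs[label], eps))


def find_suspect_records(
    records: Sequence[Record],
    predict_proba: Callable[[Sequence[float]], Sequence[float]],
    threshold: float,
) -> List[Tuple[Sequence[float], int, int, float]]:
    """Return the sub-group of records whose first label looks incorrect.

    Each returned item is (features, first_label, second_label, loss).
    """
    suspects = []
    for features, first_label in records:
        probs = predict_proba(features)
        # Second label: the class the first model considers most likely.
        second_label = max(range(len(probs)), key=lambda k: probs[k])
        loss = cross_entropy(first_label, probs)
        # Assumed criterion: the loss exceeds a fixed threshold.
        if loss > threshold:
            suspects.append((features, first_label, second_label, loss))
    return suspects


def toy_model(features: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for the first model: a one-feature logistic score."""
    p1 = 1.0 / (1.0 + math.exp(-features[0]))
    return [1.0 - p1, p1]


records: List[Record] = [
    ([3.0], 1),   # label agrees with the model
    ([-3.0], 0),  # label agrees with the model
    ([3.0], 0),   # label disagrees with the model
    ([-2.5], 1),  # label disagrees with the model
]

suspects = find_suspect_records(records, toy_model, threshold=1.0)
for features, first_label, second_label, loss in suspects:
    print(features, first_label, second_label, round(loss, 3))
```

In this sketch a record is flagged only when the model assigns low probability to the supplied label, so the threshold controls how aggressively records are sent for review.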
-
Publication No.: US11238365B2
Publication Date: 2022-02-01
Application No.: US15858001
Filing Date: 2017-12-29
Applicant: Oath Inc.
Inventor: Francis Hsu, Mridul Jain, Saurabh Tewari
Abstract: The present teaching relates to a method and system for validating labels of training data. A first group of data records associated with the training data is received, wherein each of the first group of data records includes a vector having at least one feature and a first label. For each of the first group of data records, a second label is determined based on the at least one feature in accordance with a first model. Thereafter, a loss is obtained based on the first label associated with the data record and the second label, and the data record is classified as having an incorrect first label when the loss meets a pre-determined criterion. Upon classifying the data records, a sub-group of the first group of data records is generated, wherein each of the data records included in the sub-group has the incorrect first label.
-