-
Publication number: US20230351192A1
Publication date: 2023-11-02
Application number: US18348587
Application date: 2023-07-07
Applicant: Google LLC
Inventor: Zizhao Zhang, Sercan Omer Arik, Tomas Jon Pfister, Han Zhang
IPC: G06N3/084, G06N20/00, G06N5/04, G06V10/762, G06V10/771, G06V10/774, G06V10/776, G06V10/82
CPC classification number: G06N3/084, G06N20/00, G06N5/04, G06V10/763, G06V10/771, G06V10/774, G06V10/776, G06V10/82
Abstract: A method for training a model comprises obtaining a set of labeled training samples each associated with a given label. For each labeled training sample, the method includes generating a pseudo label and estimating a weight of the labeled training sample indicative of an accuracy of the given label. The method also includes determining whether the weight of the labeled training sample satisfies a weight threshold. When the weight of the labeled training sample satisfies the weight threshold, the method includes adding the labeled training sample to a set of cleanly labeled training samples. Otherwise, the method includes adding the labeled training sample to a set of mislabeled training samples. The method includes training the model with the set of cleanly labeled training samples using corresponding given labels and the set of mislabeled training samples using corresponding pseudo labels.
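The partitioning step in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Sample` class, the function names, and the reading of "satisfies the weight threshold" as `weight >= threshold` are assumptions, and the pseudo label and weight are taken as precomputed inputs because the abstract does not say how they are estimated.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    """Hypothetical container for one labeled training sample."""
    features: List[float]
    given_label: int
    pseudo_label: int  # assumed to be generated elsewhere
    weight: float      # assumed estimate of how accurate given_label is


def split_by_weight(samples: List[Sample], threshold: float):
    """Return (cleanly labeled, mislabeled) samples.

    Assumption: a weight at or above the threshold satisfies it.
    """
    clean, mislabeled = [], []
    for s in samples:
        (clean if s.weight >= threshold else mislabeled).append(s)
    return clean, mislabeled


def training_targets(samples: List[Sample], threshold: float) -> List[Tuple[List[float], int]]:
    """Pair clean samples with given labels and mislabeled samples with pseudo labels."""
    clean, mislabeled = split_by_weight(samples, threshold)
    targets = [(s.features, s.given_label) for s in clean]
    targets += [(s.features, s.pseudo_label) for s in mislabeled]
    return targets
```

With a threshold of 0.5, a sample weighted 0.9 would keep its given label, while a sample weighted 0.2 would be trained on its pseudo label instead.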
-
Publication number: US20230325676A1
Publication date: 2023-10-12
Application number: US18333998
Application date: 2023-06-13
Applicant: Google LLC
Inventor: Zizhao Zhang, Tomas Jon Pfister, Sercan Omer Arik, Mingfei Gao
IPC: G06N3/084, G06N20/00, G06F7/24, G06N3/08, G06F18/211, G06F18/214
CPC classification number: G06N3/084, G06N20/00, G06F18/2155, G06N3/08, G06F18/211, G06F7/24
Abstract: A method includes obtaining a set of unlabeled training samples. For each training sample in the set of unlabeled training samples, the method includes generating, using a machine learning model and the training sample, a corresponding first prediction; generating, using the machine learning model and a modified unlabeled training sample based on the training sample, a second prediction; and determining a difference between the first prediction and the second prediction. The method includes selecting, based on the differences, a subset of the set of unlabeled training samples. For each training sample in the subset of the set of unlabeled training samples, the method includes obtaining a ground truth label for the training sample, and generating a corresponding labeled training sample based on the training sample paired with the ground truth label. The method includes training the machine learning model using the corresponding labeled training samples.
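The selection loop described in this abstract might look roughly like the sketch below. Several details are assumptions made for illustration: the function names, the use of a summed absolute difference between the two predictions, and the choice to pick the `k` samples with the largest difference (the abstract only says the subset is selected "based on the differences"). The `model`, `augment`, and `oracle` callables are hypothetical stand-ins for the machine learning model, the sample modification, and the source of ground truth labels.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def prediction_gap(model: Callable[[Vector], Vector],
                   augment: Callable[[Vector], Vector],
                   x: Vector) -> float:
    """Difference between predictions on a sample and on its modified version."""
    first = model(x)
    second = model(augment(x))
    return sum(abs(a - b) for a, b in zip(first, second))


def select_for_labeling(model, augment, unlabeled: List[Vector], k: int) -> List[int]:
    """Indices of the k samples with the largest prediction gap (assumed rule)."""
    ranked = sorted(
        range(len(unlabeled)),
        key=lambda i: prediction_gap(model, augment, unlabeled[i]),
        reverse=True,
    )
    return ranked[:k]


def build_labeled_set(unlabeled: List[Vector], chosen: List[int],
                      oracle: Callable[[Vector], int]) -> List[Tuple[Vector, int]]:
    """Pair each selected sample with a ground truth label from the oracle."""
    return [(unlabeled[i], oracle(unlabeled[i])) for i in chosen]
```

The resulting labeled pairs would then be passed to whatever training routine updates the model; that step is omitted here because the abstract does not describe it.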
-
Publication number: US11487970B2
Publication date: 2022-11-01
Application number: US17031144
Application date: 2020-09-24
Applicant: Google LLC
Inventor: Sercan Omer Arik, Chen Xing, Zizhao Zhang, Tomas Jon Pfister
Abstract: A method for jointly training a classification model and a confidence model includes receiving a training data set including a plurality of training data subsets. From two or more training data subsets in the training data set, the method includes selecting a support set of training examples and a query set of training examples. The method includes determining, using the classification model, a centroid value for each respective class. For each training example in the query set of training examples, the method includes generating, using the classification model, a query encoding, determining a class distance measure, determining a ground-truth distance, and updating parameters of the classification model. For each training example in the query set of training examples identified as being misclassified, the method further includes generating a standard deviation value, sampling a new query encoding, and updating parameters of the confidence model based on the new query encoding.
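The centroid and distance steps in this abstract resemble a nearest-centroid classifier over encodings, which can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, squared Euclidean distance stands in for the unspecified class distance measure, Gaussian sampling around the query encoding stands in for the unspecified sampling step, and the encodings are taken as given rather than produced by a trained classification model. Parameter updates for both models are omitted.

```python
import random
from collections import defaultdict


def class_centroids(support):
    """Mean encoding per class; support is a list of (encoding, label) pairs."""
    grouped = defaultdict(list)
    for encoding, label in support:
        grouped[label].append(encoding)
    return {
        label: [sum(column) / len(encodings) for column in zip(*encodings)]
        for label, encodings in grouped.items()
    }


def squared_distance(a, b):
    """Squared Euclidean distance (assumed distance measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query_encoding, centroids):
    """Return the nearest class and the distance to every class centroid."""
    distances = {
        label: squared_distance(query_encoding, centroid)
        for label, centroid in centroids.items()
    }
    predicted = min(distances, key=distances.get)
    return predicted, distances


def sample_new_encoding(query_encoding, std, rng):
    """Draw a new encoding around the query encoding (assumed Gaussian)."""
    return [rng.gauss(x, std) for x in query_encoding]
```

In this sketch a query would count as misclassified when `classify` returns a label other than its ground truth, and only then would `sample_new_encoding` be called with the standard deviation produced for that query.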
-