-
Publication Number: US20210334459A1
Publication Date: 2021-10-28
Application Number: US17239284
Application Date: 2021-04-23
Applicant: DeepMind Technologies Limited
Inventor: Krishnamurthy Dvijotham, Anton Zhernov, Sven Adrian Gowal, Conrad Grobler, Robert Stanforth
IPC: G06F40/279, G06F40/247, G06F40/166, G06N20/00, G06N5/04
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a text classification machine learning model. One of the methods includes training a model having a plurality of parameters and configured to generate a classification of a text sample comprising a plurality of words by processing a model input that includes a combined feature representation of the plurality of words in the text sample, wherein the training comprises receiving a text sample and a target classification for the text sample; generating a plurality of perturbed combined feature representations; determining, based on the plurality of perturbed combined feature representations, a region in the embedding space; and determining an update to the parameters based on an adversarial objective that encourages the model to assign the target classification for the text sample for all of the combined feature representations in the region in the embedding space.
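The adversarial objective described in this abstract can be made concrete with a short sketch. The JAX code below is an illustrative approximation only: it builds an axis-aligned region in embedding space from the perturbed combined feature representations, searches for the worst-case point in that region with projected gradient ascent, and takes a parameter step on the loss at that point. The mean-of-embeddings combination, the sign-gradient inner search, the `model_fn` interface, and all hyperparameters are assumptions for illustration; the patent's own method (for example, how the region is bounded or how the worst case is found) may differ.

```python
import jax
import jax.numpy as jnp


def combined_representation(embeddings, word_ids):
    # Combined feature representation of the text sample: here simply the
    # mean of its word embeddings (an assumption for illustration).
    return jnp.mean(embeddings[word_ids], axis=0)


def find_worst_case_point(params, model_fn, rep, lower, upper, target,
                          steps=10, lr=0.1):
    # Projected gradient ascent inside the box [lower, upper]: move the
    # representation towards higher loss on the target class, then project
    # back into the region after every step.
    def neg_log_prob(point):
        return -jax.nn.log_softmax(model_fn(params, point))[target]

    grad_fn = jax.grad(neg_log_prob)
    point = rep
    for _ in range(steps):
        point = jnp.clip(point + lr * jnp.sign(grad_fn(point)), lower, upper)
    return point


def adversarial_update(params, model_fn, rep, perturbed_reps, target,
                       step_size=1e-3):
    # Region in embedding space: the axis-aligned box covering the original
    # and all perturbed combined feature representations.
    lower = jnp.minimum(jnp.min(perturbed_reps, axis=0), rep)
    upper = jnp.maximum(jnp.max(perturbed_reps, axis=0), rep)

    worst = find_worst_case_point(params, model_fn, rep, lower, upper, target)

    def outer_loss(p):
        # Cross-entropy at the worst-case point in the region; minimising it
        # encourages the model to keep the target classification across the
        # whole region (approximately, via the single worst point found).
        return -jax.nn.log_softmax(model_fn(p, worst))[target]

    loss, grads = jax.value_and_grad(outer_loss)(params)
    params = jax.tree_util.tree_map(lambda p, g: p - step_size * g,
                                    params, grads)
    return params, loss
```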
-
Publication Number: US11775830B2
Publication Date: 2023-10-03
Application Number: US18079791
Application Date: 2022-12-12
Applicant: DeepMind Technologies Limited
Inventor: Chongli Qin, Sven Adrian Gowal, Soham De, Robert Stanforth, James Martens, Krishnamurthy Dvijotham, Dilip Krishnan, Alhussein Fawzi
IPC: G06N3/08, G06V10/82, G06F18/214, G06F18/2135, G06V10/764, G06V10/774
CPC classification number: G06N3/08, G06F18/214, G06F18/21355, G06V10/764, G06V10/774, G06V10/82
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes processing each training input using the neural network and in accordance with the current values of the network parameters to generate a network output for the training input; computing a respective loss for each of the training inputs by evaluating a loss function; identifying, from a plurality of possible perturbations, a maximally non-linear perturbation; and determining an update to the current values of the parameters of the neural network by performing an iteration of a neural network training procedure to decrease the respective losses for the training inputs and to decrease the non-linearity of the loss function for the identified maximally non-linear perturbation.
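The training step in this abstract can be illustrated with a short JAX sketch: measure, for each candidate perturbation, how far the loss deviates from its first-order Taylor expansion, keep the perturbation where this deviation is largest (the maximally non-linear perturbation), and penalise that deviation alongside the ordinary loss. The random sampling of candidates, the L-infinity radius `eps`, the weight `lam`, and the plain gradient-descent update are illustrative assumptions, not the patent's exact procedure.

```python
import jax
import jax.numpy as jnp


def non_linearity(loss_fn, x, delta):
    # |loss(x + delta) - loss(x) - delta . grad_x loss(x)|: how far the loss
    # deviates from its linear approximation under perturbation delta.
    loss_x, grad_x = jax.value_and_grad(loss_fn)(x)
    return jnp.abs(loss_fn(x + delta) - loss_x - jnp.vdot(delta, grad_x))


def training_step(params, model_fn, x, y, key, eps=0.1, n_candidates=8,
                  lam=1.0, step_size=1e-3):
    def loss_fn(p, inputs):
        logits = model_fn(p, inputs)
        return -jax.nn.log_softmax(logits)[y]

    # Candidate perturbations: random points in an L-infinity ball of
    # radius eps around the training input (an illustrative choice).
    deltas = eps * jax.random.uniform(
        key, (n_candidates,) + x.shape, minval=-1.0, maxval=1.0)

    # Identify the maximally non-linear perturbation among the candidates.
    gammas = jax.vmap(
        lambda d: non_linearity(lambda z: loss_fn(params, z), x, d))(deltas)
    worst_delta = deltas[jnp.argmax(gammas)]

    def objective(p):
        # Decrease the loss on the training input and decrease the
        # non-linearity of the loss at the identified perturbation.
        return (loss_fn(p, x)
                + lam * non_linearity(lambda z: loss_fn(p, z), x, worst_delta))

    loss, grads = jax.value_and_grad(objective)(params)
    params = jax.tree_util.tree_map(lambda p, g: p - step_size * g,
                                    params, grads)
    return params, loss
```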
-
Publication Number: US11847414B2
Publication Date: 2023-12-19
Application Number: US17239284
Application Date: 2021-04-23
Applicant: DeepMind Technologies Limited
Inventor: Krishnamurthy Dvijotham, Anton Zhernov, Sven Adrian Gowal, Conrad Grobler, Robert Stanforth
IPC: G06F40/279, G06F40/247, G06N5/04, G06N20/00, G06F40/166
CPC classification number: G06F40/279, G06F40/166, G06F40/247, G06N5/04, G06N20/00
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a text classification machine learning model. One of the methods includes training a model having a plurality of parameters and configured to generate a classification of a text sample comprising a plurality of words by processing a model input that includes a combined feature representation of the plurality of words in the text sample, wherein the training comprises receiving a text sample and a target classification for the text sample; generating a plurality of perturbed combined feature representations; determining, based on the plurality of perturbed combined feature representations, a region in the embedding space; and determining an update to the parameters based on an adversarial objective that encourages the model to assign the target classification for the text sample for all of the combined feature representations in the region in the embedding space.
-
Publication Number: US11526755B2
Publication Date: 2022-12-13
Application Number: US16882332
Application Date: 2020-05-22
Applicant: DeepMind Technologies Limited
Inventor: Chongli Qin, Sven Adrian Gowal, Soham De, Robert Stanforth, James Martens, Krishnamurthy Dvijotham, Dilip Krishnan, Alhussein Fawzi
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes processing each training input using the neural network and in accordance with the current values of the network parameters to generate a network output for the training input; computing a respective loss for each of the training inputs by evaluating a loss function; identifying, from a plurality of possible perturbations, a maximally non-linear perturbation; and determining an update to the current values of the parameters of the neural network by performing an iteration of a neural network training procedure to decrease the respective losses for the training inputs and to decrease the non-linearity of the loss function for the identified maximally non-linear perturbation.
-
Publication Number: US20230252286A1
Publication Date: 2023-08-10
Application Number: US18079791
Application Date: 2022-12-12
Applicant: DeepMind Technologies Limited
Inventor: Chongli Qin, Sven Adrian Gowal, Soham De, Robert Stanforth, James Martens, Krishnamurthy Dvijotham, Dilip Krishnan, Alhussein Fawzi
IPC: G06N3/08, G06V10/82, G06F18/214, G06F18/2135, G06V10/764, G06V10/774
CPC classification number: G06N3/08, G06V10/82, G06F18/214, G06F18/21355, G06V10/764, G06V10/774
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes processing each training input using the neural network and in accordance with the current values of the network parameters to generate a network output for the training input; computing a respective loss for each of the training inputs by evaluating a loss function; identifying, from a plurality of possible perturbations, a maximally non-linear perturbation; and determining an update to the current values of the parameters of the neural network by performing an iteration of a neural network training procedure to decrease the respective losses for the training inputs and to decrease the non-linearity of the loss function for the identified maximally non-linear perturbation.