-
Publication number: US20240127586A1
Publication date: 2024-04-18
Application number: US18275087
Filing date: 2022-02-02
Applicant: DeepMind Technologies Limited
Inventor: Andrew Brock, Soham De, Samuel Laurence Smith, Karen Simonyan
IPC: G06V10/82, G06V10/776
CPC classification number: G06V10/82, G06V10/776
Abstract: There is disclosed a computer-implemented method for training a neural network. The method comprises determining a gradient associated with a parameter of the neural network. The method further comprises determining a ratio of a gradient norm to parameter norm and comparing the ratio to a threshold. In response to determining that the ratio exceeds the threshold, the value of the gradient is reduced such that the ratio is equal to or below the threshold. The value of the parameter is updated based upon the reduced gradient value.
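The clipping rule in this abstract can be illustrated with a minimal sketch. It is not taken from the patent text: the function names, the use of a single norm over the whole parameter tensor, the `eps` floor on the parameter norm, and the plain gradient-descent update are all assumptions made for illustration.

```python
import numpy as np

def clip_gradient_by_ratio(grad, param, threshold=0.01, eps=1e-3):
    """Rescale grad so that ||grad|| / ||param|| does not exceed threshold."""
    grad_norm = np.linalg.norm(grad)
    # Assumption: floor the parameter norm so zero-valued parameters
    # do not force every gradient to zero.
    param_norm = max(np.linalg.norm(param), eps)
    ratio = grad_norm / param_norm
    if ratio > threshold:
        # Shrink the gradient so the ratio lands exactly on the threshold.
        grad = grad * (threshold / ratio)
    return grad

def sgd_step(param, grad, lr=0.1, threshold=0.01):
    """Update the parameter using the (possibly reduced) gradient."""
    return param - lr * clip_gradient_by_ratio(grad, param, threshold)
```

For example, with `param = [3, 4]` (norm 5), `grad = [6, 8]` (norm 10) and `threshold=0.5`, the ratio is 2, so the gradient is scaled by 0.25 to `[1.5, 2.0]`.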
-
Publication number: US11775830B2
Publication date: 2023-10-03
Application number: US18079791
Filing date: 2022-12-12
Applicant: DeepMind Technologies Limited
Inventor: Chongli Qin, Sven Adrian Gowal, Soham De, Robert Stanforth, James Martens, Krishnamurthy Dvijotham, Dilip Krishnan, Alhussein Fawzi
IPC: G06N3/08, G06V10/82, G06F18/214, G06F18/2135, G06V10/764, G06V10/774
CPC classification number: G06N3/08, G06F18/214, G06F18/21355, G06V10/764, G06V10/774, G06V10/82
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes processing each training input using the neural network and in accordance with the current values of the network parameters to generate a network output for the training input; computing a respective loss for each of the training inputs by evaluating a loss function; identifying, from a plurality of possible perturbations, a maximally non-linear perturbation; and determining an update to the current values of the parameters of the neural network by performing an iteration of a neural network training procedure to decrease the respective losses for the training inputs and to decrease the non-linearity of the loss function for the identified maximally non-linear perturbation.
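One possible reading of the "maximally non-linear perturbation" step is sketched below. The abstract does not define the non-linearity measure, so this sketch assumes it is the gap between the loss at a perturbed input and its first-order Taylor estimate. The logistic model, the finite candidate set, and all function names are hypothetical choices made to keep the example self-contained.

```python
import numpy as np

def loss(w, x, y):
    """Logistic loss for one example with label y in {-1, +1}."""
    return np.log1p(np.exp(-y * np.dot(w, x)))

def loss_grad_x(w, x, y):
    """Analytic gradient of the logistic loss with respect to the input x."""
    margin = y * np.dot(w, x)
    return -y * w / (1.0 + np.exp(margin))

def nonlinearity(w, x, y, delta):
    """Gap between the loss at x + delta and its first-order Taylor estimate."""
    linear_estimate = loss(w, x, y) + np.dot(delta, loss_grad_x(w, x, y))
    return abs(loss(w, x + delta, y) - linear_estimate)

def most_nonlinear_perturbation(w, x, y, candidates):
    """Pick the candidate perturbation with the largest non-linearity."""
    scores = [nonlinearity(w, x, y, d) for d in candidates]
    best = int(np.argmax(scores))
    return candidates[best], scores[best]

def regularized_objective(w, x, y, candidates, weight=1.0):
    """Loss plus a penalty on the worst-case non-linearity (assumed form)."""
    _, worst = most_nonlinear_perturbation(w, x, y, candidates)
    return loss(w, x, y) + weight * worst
```

A training iteration would then take a gradient step on `regularized_objective` with respect to `w`, typically with an automatic-differentiation library; that step is omitted here.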
-
Publication number: US20230252286A1
Publication date: 2023-08-10
Application number: US18079791
Filing date: 2022-12-12
Applicant: DeepMind Technologies Limited
Inventor: Chongli Qin, Sven Adrian Gowal, Soham De, Robert Stanforth, James Martens, Krishnamurthy Dvijotham, Dilip Krishnan, Alhussein Fawzi
IPC: G06N3/08, G06V10/82, G06F18/214, G06F18/2135, G06V10/764, G06V10/774
CPC classification number: G06N3/08, G06V10/82, G06F18/214, G06F18/21355, G06V10/764, G06V10/774
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes processing each training input using the neural network and in accordance with the current values of the network parameters to generate a network output for the training input; computing a respective loss for each of the training inputs by evaluating a loss function; identifying, from a plurality of possible perturbations, a maximally non-linear perturbation; and determining an update to the current values of the parameters of the neural network by performing an iteration of a neural network training procedure to decrease the respective losses for the training inputs and to decrease the non-linearity of the loss function for the identified maximally non-linear perturbation.
-
Publication number: US20230351042A1
Publication date: 2023-11-02
Application number: US18141273
Filing date: 2023-04-28
Applicant: DeepMind Technologies Limited
Inventor: Soham De, Borja De Balle Pigem, Jamie Hayes, Samuel Laurence Smith, Leonard Alix Jean Eric Berrada Lancrey Javal
IPC: G06F21/62
CPC classification number: G06F21/6245
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for privacy-sensitive training of a neural network. In one aspect, a method includes training a set of neural network parameters of the neural network on a set of training data over multiple training iterations to optimize an objective function. Each training iteration includes: sampling a batch of network inputs from the set of training data; determining a clipped gradient for each network input in the batch of network inputs; and updating the neural network parameters using the clipped gradients for the network inputs in the batch of network inputs.
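The three steps of each training iteration (sample a batch, clip a gradient per input, update) can be sketched as follows. This is an illustration under stated assumptions rather than the claimed method: the linear model with squared error, the L2-norm clipping rule, and the function names are hypothetical. The optional Gaussian noise term is also an assumption; the abstract does not mention noise, and it is disabled by default.

```python
import numpy as np

def per_example_grads(w, X, y):
    """Gradients of 0.5 * (x.w - y)^2 for a linear model, one row per input."""
    residuals = X @ w - y
    return residuals[:, None] * X

def clip_rows(grads, clip_norm):
    """Scale each per-example gradient so its L2 norm is at most clip_norm."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    return grads * scale

def private_step(w, X, y, rng, batch_size=4, clip_norm=1.0, lr=0.1, noise_std=0.0):
    """One iteration: sample a batch, clip per-example gradients, update w."""
    idx = rng.choice(len(X), size=batch_size, replace=False)
    clipped = clip_rows(per_example_grads(w, X[idx], y[idx]), clip_norm)
    # Assumption: optional Gaussian noise scaled by the clipping norm.
    noise = noise_std * clip_norm * rng.standard_normal(w.shape)
    return w - lr * (clipped.sum(axis=0) + noise) / batch_size
```

Clipping each input's gradient separately bounds how much any single training example can move the parameters in one step, which is the property privacy analyses of this kind of training usually rely on.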
-
Publication number: US11526755B2
Publication date: 2022-12-13
Application number: US16882332
Filing date: 2020-05-22
Applicant: DeepMind Technologies Limited
Inventor: Chongli Qin, Sven Adrian Gowal, Soham De, Robert Stanforth, James Martens, Krishnamurthy Dvijotham, Dilip Krishnan, Alhussein Fawzi
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes processing each training input using the neural network and in accordance with the current values of the network parameters to generate a network output for the training input; computing a respective loss for each of the training inputs by evaluating a loss function; identifying, from a plurality of possible perturbations, a maximally non-linear perturbation; and determining an update to the current values of the parameters of the neural network by performing an iteration of a neural network training procedure to decrease the respective losses for the training inputs and to decrease the non-linearity of the loss function for the identified maximally non-linear perturbation.
-