Publication No.: US20240256865A1
Publication Date: 2024-08-01
Application No.: US18430586
Filing Date: 2024-02-01
Applicant: Google LLC
Inventor: Deepali Jain, Krzysztof Marcin Choromanski, Sumeet Singh, Vikas Sindhwani, Tingnan Zhang, Jie Tan, Kumar Avinava Dubey
IPC: G06N3/08, G06N3/0455
CPC classification number: G06N3/08, G06N3/0455
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training neural networks. One of the methods for training a neural network configured to perform a machine learning task includes performing, at each of a plurality of iterations: performing a training step to obtain respective new gradients of a loss function; for each network parameter: generating an optimizer network input; processing the optimizer network input using an optimizer neural network, wherein the processing comprises, for each cell: generating a cell input for the cell; and processing the cell input for the cell to generate a cell output, wherein the processing comprises: obtaining latent embeddings from the cell input; generating the cell output from the hidden state; and determining an update to the hidden state; and generating an optimizer network output defining an update for the network parameter; and applying the update to the network parameter.
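Illustrative sketch (not part of the patent text): the loop described in the abstract can be outlined in a few lines of NumPy. Everything below is an assumption made for illustration only: the toy quadratic loss, the per-parameter input features (gradient and a running gradient average), the cell layout, the layer sizes, the 0.9 decay factors, and the random untrained weights. The abstract does not specify any of these, and the names `init_optimizer`, `optimizer_step`, and `train` are hypothetical.

```python
import numpy as np

NUM_CELLS = 2    # hypothetical number of cells in the optimizer network
FEATURES = 2     # hypothetical per-parameter input: (gradient, gradient average)
LATENT_DIM = 4   # hypothetical width of latent embeddings and cell outputs
HIDDEN_DIM = 4   # hypothetical width of each cell's hidden state


def init_optimizer(seed=0):
    """Random, untrained weights; a real learned optimizer would be meta-trained."""
    rng = np.random.default_rng(seed)
    cells = [
        {
            "embed": rng.normal(scale=0.1, size=(FEATURES + LATENT_DIM, LATENT_DIM)),
            "readout": rng.normal(scale=0.1, size=(HIDDEN_DIM, LATENT_DIM)),
            "write": rng.normal(scale=0.1, size=(LATENT_DIM, HIDDEN_DIM)),
        }
        for _ in range(NUM_CELLS)
    ]
    head = rng.normal(scale=0.1, size=LATENT_DIM)
    return cells, head


def optimizer_step(cells, head, opt_input, hidden):
    """Map per-parameter inputs (P, FEATURES) to per-parameter updates (P,)."""
    prev_output = np.zeros((opt_input.shape[0], LATENT_DIM))
    new_hidden = np.empty_like(hidden)
    for i, cell in enumerate(cells):
        # Cell input: optimizer input joined with the previous cell's output (assumed).
        cell_input = np.concatenate([opt_input, prev_output], axis=1)
        # Obtain latent embeddings from the cell input.
        latents = np.tanh(cell_input @ cell["embed"])
        # Generate the cell output from the (current) hidden state.
        prev_output = hidden[:, i] @ cell["readout"]
        # Determine an update to the hidden state (assumed to depend on the latents).
        new_hidden[:, i] = 0.9 * hidden[:, i] + latents @ cell["write"]
    # Optimizer network output: one scalar update per network parameter.
    return prev_output @ head, new_hidden


def train(num_iterations=5):
    target = np.array([1.0, -2.0, 0.5])  # toy loss: 0.5 * ||params - target||^2
    params = np.zeros_like(target)
    grad_avg = np.zeros_like(params)
    hidden = np.zeros((params.size, NUM_CELLS, HIDDEN_DIM))
    cells, head = init_optimizer()
    for _ in range(num_iterations):
        grads = params - target                          # training step: new gradients
        grad_avg = 0.9 * grad_avg + 0.1 * grads
        opt_input = np.stack([grads, grad_avg], axis=1)  # one input row per parameter
        update, hidden = optimizer_step(cells, head, opt_input, hidden)
        params = params + update                         # apply the update
    return params, hidden
```

Calling `train()` returns the final parameters and the per-parameter hidden states. Because the optimizer weights here are random rather than trained, the sketch shows only the data flow (gradients, per-parameter input, cells with hidden state, update) and is not expected to reduce the loss.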