TRAINING NEURAL NETWORKS USING LEARNED OPTIMIZERS

    Publication number: US20240256865A1

    Publication date: 2024-08-01

    Application number: US18430586

    Filing date: 2024-02-01

    Applicant: Google LLC

    CPC classification: G06N3/08, G06N3/0455

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training neural networks. One of the methods for training a neural network configured to perform a machine learning task includes performing, at each of a plurality of iterations: performing a training step to obtain respective new gradients of a loss function; for each network parameter: generating an optimizer network input; processing the optimizer network input using an optimizer neural network, wherein the processing comprises, for each cell: generating a cell input for the cell; and processing the cell input for the cell to generate a cell output, wherein the processing comprises: obtaining latent embeddings from the cell input; generating the cell output from the hidden state; and determining an update to the hidden state; and generating an optimizer network output defining an update for the network parameter; and applying the update to the network parameter.
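The loop described in the abstract can be illustrated with a minimal sketch. This is not the claimed implementation: the `Cell` class, the two-cell layout, the choice of gradient and parameter value as the optimizer network input, the quadratic toy loss, and the hand-set weights are all assumptions made for illustration. A learned optimizer would obtain its weights by meta-training; here they are fixed so that the stack behaves like gradient descent with momentum, which keeps the example deterministic. Whether the cell output is read before or after the hidden-state update is also an assumption.

```python
# Toy sketch of a per-parameter optimizer-network loop (all names hypothetical).

def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class Cell:
    """One optimizer cell that keeps a hidden state per network parameter."""

    def __init__(self, w_in, w_out, decay):
        self.w_in = w_in      # maps the cell input to latent embeddings
        self.w_out = w_out    # maps the hidden state to the cell output
        self.decay = decay    # how much of the old hidden state is kept

    def init_state(self):
        return [0.0] * len(self.w_in)

    def step(self, cell_input, hidden):
        # Obtain latent embeddings from the cell input.
        latent = matvec(self.w_in, cell_input)
        # Determine the update to the hidden state (assumed: a moving average).
        new_hidden = [self.decay * h + (1.0 - self.decay) * z
                      for h, z in zip(hidden, latent)]
        # Generate the cell output from the hidden state.
        return matvec(self.w_out, new_hidden), new_hidden


def make_optimizer_network():
    # Hand-set weights standing in for meta-learned ones (assumption).
    return [
        Cell(w_in=[[1.0, 0.0], [0.0, 0.0]], w_out=[[1.0, 0.0]], decay=0.9),
        Cell(w_in=[[1.0]], w_out=[[1.0]], decay=0.0),
    ]


def loss_and_grads(params, targets):
    """Toy quadratic loss and its gradients, standing in for a training step."""
    loss = sum((p - t) ** 2 for p, t in zip(params, targets))
    grads = [2.0 * (p - t) for p, t in zip(params, targets)]
    return loss, grads


def train(params, targets, num_iterations=200, step_size=0.5):
    cells = make_optimizer_network()
    # One hidden state per (network parameter, cell) pair.
    states = [[cell.init_state() for cell in cells] for _ in params]
    params = list(params)
    for _ in range(num_iterations):
        # Training step: obtain new gradients of the loss.
        _, grads = loss_and_grads(params, targets)
        for i, grad in enumerate(grads):
            # Optimizer network input for this parameter (assumed features).
            signal = [grad, params[i]]
            for k, cell in enumerate(cells):
                # Each cell's input is the previous cell's output.
                signal, states[i][k] = cell.step(signal, states[i][k])
            # The optimizer network output defines the parameter update.
            params[i] += -step_size * signal[0]
    return params


initial = [5.0, -3.0, 0.5]
targets = [1.0, 2.0, -1.0]
trained = train(initial, targets)
```

Calling `loss_and_grads(trained, targets)` after the loop shows how far the toy loss has fallen from its starting value. In a real setting the training step would be a forward and backward pass through the network being trained.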
