TRAINING GIANT NEURAL NETWORKS USING PIPELINE PARALLELISM

    Publication No.: US20220121945A1

    Publication date: 2022-04-21

    Application No.: US17567740

    Filing date: 2022-01-03

    Applicant: Google LLC

    IPC classification: G06N3/08 G06N3/04

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training giant neural networks. One of the methods includes obtaining data specifying a partitioning of the neural network into N composite layers that form a sequence of composite layers, wherein each composite layer comprises a distinct plurality of layers from the multiple network layers of the neural network; obtaining data assigning each of the N composite layers to one or more computing devices from a set of N computing devices; partitioning a mini-batch of training examples into a plurality of micro-batches; and training the neural network, comprising: performing a forward pass through the neural network until output activations have been computed for each micro-batch for a final composite layer in the sequence, and performing a backward pass through the neural network until output gradients have been computed for each micro-batch for the first composite layer in the sequence.
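The steps in the abstract above (partition the layers into N composite layers, split the mini-batch into micro-batches, run a full forward pass, then a full backward pass) can be illustrated with a small single-process sketch. Everything here is an assumption made for illustration and is not taken from the patent: the function names `partition_layers`, `forward_schedule`, and `pipeline_gradients` are hypothetical, each "layer" is a toy scalar multiplication, the loss is a squared error, and the N computing devices are only simulated by iterating over a tick-by-tick schedule.

```python
from typing import List, Sequence, Tuple

Schedule = List[List[Tuple[int, int]]]  # per tick: (stage, micro-batch) pairs


def partition_layers(num_layers: int, n: int) -> List[List[int]]:
    """Group layer indices 0..num_layers-1 into n contiguous composite layers."""
    size, extra = divmod(num_layers, n)
    stages, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        stages.append(list(range(start, end)))
        start = end
    return stages


def forward_schedule(n_stages: int, n_micro: int) -> Schedule:
    """At tick t, stage s works on micro-batch t - s (if that index exists)."""
    return [
        [(s, t - s) for s in range(n_stages) if 0 <= t - s < n_micro]
        for t in range(n_stages + n_micro - 1)
    ]


def pipeline_gradients(
    weights: Sequence[float],
    stages: List[List[int]],
    xs: Sequence[float],
    targets: Sequence[float],
    n_micro: int,
) -> List[float]:
    """Accumulate gradients of 0.5 * sum((y - target)^2) over all micro-batches."""
    size = len(xs) // n_micro  # assumes the mini-batch divides evenly
    acts = [list(xs[m * size:(m + 1) * size]) for m in range(n_micro)]
    tgts = [list(targets[m * size:(m + 1) * size]) for m in range(n_micro)]
    saved = [dict() for _ in range(n_micro)]  # layer inputs kept for backward
    schedule = forward_schedule(len(stages), n_micro)

    # Forward pass: runs until the final stage has produced every micro-batch.
    for tick in schedule:
        for s, m in tick:
            for k in stages[s]:
                saved[m][k] = acts[m]
                acts[m] = [weights[k] * v for v in acts[m]]

    # Backward pass: the same schedule mirrored, so the last stage goes first.
    grads = [0.0] * len(weights)
    errs = [[y - t for y, t in zip(acts[m], tgts[m])] for m in range(n_micro)]
    last = len(stages) - 1
    for tick in schedule:
        for s, m in tick:
            for k in reversed(stages[last - s]):
                grads[k] += sum(g * a for g, a in zip(errs[m], saved[m][k]))
                errs[m] = [g * weights[k] for g in errs[m]]
    return grads


# Toy usage: 6 layers in 3 composite layers, a mini-batch of 4 in 2 micro-batches.
toy_weights = [1.0, 2.0, 0.5, 1.0, 2.0, 0.5]
toy_stages = partition_layers(len(toy_weights), 3)
print(forward_schedule(len(toy_stages), 2))
print(pipeline_gradients(toy_weights, toy_stages, [1.0, 2.0, 3.0, 4.0],
                         [0.0, 1.0, 0.0, 1.0], n_micro=2))
```

Because gradients are summed across micro-batches before any weight update, the pipelined result in this sketch matches what a single pass over the whole mini-batch would give; the schedule only changes which stage is busy at which tick.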

    Regularized neural network architecture search

    Publication No.: US11144831B2

    Publication date: 2021-10-12

    Application No.: US16906034

    Filing date: 2020-06-19

    Applicant: Google LLC

    Abstract: A method for receiving training data for training a neural network (NN) to perform a machine learning (ML) task and for determining, using the training data, an optimized NN architecture for performing the ML task is described. Determining the optimized NN architecture includes: maintaining population data comprising, for each candidate architecture in a population of candidate architectures, (i) data defining the candidate architecture, and (ii) data specifying how recently a neural network having the candidate architecture has been trained while determining the optimized neural network architecture; and repeatedly performing multiple operations using each of a plurality of worker computing units to generate a new candidate architecture based on a selected candidate architecture having the best measure of fitness, adding the new candidate architecture to the population, and removing from the population the candidate architecture that was trained least recently.
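The loop described in the abstract above (select a fit candidate, derive a new candidate from it, add it to the population, remove the candidate trained least recently) can be sketched as follows. This is a minimal single-threaded illustration under stated assumptions, not the patented method: the bit-tuple architecture encoding, `toy_fitness`, `mutate`, and `aging_evolution` are hypothetical names, the fitness function stands in for actually training a network, and the random-sample step used to pick the parent is an assumption (setting `sample_size` equal to `pop_size` selects the fittest member of the whole population). The abstract's multiple worker computing units are not modeled.

```python
import collections
import random
from typing import Deque, List, Tuple

Arch = Tuple[int, ...]
Member = Tuple[Arch, float]  # (architecture encoding, fitness)


def toy_fitness(arch: Arch) -> float:
    """Hypothetical stand-in for 'train the network and measure its fitness'."""
    return float(sum(arch))


def mutate(arch: Arch, rng: random.Random) -> Arch:
    """Flip one randomly chosen bit of the architecture encoding."""
    i = rng.randrange(len(arch))
    return arch[:i] + (1 - arch[i],) + arch[i + 1:]


def aging_evolution(
    pop_size: int, sample_size: int, cycles: int, arch_len: int, seed: int = 0
) -> Tuple[Deque[Member], List[Member]]:
    rng = random.Random(seed)
    # The deque is ordered by training time: leftmost = trained least recently.
    population: Deque[Member] = collections.deque()
    history: List[Member] = []

    # Seed the population with random architectures.
    while len(population) < pop_size:
        arch = tuple(rng.randint(0, 1) for _ in range(arch_len))
        member = (arch, toy_fitness(arch))
        population.append(member)
        history.append(member)

    for _ in range(cycles):
        # Pick the fittest member of a random sample as the parent (assumed step).
        sample = rng.sample(list(population), sample_size)
        parent = max(sample, key=lambda member: member[1])
        child_arch = mutate(parent[0], rng)
        child = (child_arch, toy_fitness(child_arch))
        population.append(child)   # the newest candidate goes on the right
        population.popleft()       # remove the least recently trained candidate
        history.append(child)
    return population, history


final_population, all_trained = aging_evolution(
    pop_size=5, sample_size=3, cycles=20, arch_len=8, seed=1
)
best_arch, best_fitness = max(all_trained, key=lambda member: member[1])
print(best_arch, best_fitness)
```

Removing by age rather than by fitness means even a highly fit candidate eventually leaves the population, so a lineage survives only if its descendants keep scoring well when retrained.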

    REGULARIZED NEURAL NETWORK ARCHITECTURE SEARCH

    Publication No.: US20200320399A1

    Publication date: 2020-10-08

    Application No.: US16906034

    Filing date: 2020-06-19

    Applicant: Google LLC

    IPC classification: G06N3/08 G06N3/04

    Abstract: A method for receiving training data for training a neural network (NN) to perform a machine learning (ML) task and for determining, using the training data, an optimized NN architecture for performing the ML task is described. Determining the optimized NN architecture includes: maintaining population data comprising, for each candidate architecture in a population of candidate architectures, (i) data defining the candidate architecture, and (ii) data specifying how recently a neural network having the candidate architecture has been trained while determining the optimized neural network architecture; and repeatedly performing multiple operations using each of a plurality of worker computing units to generate a new candidate architecture based on a selected candidate architecture having the best measure of fitness, adding the new candidate architecture to the population, and removing from the population the candidate architecture that was trained least recently.

    REGULARIZED NEURAL NETWORK ARCHITECTURE SEARCH

    Publication No.: US20230259784A1

    Publication date: 2023-08-17

    Application No.: US18140442

    Filing date: 2023-04-27

    Applicant: Google LLC

    IPC classification: G06N3/086 G06N3/04

    CPC classification: G06N3/086 G06N3/04

    Abstract: A method for receiving training data for training a neural network (NN) to perform a machine learning (ML) task and for determining, using the training data, an optimized NN architecture for performing the ML task is described. Determining the optimized NN architecture includes: maintaining population data comprising, for each candidate architecture in a population of candidate architectures, (i) data defining the candidate architecture, and (ii) data specifying how recently a neural network having the candidate architecture has been trained while determining the optimized neural network architecture; and repeatedly performing multiple operations using each of a plurality of worker computing units to generate a new candidate architecture based on a selected candidate architecture having the best measure of fitness, adding the new candidate architecture to the population, and removing from the population the candidate architecture that was trained least recently.

    Regularized neural network architecture search

    Publication No.: US11669744B2

    Publication date: 2023-06-06

    Application No.: US17475137

    Filing date: 2021-09-14

    Applicant: Google LLC

    IPC classification: G06N3/086 G06N3/04

    CPC classification: G06N3/086 G06N3/04

    Abstract: A method for receiving training data for training a neural network (NN) to perform a machine learning (ML) task and for determining, using the training data, an optimized NN architecture for performing the ML task is described. Determining the optimized NN architecture includes: maintaining population data comprising, for each candidate architecture in a population of candidate architectures, (i) data defining the candidate architecture, and (ii) data specifying how recently a neural network having the candidate architecture has been trained while determining the optimized neural network architecture; and repeatedly performing multiple operations using each of a plurality of worker computing units to generate a new candidate architecture based on a selected candidate architecture having the best measure of fitness, adding the new candidate architecture to the population, and removing from the population the candidate architecture that was trained least recently.

    REGULARIZED NEURAL NETWORK ARCHITECTURE SEARCH

    Publication No.: US20220004879A1

    Publication date: 2022-01-06

    Application No.: US17475137

    Filing date: 2021-09-14

    Applicant: Google LLC

    IPC classification: G06N3/08 G06N3/04

    Abstract: A method for receiving training data for training a neural network (NN) to perform a machine learning (ML) task and for determining, using the training data, an optimized NN architecture for performing the ML task is described. Determining the optimized NN architecture includes: maintaining population data comprising, for each candidate architecture in a population of candidate architectures, (i) data defining the candidate architecture, and (ii) data specifying how recently a neural network having the candidate architecture has been trained while determining the optimized neural network architecture; and repeatedly performing multiple operations using each of a plurality of worker computing units to generate a new candidate architecture based on a selected candidate architecture having the best measure of fitness, adding the new candidate architecture to the population, and removing from the population the candidate architecture that was trained least recently.