CODE-LEVEL NEURAL ARCHITECTURE SEARCH USING LANGUAGE MODELS

    Publication number: US20240273371A1

    Publication date: 2024-08-15

    Application number: US18431804

    Filing date: 2024-02-02

    Applicant: Google LLC

    CPC classification number: G06N3/086

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining an architecture for a neural network configured to perform a machine learning task. In one aspect, a method comprises: receiving training data; searching for a final architecture of the neural network, wherein the searching comprises: maintaining current population data; and repeatedly performing evolutionary architecture search steps comprising: selecting one or more candidate architectures from the current population of candidate architectures defined by the source code included in the current population data; generating an input prompt; processing the input prompt using the language model neural network to generate output source code that defines a plurality of new candidate architectures; and using the plurality of new candidate architectures defined by the output source code to update the current population data.
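The search loop in this abstract can be sketched roughly as follows. This is a minimal illustration, not the claimed method: the language model is replaced by a hypothetical stub (`stub_language_model`), candidates are one-line source snippets that define a `layers` list, and the fitness function, the random parent selection, and the keep-the-fittest update rule are all assumptions, since the abstract does not specify them.

```python
import random
from typing import Callable, Dict, List


def toy_fitness(source: str) -> float:
    """Hypothetical fitness: run the candidate's source and score its layer widths."""
    namespace: Dict[str, object] = {}
    exec(source, namespace)  # candidate source is expected to define `layers`
    layers = namespace["layers"]
    return -abs(sum(layers) - 256)  # closer to a total width of 256 is better


def build_prompt(parents: List[str]) -> str:
    """Join the selected parents' source code into one input prompt."""
    blocks = [f"# parent {i}\n{src}" for i, src in enumerate(parents)]
    return "\n\n".join(blocks) + "\n\n# propose new candidates\n"


def stub_language_model(prompt: str, rng: random.Random, n: int = 3) -> List[str]:
    """Stand-in for the language model: perturbs layer widths found in the prompt."""
    parents = [line for line in prompt.splitlines() if line.startswith("layers = ")]
    outputs = []
    for _ in range(n):
        namespace: Dict[str, object] = {}
        exec(rng.choice(parents), namespace)
        widths = [max(8, w + rng.choice([-16, 0, 16])) for w in namespace["layers"]]
        outputs.append(f"layers = {widths}")  # output source code for a new candidate
    return outputs


def search(
    population: List[str],
    steps: int,
    rng: random.Random,
    language_model: Callable[[str, random.Random], List[str]] = stub_language_model,
) -> List[str]:
    """Repeat: select parents, prompt the model, fold new candidates into the population."""
    population = list(population)
    max_size = len(population)
    for _ in range(steps):
        parents = rng.sample(population, k=2)        # select candidate architectures
        prompt = build_prompt(parents)               # generate an input prompt
        children = language_model(prompt, rng)       # output source for new candidates
        population.extend(children)
        # Assumed update rule: keep only the fittest `max_size` candidates.
        population = sorted(population, key=toy_fitness, reverse=True)[:max_size]
    return population
```

Calling `search(["layers = [32, 32]", "layers = [64, 64]", "layers = [16, 16, 16]"], 20, random.Random(0))` returns a population of the same size whose best member scores at least as well as the best starting candidate, because the assumed update rule never discards the current best.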

    Computationally efficient neural network architecture search

    Publication number: US10997503B2

    Publication date: 2021-05-04

    Application number: US16447866

    Filing date: 2019-06-20

    Applicant: Google LLC

    Abstract: A method for receiving training data for training a neural network to perform a machine learning task and for searching for, using the training data, an optimized neural network architecture for performing the machine learning task is described. Searching for the optimized neural network architecture includes: maintaining population data; maintaining threshold data; and repeatedly performing the following operations: selecting one or more candidate architectures from the population data; generating a new architecture from the one or more selected candidate architectures; for the new architecture: training a neural network having the new architecture until termination criteria for the training are satisfied; and determining a final measure of fitness of the neural network having the new architecture after the training; and adding data defining the new architecture and the final measure of fitness for the neural network having the new architecture to the population data.
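A rough sketch of the loop described in this abstract is shown below. The abstract does not say how the threshold data is used or what the termination criteria are, so the sketch assumes the threshold is a single fitness value that a candidate must reach after a short training run in order to receive the full training budget. The architecture encoding, `toy_train`, the tournament selection, and the threshold update rule are all hypothetical placeholders.

```python
import random
from typing import List, Tuple

Architecture = Tuple[int, ...]        # toy encoding: one width per layer
Member = Tuple[Architecture, float]   # (architecture, final measure of fitness)


def toy_train(arch: Architecture, steps: int) -> float:
    """Hypothetical stand-in for training: fitness grows with steps, capped by quality."""
    quality = 1.0 / (1.0 + abs(sum(arch) - 256) / 256.0)
    return quality * steps / (steps + 1)


def mutate(arch: Architecture, rng: random.Random) -> Architecture:
    """Generate a new architecture by changing one layer width."""
    widths = list(arch)
    i = rng.randrange(len(widths))
    widths[i] = max(8, widths[i] + rng.choice([-32, 32]))
    return tuple(widths)


def train_until_done(
    arch: Architecture, threshold: float, short_steps: int = 1, full_steps: int = 9
) -> float:
    """Assumed termination criteria: stop early if the short run misses the threshold."""
    fitness = toy_train(arch, short_steps)
    if fitness < threshold:
        return fitness                   # terminated early; this is the final fitness
    return toy_train(arch, full_steps)   # cleared the threshold; train to completion


def search(seed_archs: List[Architecture], iterations: int, rng: random.Random) -> List[Member]:
    """Maintain population and threshold data while repeatedly adding new architectures."""
    population: List[Member] = [(a, toy_train(a, 9)) for a in seed_archs]
    threshold = 0.0  # threshold data; assumed here to be a single fitness value
    for _ in range(iterations):
        # Select: best of a random sample of two (tournament selection, an assumption).
        parent = max(rng.sample(population, k=2), key=lambda m: m[1])[0]
        child = mutate(parent, rng)
        fitness = train_until_done(child, threshold)
        population.append((child, fitness))  # add architecture and final fitness
        # Assumed threshold update: half the mean fitness of the current population.
        threshold = 0.5 * sum(f for _, f in population) / len(population)
    return population
```

Under these assumptions, weak candidates cost only `short_steps` of training, which is one plausible reading of how a threshold could make the search computationally cheaper.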

    DETERMINING HYPERPARAMETERS USING SEQUENCE GENERATION NEURAL NETWORKS

    Publication number: US20230401451A1

    Publication date: 2023-12-14

    Application number: US18199886

    Filing date: 2023-05-19

    Applicant: Google LLC

    CPC classification number: G06N3/0985 G06N3/0455

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model. One of the methods includes receiving metadata for the training, generating a metadata sequence that represents the metadata, at each of a plurality of iterations: generating one or more trials that each specify a respective value for each of a set of hyperparameters, comprising, for each trial: generating an input sequence for the iteration that comprises (i) the metadata sequence and (ii) for any earlier trials, a respective sequence that represents the respective values for the hyperparameters specified by the earlier trial and a measure of performance for the trial, and processing an input sequence for the trial that comprises the input sequence for the iteration using a sequence generation neural network to generate an output sequence that represents respective values for the hyperparameters.
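The sequence construction in this abstract can be illustrated with a small sketch. It assumes a simple token format (`name=value` for metadata, `name:value` for hyperparameters, `perf:value` for the performance measure, and a `<sep>` delimiter), none of which comes from the abstract, and it replaces the sequence generation neural network with a hypothetical deterministic stub. The `toy_objective` function is likewise a placeholder for actually training the model.

```python
from typing import Callable, Dict, List, Tuple

Trial = Tuple[Dict[str, float], float]  # (hyperparameter values, measure of performance)


def encode_metadata(metadata: Dict[str, str]) -> List[str]:
    """Turn training metadata into a metadata sequence (assumed token format)."""
    return [f"{key}={value}" for key, value in sorted(metadata.items())]


def encode_trial(params: Dict[str, float], performance: float) -> List[str]:
    """Represent one earlier trial: its hyperparameter values, then its performance."""
    tokens = [f"{name}:{value}" for name, value in sorted(params.items())]
    return tokens + [f"perf:{performance}", "<sep>"]


def build_input_sequence(metadata_seq: List[str], earlier_trials: List[Trial]) -> List[str]:
    """Input sequence = metadata sequence followed by every earlier trial."""
    sequence = list(metadata_seq) + ["<sep>"]
    for params, performance in earlier_trials:
        sequence += encode_trial(params, performance)
    return sequence


def stub_sequence_model(input_seq: List[str]) -> List[str]:
    """Stand-in for the sequence generation network: proposals depend on trial count."""
    num_trials = sum(1 for token in input_seq if token.startswith("perf:"))
    return [
        f"learning_rate:{0.1 / (num_trials + 1)}",
        f"batch_size:{32 * (num_trials + 1)}",
    ]


def decode_output(output_seq: List[str]) -> Dict[str, float]:
    """Parse the output sequence back into hyperparameter values."""
    return {name: float(value) for name, value in (t.split(":", 1) for t in output_seq)}


def toy_objective(params: Dict[str, float]) -> float:
    """Hypothetical performance measure; a real system would train and evaluate."""
    return -((params["learning_rate"] - 0.05) ** 2)


def tune(
    metadata: Dict[str, str],
    iterations: int,
    model: Callable[[List[str]], List[str]] = stub_sequence_model,
) -> List[Trial]:
    """At each iteration, condition the model on the metadata and all earlier trials."""
    metadata_seq = encode_metadata(metadata)
    trials: List[Trial] = []
    for _ in range(iterations):
        input_seq = build_input_sequence(metadata_seq, trials)
        params = decode_output(model(input_seq))
        trials.append((params, toy_objective(params)))
    return trials
```

For simplicity the sketch generates one trial per iteration; the abstract allows one or more trials per iteration, each conditioned on the same per-iteration input sequence.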

    COMPUTATIONALLY EFFICIENT NEURAL NETWORK ARCHITECTURE SEARCH

    Publication number: US20210256390A1

    Publication date: 2021-08-19

    Application number: US17306813

    Filing date: 2021-05-03

    Applicant: Google LLC

    Abstract: A method for receiving training data for training a neural network to perform a machine learning task and for searching for, using the training data, an optimized neural network architecture for performing the machine learning task is described. Searching for the optimized neural network architecture includes: maintaining population data; maintaining threshold data; and repeatedly performing the following operations: selecting one or more candidate architectures from the population data; generating a new architecture from the one or more selected candidate architectures; for the new architecture: training a neural network having the new architecture until termination criteria for the training are satisfied; and determining a final measure of fitness of the neural network having the new architecture after the training; and adding data defining the new architecture and the final measure of fitness for the neural network having the new architecture to the population data.
