Compound model scaling for neural networks

    Publication number: US11893491B2

    Publication date: 2024-02-06

    Application number: US17144450

    Application date: 2021-01-08

    Applicant: Google LLC

    CPC classification number: G06N3/082 G06N3/04

    Abstract: A method for determining a final architecture for a neural network to perform a particular machine learning task is described. The method includes receiving a baseline architecture for the neural network, wherein the baseline architecture has a network width dimension, a network depth dimension, and a resolution dimension; receiving data defining a compound coefficient that controls extra computational resources used for scaling the baseline architecture; performing a search to determine a baseline width, depth and resolution coefficient that specify how to assign the extra computational resources to the network width, depth and resolution dimensions of the baseline architecture, respectively; determining a width, depth and resolution coefficient based on the baseline width, depth, and resolution coefficient and the compound coefficient; and generating the final architecture that scales the network width, network depth, and resolution dimensions of the baseline architecture based on the corresponding width, depth, and resolution coefficients.
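
The scaling step in the abstract above can be sketched as follows. This is a minimal illustration, not the claimed method: the abstract only says the final coefficients are "based on" the baseline coefficients and the compound coefficient, so the exponent rule used here (baseline coefficient raised to the power of the compound coefficient) is an assumption. The `Architecture` class, the function name `compound_scale`, and all numeric values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Architecture:
    width: int       # e.g. channels per layer
    depth: int       # e.g. number of layers
    resolution: int  # e.g. input image side length in pixels


def compound_scale(baseline, base_width, base_depth, base_res, compound):
    """Scale all three dimensions of `baseline` together.

    Assumption: each final coefficient is the searched baseline
    coefficient raised to the power of the compound coefficient.
    """
    width_coeff = base_width ** compound
    depth_coeff = base_depth ** compound
    res_coeff = base_res ** compound
    return Architecture(
        width=math.ceil(baseline.width * width_coeff),
        depth=math.ceil(baseline.depth * depth_coeff),
        resolution=math.ceil(baseline.resolution * res_coeff),
    )


# Hypothetical baseline and searched coefficients, for illustration only.
baseline = Architecture(width=32, depth=16, resolution=224)
scaled = compound_scale(baseline, base_width=1.1, base_depth=1.2,
                        base_res=1.15, compound=2)
print(scaled)
```

Under this assumed rule, a compound coefficient of 0 leaves the baseline unchanged, and larger values grow width, depth, and resolution together rather than one at a time.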

    SYSTEMS AND METHODS FOR PROGRESSIVE LEARNING FOR MACHINE-LEARNED MODELS TO OPTIMIZE TRAINING SPEED

    Publication number: US20220245928A1

    Publication date: 2022-08-04

    Application number: US17564860

    Application date: 2021-12-29

    Applicant: Google LLC

    Abstract: Systems and methods of the present disclosure can include a computer-implemented method for efficient machine-learned model training. The method can include obtaining a plurality of training samples for a machine-learned model. The method can include, for one or more first training iterations, training, based at least in part on a first regularization magnitude configured to control a relative effect of one or more regularization techniques, the machine-learned model using one or more respective first training samples of the plurality of training samples. The method can include, for one or more second training iterations, training, based at least in part on a second regularization magnitude greater than the first regularization magnitude, the machine-learned model using one or more respective second training samples of the plurality of training samples.
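
The staged training described above, where later iterations use a larger regularization magnitude than earlier ones, might be scheduled as in the sketch below. The abstract only requires the second magnitude to exceed the first; the linear interpolation across stages, the function name `regularization_schedule`, and the example values are assumptions made for illustration.

```python
def regularization_schedule(num_stages, start, end):
    """Return one regularization magnitude per training stage.

    Assumption: magnitudes grow linearly from `start` to `end`.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    if end < start:
        raise ValueError("end must not be smaller than start")
    if num_stages == 1:
        return [end]
    step = (end - start) / (num_stages - 1)
    return [start + i * step for i in range(num_stages)]


# Hypothetical usage: four stages, with the magnitude interpreted
# as (for example) a dropout rate.
for stage, magnitude in enumerate(regularization_schedule(4, 0.1, 0.4)):
    print(f"stage {stage}: regularization magnitude {magnitude:.2f}")
```

Each stage would then train on its own subset of the training samples using the magnitude assigned to it.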

    CONNECTION WEIGHT LEARNING FOR GUIDED ARCHITECTURE EVOLUTION

    Publication number: US20220189154A1

    Publication date: 2022-06-16

    Application number: US17605783

    Application date: 2020-05-22

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining one or more neural network architectures of a neural network for performing a video processing neural network task. In one aspect, a method comprises: at each of a plurality of iterations: selecting a parent neural network architecture from a set of neural network architectures; training a neural network having the parent neural network architecture to perform the video processing neural network task, comprising determining trained values of connection weight parameters of the parent neural network architecture; generating a new neural network architecture based at least in part on the trained values of the connection weight parameters of the parent neural network architecture; and adding the new neural network architecture to the set of neural network architectures.
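
The iteration described above (select a parent, train it, derive a child from the trained connection weights, add the child to the set) can be sketched as a loop. This is an illustrative assumption-laden sketch: architectures are modeled as sets of `(source, target)` connections, training is a caller-supplied stub, and the mutation rule (drop the connection with the smallest trained weight) is hypothetical. The abstract does not say how the weights guide the new architecture.

```python
import random
from typing import Callable, Dict, FrozenSet, List, Tuple

Connection = Tuple[int, int]          # (source block, target block)
Architecture = FrozenSet[Connection]


def mutate_by_weight(parent: Architecture,
                     weights: Dict[Connection, float]) -> Architecture:
    """Hypothetical mutation: remove the weakest connection."""
    if len(parent) <= 1:
        return parent
    weakest = min(parent, key=lambda conn: weights[conn])
    return parent - {weakest}


def evolve(population: List[Architecture],
           train: Callable[[Architecture], Dict[Connection, float]],
           iterations: int,
           rng: random.Random) -> List[Architecture]:
    population = list(population)
    for _ in range(iterations):
        parent = rng.choice(population)            # select a parent
        weights = train(parent)                    # trained connection weights
        child = mutate_by_weight(parent, weights)  # weight-guided child
        population.append(child)                   # add to the set
    return population


# Stub "training" that assigns a made-up weight to each connection.
def fake_train(arch: Architecture) -> Dict[Connection, float]:
    return {conn: float(conn[0] + conn[1]) for conn in arch}


seed_arch = frozenset({(0, 1), (0, 2), (1, 2)})
result = evolve([seed_arch], fake_train, iterations=1, rng=random.Random(0))
print(result[-1])
```

In a real system `train` would fit a video model and return its learned connection weights; the stub only keeps the sketch self-contained.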

    Neural Architecture Search with Factorized Hierarchical Search Space

    Publication number: US20220101090A1

    Publication date: 2022-03-31

    Application number: US17495398

    Application date: 2021-10-06

    Applicant: Google LLC

    Abstract: The present disclosure is directed to an automated neural architecture search approach for designing new neural network architectures such as, for example, resource-constrained mobile CNN models. In particular, the present disclosure provides systems and methods to perform neural architecture search using a novel factorized hierarchical search space that permits layer diversity throughout the network, thereby striking the right balance between flexibility and search space size. The resulting neural architectures are able to be run relatively faster and using relatively fewer computing resources (e.g., less processing power, less memory usage, less power consumption, etc.), all while remaining competitive with or even exceeding the performance (e.g., accuracy) of current state-of-the-art mobile-optimized models.
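
One way to picture a factorized search space that permits layer diversity is to split the network into blocks and let each block choose its own configuration. The sketch below assumes that reading; the abstract does not list the per-block options, so the option names and values in `BLOCK_CHOICES`, along with the function names, are hypothetical.

```python
import random
from math import prod

# Hypothetical per-block options; not taken from the abstract.
BLOCK_CHOICES = {
    "op": ["conv", "depthwise_conv", "inverted_bottleneck"],
    "kernel_size": [3, 5],
    "num_layers": [1, 2, 3],
}


def choices_per_block(choices):
    """Number of distinct configurations a single block can take."""
    return prod(len(options) for options in choices.values())


def search_space_size(choices, num_blocks):
    """Total size when every block picks its configuration independently."""
    return choices_per_block(choices) ** num_blocks


def sample_architecture(choices, num_blocks, rng):
    """Draw one candidate: an independent configuration for each block."""
    return [
        {name: rng.choice(options) for name, options in choices.items()}
        for _ in range(num_blocks)
    ]


rng = random.Random(0)
print(search_space_size(BLOCK_CHOICES, num_blocks=5))
print(sample_architecture(BLOCK_CHOICES, num_blocks=5, rng=rng))
```

Because layers within a block share one configuration, the space grows with the number of blocks rather than the number of layers, which is one plausible way to trade flexibility against search space size.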

    COMPOUND MODEL SCALING FOR NEURAL NETWORKS

    Publication number: US20210133578A1

    Publication date: 2021-05-06

    Application number: US17144450

    Application date: 2021-01-08

    Applicant: Google LLC

    Abstract: A method for determining a final architecture for a neural network to perform a particular machine learning task is described. The method includes receiving a baseline architecture for the neural network, wherein the baseline architecture has a network width dimension, a network depth dimension, and a resolution dimension; receiving data defining a compound coefficient that controls extra computational resources used for scaling the baseline architecture; performing a search to determine a baseline width, depth and resolution coefficient that specify how to assign the extra computational resources to the network width, depth and resolution dimensions of the baseline architecture, respectively; determining a width, depth and resolution coefficient based on the baseline width, depth, and resolution coefficient and the compound coefficient; and generating the final architecture that scales the network width, network depth, and resolution dimensions of the baseline architecture based on the corresponding width, depth, and resolution coefficients.

    Scale-Permuted Machine Learning Architecture

    Publication number: US20240378509A1

    Publication date: 2024-11-14

    Application number: US18784068

    Application date: 2024-07-25

    Applicant: Google LLC

    Abstract: A computer-implemented method of generating scale-permuted models can generate models having improved accuracy and reduced evaluation computational requirements. The method can include defining, by a computing system including one or more computing devices, a search space including a plurality of candidate permutations of a plurality of candidate feature blocks, each of the plurality of candidate feature blocks having a respective scale. The method can include performing, by the computing system, a plurality of search iterations by a search algorithm to select a scale-permuted model from the search space, the scale-permuted model based at least in part on a candidate permutation of the plurality of candidate permutations.
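
The search over candidate permutations of scaled feature blocks can be illustrated with a small sketch. It assumes each block is identified only by its scale, enumerates the permutations exhaustively, and uses plain random search with a caller-supplied scoring function; the abstract does not specify the search algorithm, so these choices and all names here are hypothetical.

```python
import itertools
import random


def candidate_permutations(block_scales):
    """All distinct orderings of the blocks, each given by its scale."""
    return sorted(set(itertools.permutations(block_scales)))


def random_search(block_scales, score, iterations, rng):
    """Assumed search algorithm: keep the best of `iterations` random draws."""
    candidates = candidate_permutations(block_scales)
    best, best_score = None, float("-inf")
    for _ in range(iterations):
        candidate = rng.choice(candidates)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best


# Made-up scoring function standing in for model evaluation.
def toy_score(permutation):
    return -sum(abs(a - b) for a, b in zip(permutation, permutation[1:]))


best = random_search([1, 2, 3, 4], toy_score, iterations=20,
                     rng=random.Random(0))
print(best)
```

In practice the score would come from training and evaluating a model built from the permuted blocks, which is far more expensive than the toy function used here.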

    Scale-permuted machine learning architecture

    Publication number: US12079695B2

    Publication date: 2024-09-03

    Application number: US17061355

    Application date: 2020-10-01

    Applicant: Google LLC

    CPC classification number: G06N20/00 G06F11/3495 G06N3/04

    Abstract: A computer-implemented method of generating scale-permuted models can generate models having improved accuracy and reduced evaluation computational requirements. The method can include defining, by a computing system including one or more computing devices, a search space including a plurality of candidate permutations of a plurality of candidate feature blocks, each of the plurality of candidate feature blocks having a respective scale. The method can include performing, by the computing system, a plurality of search iterations by a search algorithm to select a scale-permuted model from the search space, the scale-permuted model based at least in part on a candidate permutation of the plurality of candidate permutations.

    Systems and Methods for Progressive Learning for Machine-Learned Models to Optimize Training Speed

    Publication number: US20230017808A1

    Publication date: 2023-01-19

    Application number: US17943880

    Application date: 2022-09-13

    Applicant: Google LLC

    Abstract: Systems and methods of the present disclosure can include a computer-implemented method for efficient machine-learned model training. The method can include obtaining a plurality of training samples for a machine-learned model. The method can include, for one or more first training iterations, training, based at least in part on a first regularization magnitude configured to control a relative effect of one or more regularization techniques, the machine-learned model using one or more respective first training samples of the plurality of training samples. The method can include, for one or more second training iterations, training, based at least in part on a second regularization magnitude greater than the first regularization magnitude, the machine-learned model using one or more respective second training samples of the plurality of training samples.
