-
Publication Number: US20230259784A1
Publication Date: 2023-08-17
Application Number: US18140442
Filing Date: 2023-04-27
Applicant: Google LLC
Inventor: Yanping Huang , Alok Aggarwal , Quoc V. Le , Esteban Alberto Real
Abstract: A method for receiving training data for training a neural network (NN) to perform a machine learning (ML) task and for determining, using the training data, an optimized NN architecture for performing the ML task is described. Determining the optimized NN architecture includes: maintaining population data comprising, for each candidate architecture in a population of candidate architectures, (i) data defining the candidate architecture, and (ii) data specifying how recently a neural network having the candidate architecture has been trained while determining the optimized neural network architecture; and repeatedly performing multiple operations using each of a plurality of worker computing units to generate a new candidate architecture based on a selected candidate architecture having the best measure of fitness, adding the new candidate architecture to the population, and removing from the population the candidate architecture that was trained least recently.
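This abstract describes an aging-evolution style search loop: sample candidates, tournament-select the one with the best fitness, mutate it into a new candidate, add the child to the population, and evict the least recently trained member. Below is a minimal Python sketch of that loop, assuming a toy list-of-integers architecture encoding and a placeholder train_and_evaluate() in place of actually training a neural network on the task's training data:

```python
import random
from collections import deque

# Hypothetical stand-ins: an architecture is a list of ints, and fitness
# is a toy score instead of validation accuracy from real training.
ARCH_LENGTH, NUM_CHOICES = 8, 4

def train_and_evaluate(arch):
    # Placeholder for training a network with this architecture and
    # returning its measure of fitness on the ML task.
    return sum(arch) / (ARCH_LENGTH * (NUM_CHOICES - 1))

def mutate(arch):
    # Generate a new candidate by perturbing one position of the parent.
    child = list(arch)
    child[random.randrange(ARCH_LENGTH)] = random.randrange(NUM_CHOICES)
    return child

def evolve(population_size=20, sample_size=5, cycles=200):
    # The population is age-ordered: the left end holds the candidate
    # trained least recently, which is the one removed each cycle.
    population = deque()
    for _ in range(population_size):
        arch = [random.randrange(NUM_CHOICES) for _ in range(ARCH_LENGTH)]
        population.append((arch, train_and_evaluate(arch)))
    best = max(population, key=lambda p: p[1])
    for _ in range(cycles):
        sample = random.sample(list(population), sample_size)
        parent = max(sample, key=lambda p: p[1])  # best fitness in sample
        child = mutate(parent[0])
        entry = (child, train_and_evaluate(child))
        population.append(entry)                  # add the new candidate
        population.popleft()                      # evict the oldest one
        best = max(best, entry, key=lambda p: p[1])
    return best

if __name__ == "__main__":
    arch, fitness = evolve()
    print(f"best architecture {arch} with fitness {fitness:.3f}")
```

In a distributed setting, each worker computing unit would run the body of the loop independently against shared population data; the deque here stands in for that shared, age-ordered state.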
-
Publication Number: US11669744B2
Publication Date: 2023-06-06
Application Number: US17475137
Filing Date: 2021-09-14
Applicant: Google LLC
Inventor: Yanping Huang , Alok Aggarwal , Quoc V. Le , Esteban Alberto Real
Abstract: A method for receiving training data for training a neural network (NN) to perform a machine learning (ML) task and for determining, using the training data, an optimized NN architecture for performing the ML task is described. Determining the optimized NN architecture includes: maintaining population data comprising, for each candidate architecture in a population of candidate architectures, (i) data defining the candidate architecture, and (ii) data specifying how recently a neural network having the candidate architecture has been trained while determining the optimized neural network architecture; and repeatedly performing multiple operations using each of a plurality of worker computing units to generate a new candidate architecture based on a selected candidate architecture having the best measure of fitness, adding the new candidate architecture to the population, and removing from the population the candidate architecture that was trained least recently.
-
Publication Number: US20220004879A1
Publication Date: 2022-01-06
Application Number: US17475137
Filing Date: 2021-09-14
Applicant: Google LLC
Inventor: Yanping Huang , Alok Aggarwal , Quoc V. Le , Esteban Alberto Real
Abstract: A method for receiving training data for training a neural network (NN) to perform a machine learning (ML) task and for determining, using the training data, an optimized NN architecture for performing the ML task is described. Determining the optimized NN architecture includes: maintaining population data comprising, for each candidate architecture in a population of candidate architectures, (i) data defining the candidate architecture, and (ii) data specifying how recently a neural network having the candidate architecture has been trained while determining the optimized neural network architecture; and repeatedly performing multiple operations using each of a plurality of worker computing units to generate a new candidate architecture based on a selected candidate architecture having the best measure of fitness, adding the new candidate architecture to the population, and removing from the population the candidate architecture that was trained least recently.
-
Publication Number: US20210042620A1
Publication Date: 2021-02-11
Application Number: US16989787
Filing Date: 2020-08-10
Applicant: Google LLC
Inventor: Zhifeng Chen , Yanping Huang , Youlong Cheng , HyoukJoong Lee , Dehao Chen , Jiquan Ngiam
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training giant neural networks. One of the methods includes obtaining data specifying a partitioning of the neural network into N composite layers that form a sequence of composite layers, wherein each composite layer comprises a distinct plurality of layers from the multiple network layers of the neural network; obtaining data assigning each of the N composite layers to one or more computing devices from a set of N computing devices; partitioning a mini-batch of training examples into a plurality of micro-batches; and training the neural network, comprising: performing a forward pass through the neural network until output activations have been computed for each micro-batch for a final composite layer in the sequence, and performing a backward pass through the neural network until output gradients have been computed for each micro-batch for the first composite layer in the sequence.
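As context for the micro-batch schedule this abstract describes, here is a minimal NumPy sketch, assuming each composite layer is a single affine transform and omitting the per-device placement and any overlap of micro-batches across devices: the forward pass computes activations for every micro-batch up to the final composite layer, then the backward pass carries gradients for every micro-batch back to the first.

```python
import numpy as np

rng = np.random.default_rng(0)
N_LAYERS, DIM, MICRO_BATCHES = 4, 8, 3

# Toy stand-in: each "composite layer" is one weight matrix.
weights = [rng.normal(size=(DIM, DIM)) * 0.1 for _ in range(N_LAYERS)]
mini_batch = rng.normal(size=(12, DIM))
targets = rng.normal(size=(12, DIM))

# Partition the mini-batch into micro-batches.
micro = np.array_split(mini_batch, MICRO_BATCHES)
tmicro = np.array_split(targets, MICRO_BATCHES)

# Forward pass: activations for each micro-batch at each composite
# layer, until the final layer's outputs exist for every micro-batch.
acts = [[m] for m in micro]            # acts[b][l] = input to layer l
for l in range(N_LAYERS):
    for b in range(MICRO_BATCHES):
        acts[b].append(acts[b][l] @ weights[l])

# Backward pass: output gradients for each micro-batch, propagated
# back to the first composite layer, accumulating weight gradients.
grads = [np.zeros_like(w) for w in weights]
for b in range(MICRO_BATCHES):
    g = 2 * (acts[b][-1] - tmicro[b]) / mini_batch.shape[0]  # MSE grad
    for l in reversed(range(N_LAYERS)):
        grads[l] += acts[b][l].T @ g   # dL/dW_l
        g = g @ weights[l].T           # dL/d(input of layer l)

for l, w in enumerate(weights):        # one synchronous update
    w -= 0.01 * grads[l]
out = mini_batch @ np.linalg.multi_dot(weights)
print("loss after one update:", float(np.mean((out - targets) ** 2)))
```

Because gradients are accumulated over all micro-batches before the update, the result matches training on the whole mini-batch at once; the split is what lets a pipelined implementation keep every device busy.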
-
Publication Number: US20240378427A1
Publication Date: 2024-11-14
Application Number: US18661499
Filing Date: 2024-05-10
Applicant: Google LLC
Inventor: Slav Petrov , Yonghui Wu , Andrew M. Dai , David Richard So , Dmitry Lepikhin , Erica Ann Moreira , Gaurav Mishra , Jonathan Hudson Clark , Maxim Krikun , Melvin Jose Johnson Premkumar , Nan Du , Orhan Firat , Rohan Anil , Siamak Shakeri , Xavier Garcia , Yanping Huang , Yong Cheng , Yuanzhong Xu , Yujing Zhang , Zachary Alexander Nado , Eric Jun Jie Ni , Kefan Xiao , Vladimir Feinberg , Jin Young Sohn , Aurko Roy
IPC: G06N3/0475 , G06F40/284
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network to perform any one or more of a variety of machine learning tasks. For example, the neural network can be configured as a generative neural network, e.g., an autoregressive generative neural network.
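As a small illustration of the autoregressive generative setup the abstract mentions, here is a sketch of next-token-prediction training, assuming a toy one-layer embedding-softmax model and a random token stream rather than the patent's network or data:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM, LR = 16, 8, 0.5
tokens = rng.integers(0, VOCAB, size=256)   # toy token stream

E = rng.normal(size=(VOCAB, DIM)) * 0.1     # input embeddings
W = rng.normal(size=(DIM, VOCAB)) * 0.1     # output projection

for step in range(201):
    x, y = tokens[:-1], tokens[1:]          # predict token t+1 from token t
    logits = E[x] @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    loss = -np.log(p[np.arange(len(y)), y]).mean()
    g = p.copy()
    g[np.arange(len(y)), y] -= 1            # d(loss)/d(logits)
    g /= len(y)
    dW, dE = E[x].T @ g, g @ W.T
    np.add.at(E, x, -LR * dE)               # scatter-add into embeddings
    W -= LR * dW
    if step % 50 == 0:
        print(f"step {step}: next-token loss {loss:.3f}")
```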
-
Publication Number: US20240311402A1
Publication Date: 2024-09-19
Application Number: US18136634
Filing Date: 2023-04-19
Applicant: Google LLC
Inventor: Martin Baeuml , Yanping Huang , Wenhao Jia , Chang Lan , Yuanzhong Xu , Junwhan Ahn , Alexander Bailey , Leif Schelin , Trevor Strohman , Emanuel Taropa , Sidharth Mudgal , Yanyan Zheng , Zhifeng Chen , Ahmad Beirami
IPC: G06F16/332 , G06F40/40
CPC classification number: G06F16/3322 , G06F16/3329 , G06F40/40
Abstract: Implementations relate to reducing latency in generating and/or rendering natural language (NL) output generated using a large language model (LLM). Processor(s) of a system can: receive NL based input associated with a client device, and generate the NL based output utilizing the LLM. The NL based output can be a stream of NL based output in that it includes a plurality of segments, and is generated on a segment-by-segment basis. In some implementations, a first segment of the stream of NL based output is selected for inclusion in the stream of NL based output as a second segment (and any subsequent segment) is being generated to reduce latency in evaluating the NL based output as a whole prior to rendering thereof. In some versions of those implementations, the first segment is rendered as the second segment (and any subsequent segment) is being generated to further reduce latency in rendering thereof.
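Here is a minimal sketch of the segment pipelining the abstract describes, assuming stub generate_segment() and evaluate() functions in place of the LLM call and the per-segment check: segment i+1 is generated on a worker thread while segment i is evaluated and rendered, so rendering never waits for the whole response.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SEGMENTS = ["First sentence of the reply.",
            "Second sentence.",
            "Final sentence."]

def generate_segment(i):
    time.sleep(0.2)                  # stand-in for LLM decode latency
    return SEGMENTS[i]

def evaluate(segment):
    return bool(segment.strip())     # stand-in for a per-segment check

def render(segment):
    print("render:", segment)

with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(generate_segment, 0)
    for i in range(len(SEGMENTS)):
        segment = future.result()
        if i + 1 < len(SEGMENTS):    # start generating the next segment now
            future = pool.submit(generate_segment, i + 1)
        if evaluate(segment):        # evaluate/render while it generates
            render(segment)
```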
-
Publication Number: US20220237435A1
Publication Date: 2022-07-28
Application Number: US17159437
Filing Date: 2021-01-27
Applicant: Google LLC
Inventor: Yanping Huang , Dmitry Lepikhin , Maxim Krikun , Orhan Firat , Ankur Bapna , Thang Luong , Sneha Kudugunta
Abstract: Systems and methods for routing in mixture-of-expert models. In some aspects of the technology, a transformer may have at least one Mixture-of-Experts (“MoE”) layer in each of its encoder and decoder, with the at least one MoE layer of the encoder having a learned gating function configured to route each token of a task to two or more selected expert feed-forward networks, and the at least one MoE layer of the decoder having a learned gating function configured to route each task to two or more selected expert feed-forward networks.
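A minimal NumPy sketch of the learned top-2 gating this abstract describes, assuming untrained random gate weights and simple ReLU feed-forward experts, and omitting the cross-device dispatch and combine of a real MoE layer:

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_EXPERTS, D_MODEL, D_FF, TOKENS = 4, 8, 16, 5

gate_w = rng.normal(size=(D_MODEL, NUM_EXPERTS)) * 0.1
experts = [(rng.normal(size=(D_MODEL, D_FF)) * 0.1,   # expert FFN weights
            rng.normal(size=(D_FF, D_MODEL)) * 0.1)
           for _ in range(NUM_EXPERTS)]
x = rng.normal(size=(TOKENS, D_MODEL))                # one token per row

# Learned gating function: softmax over expert logits per token.
logits = x @ gate_w
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)

out = np.zeros_like(x)
for t in range(TOKENS):
    top2 = np.argsort(probs[t])[-2:]                  # two selected experts
    mix = probs[t, top2] / probs[t, top2].sum()       # renormalized weights
    for e, w in zip(top2, mix):
        w_in, w_out = experts[e]
        out[t] += w * (np.maximum(x[t] @ w_in, 0.0) @ w_out)  # ReLU FFN
print("MoE layer output shape:", out.shape)
```

Routing each token to two experts rather than one gives the gate a gradient signal for comparing experts while keeping per-token compute well below running all experts.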
-
Publication Number: US11232356B2
Publication Date: 2022-01-25
Application Number: US16989787
Filing Date: 2020-08-10
Applicant: Google LLC
Inventor: Zhifeng Chen , Yanping Huang , Youlong Cheng , HyoukJoong Lee , Dehao Chen , Jiquan Ngiam
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training giant neural networks. One of the methods includes obtaining data specifying a partitioning of the neural network into N composite layers that form a sequence of composite layers, wherein each composite layer comprises a distinct plurality of layers from the multiple network layers of the neural network; obtaining data assigning each of the N composite layers to one or more computing devices from a set of N computing devices; partitioning a mini-batch of training examples into a plurality of micro-batches; and training the neural network, comprising: performing a forward pass through the neural network until output activations have been computed for each micro-batch for a final composite layer in the sequence, and performing a backward pass through the neural network until output gradients have been computed for each micro-batch for the first composite layer in the sequence.