-
1.
Publication No.: US20230315532A1
Publication date: 2023-10-05
Application No.: US18127551
Filing date: 2023-03-28
Applicant: DeepMind Technologies Limited
Inventor: Jordan Hoffmann, Sebastian Borgeaud Dit Avocat, Laurent Sifre, Arthur Mensch
IPC: G06F9/50
CPC classification number: G06F9/505, G06F9/5016, G06F9/5044, G06F2209/501, G06F2209/5022, G06F2209/506
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a machine learning model to perform a machine learning task. In one aspect, a method performed by one or more computers is described. The method includes: obtaining data defining a compute budget that characterizes an amount of computing resources allocated for training a machine learning model to perform a machine learning task; processing the data defining the compute budget using an allocation mapping, in accordance with a set of allocation mapping parameters, to generate an allocation tuple defining: (i) a target model size for the machine learning model, and (ii) a target amount of training data for training the machine learning model; instantiating the machine learning model, where the machine learning model has the target model size; and obtaining the target amount of training data for training the machine learning model.
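Illustrative sketch (not part of the patent record): the abstract above describes an allocation mapping that turns a compute budget into an allocation tuple of target model size and target amount of training data. The abstract does not state the functional form of the mapping, so the power-law form below, the `AllocationParams` fields, and the `allocate` function are all hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllocationParams:
    """Hypothetical allocation mapping parameters (not taken from the patent)."""
    size_coeff: float      # scales the target model size
    size_exponent: float   # how fast model size grows with compute
    data_coeff: float      # scales the target amount of training data
    data_exponent: float   # how fast training data grows with compute


def allocate(compute_budget: float, params: AllocationParams) -> tuple[int, int]:
    """Map a compute budget to (target model size, target amount of training data).

    Assumes a power-law mapping; the abstract only says that some
    parameterised allocation mapping is applied to the budget.
    """
    if compute_budget <= 0:
        raise ValueError("compute budget must be positive")
    target_size = params.size_coeff * compute_budget ** params.size_exponent
    target_data = params.data_coeff * compute_budget ** params.data_exponent
    return round(target_size), round(target_data)


if __name__ == "__main__":
    # Made-up parameter values, for demonstration only.
    params = AllocationParams(size_coeff=1.0, size_exponent=0.5,
                              data_coeff=1.0, data_exponent=0.5)
    size, data = allocate(10_000.0, params)
    print(f"target model size: {size}, target training data: {data}")
```

In the method the abstract describes, the returned tuple would then drive the remaining two steps: instantiating a model of the target size and obtaining the target amount of training data.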
-
Publication No.: US20230177334A1
Publication date: 2023-06-08
Application No.: US18076984
Filing date: 2022-12-07
Applicant: DeepMind Technologies Limited
Inventor: Sebastian Borgeaud Dit Avocat, Laurent Sifre, Arthur Mensch, Jordan Hoffmann
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a final output sequence. In one aspect, a method comprises: receiving a current output sequence comprising one or more current output segments; receiving a set of reference segments and a respective reference segment embedding of each reference segment that has been generated using an embedding neural network; for each current output segment: processing the current output segment using the embedding neural network to generate a current output segment embedding of the current output segment; and selecting k most similar reference segments to the current output segment using the reference segment embeddings and the current output segment embedding; and processing the current output sequence and the k most similar reference segments for each current output segment to generate an additional output segment that follows the current output sequence in the final output sequence.
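Illustrative sketch (not part of the patent record): the selection step in the abstract above picks the k reference segments most similar to a current output segment by comparing embeddings. The abstract does not name a similarity measure, so cosine similarity is an assumption here, and the embeddings are taken as given vectors rather than produced by an embedding neural network. The function names `cosine_similarity` and `top_k_references` are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_references(
    segment_embedding: Sequence[float],
    reference_embeddings: Sequence[Sequence[float]],
    k: int,
) -> list[int]:
    """Return indices of the k reference segments most similar to the segment.

    Assumption: similarity is cosine similarity between embeddings.
    """
    ranked = sorted(
        range(len(reference_embeddings)),
        key=lambda i: cosine_similarity(segment_embedding, reference_embeddings[i]),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    # Toy 2-dimensional embeddings, made up for demonstration only.
    references = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    current_segment = [1.0, 0.1]
    print(top_k_references(current_segment, references, k=2))
```

In the described method this lookup runs once per current output segment, and the retrieved segments are processed together with the current output sequence to generate the next output segment. A brute-force sort is used here for clarity; it is not a claim about how the patent performs the search.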
-