-
1.
Publication No.: US20230315532A1
Publication date: 2023-10-05
Application No.: US18127551
Filing date: 2023-03-28
Applicant: DeepMind Technologies Limited
Inventor: Jordan Hoffmann, Sebastian Borgeaud Dit Avocat, Laurent Sifre, Arthur Mensch
IPC: G06F9/50
CPC classification number: G06F9/505, G06F9/5016, G06F9/5044, G06F2209/501, G06F2209/5022, G06F2209/506
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a machine learning model to perform a machine learning task. In one aspect, a method performed by one or more computers is described. The method includes: obtaining data defining a compute budget that characterizes an amount of computing resources allocated for training a machine learning model to perform a machine learning task; processing the data defining the compute budget using an allocation mapping, in accordance with a set of allocation mapping parameters, to generate an allocation tuple defining: (i) a target model size for the machine learning model, and (ii) a target amount of training data for training the machine learning model; instantiating the machine learning model, where the machine learning model has the target model size; and obtaining the target amount of training data for training the machine learning model.
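Illustration (not part of the patent record): the step from compute budget to allocation tuple can be sketched in a few lines of Python. The sketch assumes the allocation mapping is a pair of power laws in the compute budget; the abstract does not state the functional form. The names `AllocationParams`, `Allocation`, and `allocate`, and all coefficient values, are hypothetical placeholders rather than values from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllocationParams:
    """Hypothetical allocation mapping parameters (illustrative values only)."""
    size_coeff: float = 0.1   # scales the target model size
    size_exp: float = 0.5     # exponent on the compute budget for model size
    data_coeff: float = 1.0   # scales the target amount of training data
    data_exp: float = 0.5     # exponent on the compute budget for training data


@dataclass(frozen=True)
class Allocation:
    """Allocation tuple: (i) target model size, (ii) target amount of training data."""
    model_size: int      # e.g. number of parameters
    training_data: int   # e.g. number of training tokens


def allocate(compute_budget: float,
             params: AllocationParams = AllocationParams()) -> Allocation:
    """Map a compute budget to an allocation tuple using assumed power laws."""
    if compute_budget <= 0:
        raise ValueError("compute_budget must be positive")
    model_size = params.size_coeff * compute_budget ** params.size_exp
    training_data = params.data_coeff * compute_budget ** params.data_exp
    return Allocation(round(model_size), round(training_data))
```

Under these assumed parameters, a larger budget raises both targets together; a real mapping would take its parameters from fitted training runs.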
-
2.
Publication No.: US20230244907A1
Publication date: 2023-08-03
Application No.: US18102985
Filing date: 2023-01-30
Applicant: DeepMind Technologies Limited
Inventor: Curtis Glenn-Macway Hawthorne, Andrew Coulter Jaegle, Catalina-Codruta Cangea, Sebastian Borgeaud Dit Avocat, Charlie Thomas Curtis Nash, Mateusz Malinowski, Sander Etienne Lea Dieleman, Oriol Vinyals, Matthew Botvinick, Ian Stuart Simon, Hannah Rachel Sheahan, Neil Zeghidour, Jean-Baptiste Alayrac, Joao Carreira, Jesse Engel
IPC: G06N3/044
CPC classification number: G06N3/044
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a sequence of data elements that includes a respective data element at each position in a sequence of positions. In one aspect, a method includes: for each position after a first position in the sequence of positions: obtaining a current sequence of data element embeddings that includes a respective data element embedding of each data element at a position that precedes the current position, obtaining a sequence of latent embeddings, and processing: (i) the current sequence of data element embeddings, and (ii) the sequence of latent embeddings, using a neural network to generate the data element at the current position. The neural network includes a sequence of neural network blocks including: (i) a cross-attention block, (ii) one or more self-attention blocks, and (iii) an output block.
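Illustration (not part of the patent record): the per-position loop in the abstract can be sketched in plain Python. This is a toy under strong assumptions: attention is single-head with no learned projections and no causal masking, the latent embeddings are taken from the most recent data element embeddings, and the output block scores the last latent against a fixed embedding table. None of these choices is stated in the abstract, and the names `attend`, `generate`, and `embedding_table` are hypothetical.

```python
import math


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys_values):
    """Single-head dot-product attention without learned projections."""
    dim = len(keys_values[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, kv)) / math.sqrt(dim)
                  for kv in keys_values]
        weights = softmax(scores)
        outputs.append([sum(w * kv[j] for w, kv in zip(weights, keys_values))
                        for j in range(dim)])
    return outputs


def generate(first_element, length, embedding_table, num_latents=2):
    """Generate `length` data elements, one position at a time."""
    sequence = [first_element]
    while len(sequence) < length:
        # Embeddings of every data element that precedes the current position.
        embeddings = [embedding_table[e] for e in sequence]
        # Assumption: latents start from the most recent embeddings.
        latents = embeddings[-num_latents:]
        latents = attend(latents, embeddings)   # (i) cross-attention block
        latents = attend(latents, latents)      # (ii) one self-attention block
        # (iii) output block: score the last latent against each table row.
        logits = [sum(a * b for a, b in zip(latents[-1], row))
                  for row in embedding_table]
        sequence.append(max(range(len(logits)), key=logits.__getitem__))
    return sequence
```

The point of the sketch is the ordering of the blocks: the cost of the self-attention step depends on the number of latents, not on the length of the preceding sequence.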
-
3.
Publication No.: US20240232580A1
Publication date: 2024-07-11
Application No.: US18284595
Filing date: 2022-05-27
Applicant: DeepMind Technologies Limited
Inventor: Andrew Coulter Jaegle, Jean-Baptiste Alayrac, Sebastian Borgeaud Dit Avocat, Catalin-Dumitru Ionescu, Carl Doersch, Fengning Ding, Oriol Vinyals, Olivier Jean Hénaff, Skanda Kumar Koppula, Daniel Zoran, Andrew Brock, Evan Gerard Shelhamer, Andrew Zisserman, Joao Carreira
IPC: G06N3/0455
CPC classification number: G06N3/0455
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a network output using a neural network. In one aspect, a method comprises: obtaining: (i) a network input to a neural network, and (ii) a set of query embeddings; processing the network input using the neural network to generate a network output that comprises a respective dimension corresponding to each query embedding in the set of query embeddings, comprising: processing the network input using an encoder block of the neural network to generate a representation of the network input as a set of latent embeddings; and processing: (i) the set of latent embeddings, and (ii) the set of query embeddings, using a cross-attention block that generates each dimension of the network output by cross-attention of a corresponding query embedding over the set of latent embeddings.
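Illustration (not part of the patent record): the query-driven output step can be sketched in plain Python. The sketch assumes the encoder block is itself a cross-attention from a fixed latent array onto the input, and it uses single-head attention without learned projections; the abstract specifies neither. The names `attend`, `network_output`, and `latent_init` are hypothetical.

```python
import math


def attend(queries, keys_values):
    """Single-head dot-product attention without learned projections."""
    dim = len(keys_values[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, kv)) / math.sqrt(dim)
                  for kv in keys_values]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        outputs.append([sum((e / total) * kv[j] for e, kv in zip(exps, keys_values))
                        for j in range(dim)])
    return outputs


def network_output(network_input, latent_init, query_embeddings):
    """Return one output row per query embedding."""
    # Encoder block (assumed form): latents attend over the network input.
    latents = attend(latent_init, network_input)
    # Cross-attention block: each query embedding attends over the latents.
    return attend(query_embeddings, latents)
```

Because each output row comes from one query embedding, the size and layout of the output are set by the query set rather than by the input or the latents.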
-
4.
Publication No.: US20230177334A1
Publication date: 2023-06-08
Application No.: US18076984
Filing date: 2022-12-07
Applicant: DeepMind Technologies Limited
Inventor: Sebastian Borgeaud Dit Avocat, Laurent Sifre, Arthur Mensch, Jordan Hoffmann
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a final output sequence. In one aspect, a method comprises: receiving a current output sequence comprising one or more current output segments; receiving a set of reference segments and a respective reference segment embedding of each reference segment that has been generated using an embedding neural network; for each current output segment: processing the current output segment using the embedding neural network to generate a current output segment embedding of the current output segment; and selecting k most similar reference segments to the current output segment using the reference segment embeddings and the current output segment embedding; and processing the current output sequence and the k most similar reference segments for each current output segment to generate an additional output segment that follows the current output sequence in the final output sequence.
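Illustration (not part of the patent record): the selection of the k most similar reference segments can be sketched in plain Python. The sketch assumes cosine similarity between embeddings and an exhaustive search; the abstract only says "most similar" and does not name a metric or a search method. The function names are hypothetical, and embeddings are assumed to be non-zero vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def k_most_similar(segment_embedding, reference_embeddings, k):
    """Return indices of the k reference segments most similar to the segment."""
    ranked = sorted(
        range(len(reference_embeddings)),
        key=lambda i: cosine_similarity(segment_embedding, reference_embeddings[i]),
        reverse=True,
    )
    return ranked[:k]


# Usage: pick the 2 nearest of 4 precomputed reference segment embeddings.
reference_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
neighbours = k_most_similar([1.0, 0.1], reference_embeddings, k=2)
```

In the abstract the reference segment embeddings are computed ahead of time by the same embedding neural network, so only the current output segment needs to be embedded at generation time.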