Patent search ap:("DeepMind Technologies Limited") AND inv:"Arthur Mensch" Page 1

1.

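Invention publication
ALLOCATING COMPUTING RESOURCES BETWEEN MODEL SIZE AND TRAINING DATA DURING TRAINING OF A MACHINE LEARNING MODEL

Status: Under examination (published)

Publication (announcement) number: US20230315532A1

Publication (announcement) date: 2023-10-05

Application number: US18127551

Application date: 2023-03-28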

Applicant: DeepMind Technologies Limited

Inventor: Jordan Hoffmann, Sebastian Borgeaud Dit Avocat, Laurent Sifre, Arthur Mensch

IPC: G06F9/50

CPC classification number: G06F9/505, G06F9/5016, G06F9/5044, G06F2209/501, G06F2209/5022, G06F2209/506

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a machine learning model to perform a machine learning task. In one aspect, a method performed by one or more computers is described. The method includes: obtaining data defining a compute budget that characterizes an amount of computing resources allocated for training a machine learning model to perform a machine learning task; processing the data defining the compute budget using an allocation mapping, in accordance with a set of allocation mapping parameters, to generate an allocation tuple defining: (i) a target model size for the machine learning model, and (ii) a target amount of training data for training the machine learning model; instantiating the machine learning model, where the machine learning model has the target model size; and obtaining the target amount of training data for training the machine learning model.
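
Illustrative note: the abstract does not state the form of the allocation mapping. The sketch below assumes a power-law mapping from compute budget to model size and the common approximation that training cost is about 6 FLOPs per parameter per token. The names `AllocationParams` and `allocate` and all numeric values are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AllocationParams:
    """Hypothetical allocation mapping parameters (illustrative values only)."""
    coeff: float = 0.1                   # assumed scale of the power law
    exponent: float = 0.5                # assumed exponent of the power law
    flops_per_param_token: float = 6.0   # assumed cost model: C ~ 6 * N * D


def allocate(compute_budget_flops: float,
             params: AllocationParams = AllocationParams()) -> Tuple[int, int]:
    """Map a compute budget to an allocation tuple (model size, training tokens)."""
    if compute_budget_flops <= 0:
        raise ValueError("compute budget must be positive")
    # (i) target model size from the assumed power law N = coeff * C ** exponent
    model_size = max(1, round(params.coeff * compute_budget_flops ** params.exponent))
    # (ii) target amount of training data: spend the remaining budget on tokens
    tokens = max(1, round(compute_budget_flops
                          / (params.flops_per_param_token * model_size)))
    return model_size, tokens


if __name__ == "__main__":
    n_params, n_tokens = allocate(1e20)
    print(f"target model size: {n_params} parameters")
    print(f"target training data: {n_tokens} tokens")
```

Under these assumptions, the model would then be instantiated with `n_params` parameters and `n_tokens` tokens of training data would be collected, mirroring the last two steps of the abstract.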

2.

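Invention publication
LANGUAGE MODEL FOR PROCESSING A MULTI-MODE QUERY INPUT

Status: Under examination (published)

Publication (announcement) number: US20230350936A1

Publication (announcement) date: 2023-11-02

Application number: US18141337

Application date: 2023-04-28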

Applicant: DeepMind Technologies Limited

Inventor: Jean-Baptiste Alayrac, Jeffrey Donahue, Karel Lenc, Karen Simonyan, Malcolm Kevin Campbell Reynolds, Pauline Luc, Arthur Mensch, Iain Barr, Antoine Miech, Yana Elizabeth Hasson, Katherine Elizabeth Millican, Roman Ring

IPC: G06F16/432, G06F40/284, G06F16/438

CPC classification number: G06F16/432, G06F16/438, G06F40/284

Abstract: A query processing system is described which receives a query input comprising an input token string and also at least one data item having a second, different modality, and generates a corresponding output token string.

3.

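Invention publication
LARGE SCALE RETRIEVAL FOR SEQUENCE GENERATION

Status: Under examination (published)

Publication (announcement) number: US20230177334A1

Publication (announcement) date: 2023-06-08

Application number: US18076984

Application date: 2022-12-07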

Applicant: DeepMind Technologies Limited

Inventor: Sebastian Borgeaud Dit Avocat, Laurent Sifre, Arthur Mensch, Jordan Hoffmann

IPC: G06N3/08

CPC classification number: G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a final output sequence. In one aspect, a method comprises: receiving a current output sequence comprising one or more current output segments; receiving a set of reference segments and a respective reference segment embedding of each reference segment that has been generated using an embedding neural network; for each current output segment: processing the current output segment using the embedding neural network to generate a current output segment embedding of the current output segment; and selecting k most similar reference segments to the current output segment using the reference segment embeddings and the current output segment embedding; and processing the current output sequence and the k most similar reference segments for each current output segment to generate an additional output segment that follows the current output sequence in the final output sequence.
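
Illustrative note: the retrieval step in the abstract (embed each current output segment, then select the k most similar reference segments) can be sketched as below. The function `toy_embed` is a hypothetical stand-in for the embedding neural network, and cosine similarity is an assumed similarity measure; the abstract specifies neither. The final generation step that consumes the retrieved segments is omitted.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def toy_embed(segment: str, dim: int = 16) -> Vector:
    """Hypothetical stand-in for the embedding neural network.

    Counts letters into `dim` buckets and L2-normalizes the result.
    """
    vec = [0.0] * dim
    for ch in segment.lower():
        if ch.isalpha():
            vec[ord(ch) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def top_k_similar(query_emb: Vector,
                  reference_embs: Sequence[Vector],
                  k: int) -> List[int]:
    """Return indices of the k reference embeddings most similar to the query.

    Embeddings are unit-normalized, so the dot product is cosine similarity.
    """
    scores = [sum(q * r for q, r in zip(query_emb, emb)) for emb in reference_embs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def retrieve_for_sequence(current_segments: Sequence[str],
                          reference_segments: Sequence[str],
                          reference_embs: Sequence[Vector],
                          k: int,
                          embed: Callable[[str], Vector] = toy_embed) -> List[List[str]]:
    """For each current output segment, select its k most similar reference segments."""
    retrieved = []
    for segment in current_segments:
        indices = top_k_similar(embed(segment), reference_embs, k)
        retrieved.append([reference_segments[i] for i in indices])
    return retrieved


if __name__ == "__main__":
    references = ["aaaa", "bbbb", "cccc"]
    # Reference embeddings are computed once, ahead of generation.
    reference_embs = [toy_embed(r) for r in references]
    print(retrieve_for_sequence(["aaab"], references, reference_embs, k=2))
```

Precomputing `reference_embs` reflects the abstract's wording that the reference segment embeddings are received as already generated; only the current output segments are embedded at generation time.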

4.

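Invention publication
DISCRETE TOKEN PROCESSING USING DIFFUSION MODELS

Status: Under examination (published)

Publication (announcement) number: US20240119261A1

Publication (announcement) date: 2024-04-11

Application number: US18374447

Application date: 2023-09-28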

Applicant: DeepMind Technologies Limited

Inventor: Robin Strudel, Rémi Leblond, Laurent Sifre, Sander Etienne Lea Dieleman, Nikolay Savinov, Will S. Grathwohl, Corentin Tallec, Florent Altché, Iaroslav Ganin, Arthur Mensch, Yilin Du

IPC: G06N3/045

CPC classification number: G06N3/045

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an output sequence of discrete tokens using a diffusion model. In one aspect, a method includes generating, by using the diffusion model, a final latent representation of the sequence of discrete tokens that includes a determined value for each of a plurality of latent variables; applying a de-embedding matrix to the final latent representation of the output sequence of discrete tokens to generate a de-embedded final latent representation that includes, for each of the plurality of latent variables, a respective numeric score for each discrete token in a vocabulary of multiple discrete tokens; selecting, for each of the plurality of latent variables, a discrete token from among the multiple discrete tokens in the vocabulary that has a highest numeric score; and generating the output sequence of discrete tokens that includes the selected discrete tokens.
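
Illustrative note: the decoding steps in the abstract (apply a de-embedding matrix to the final latent representation, then pick the highest-scoring token for each latent variable) can be sketched as below. The diffusion model that produces the final latent is not shown; the latent values, the matrix, and the vocabulary are hypothetical toy data, and the function names are not from the patent.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def de_embed(final_latent: Matrix, de_embedding: Matrix) -> List[List[float]]:
    """Compute one numeric score per vocabulary token for each latent variable.

    final_latent has shape [num_latents][dim]; de_embedding has shape [dim][vocab].
    """
    vocab_size = len(de_embedding[0])
    return [
        [sum(z[d] * de_embedding[d][v] for d in range(len(z)))
         for v in range(vocab_size)]
        for z in final_latent
    ]


def decode_tokens(final_latent: Matrix,
                  de_embedding: Matrix,
                  vocab: Sequence[str]) -> List[str]:
    """Select, for each latent variable, the token with the highest score."""
    scores = de_embed(final_latent, de_embedding)
    return [vocab[max(range(len(row)), key=row.__getitem__)] for row in scores]


if __name__ == "__main__":
    vocab = ["a", "b", "c"]                      # toy vocabulary
    de_embedding = [[1.0, 0.0, -1.0],            # toy [dim=2][vocab=3] matrix
                    [0.0, 1.0, -1.0]]
    final_latent = [[0.9, 0.1], [0.2, 0.8], [-1.0, -1.0]]
    print(decode_tokens(final_latent, de_embedding, vocab))
```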

5.

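Invention publication
TRAINING CONDITIONAL COMPUTATION NEURAL NETWORKS USING REINFORCEMENT LEARNING

Status: Under examination (published)

Publication (announcement) number: US20230177309A1

Publication (announcement) date: 2023-06-08

Application number: US18076978

Application date: 2022-12-07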

Applicant: DeepMind Technologies Limited

Inventor: Aidan Clark, Arthur Mensch

IPC: G06N3/04

CPC classification number: G06N3/0427

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network having one or more conditional computation layers, where each conditional computation layer includes a gating sub-layer having multiple gating parameters and an expert sub-layer having multiple expert neural networks. In one aspect, a method comprises: sampling a batch of target output sequences that comprises a respective ground truth output token at each of multiple output positions; for each target output sequence, processing the target output sequence using the neural network to generate a network output that includes respective score distributions over the vocabulary of output tokens for the output positions in the target output sequence; and training each gating sub-layer using respective rewards for the gating sub-layer for the output positions through reinforcement learning to optimize a reinforcement learning objective function that measures an expected reward received by the gating sub-layer.
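
Illustrative note: the abstract says each gating sub-layer is trained through reinforcement learning to maximize an expected reward, but it does not name the algorithm or define the reward. The sketch below assumes a plain REINFORCE (score-function) update on a softmax gate whose parameters are a single logit per expert, with no dependence on the input. The names `reinforce_step` and `reward_fn` are hypothetical, and the toy reward stands in for whatever per-position reward the network output would provide.

```python
import math
import random
from typing import Callable, List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over the gating logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(gate_logits: List[float],
                   reward_fn: Callable[[int], float],
                   rng: random.Random,
                   lr: float = 0.1,
                   baseline: float = 0.0) -> Tuple[List[float], int, float]:
    """One REINFORCE update of the gating parameters for one output position.

    Samples an expert from the gate, observes a reward, and moves the logits
    along (reward - baseline) * grad log pi(expert), where the gradient of
    log-softmax with respect to logit i is one_hot(expert)[i] - probs[i].
    """
    probs = softmax(gate_logits)
    expert = rng.choices(range(len(probs)), weights=probs)[0]
    reward = reward_fn(expert)
    advantage = reward - baseline
    new_logits = [
        g + lr * advantage * ((1.0 if i == expert else 0.0) - p)
        for i, (g, p) in enumerate(zip(gate_logits, probs))
    ]
    return new_logits, expert, reward


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [0.0, 0.0, 0.0]

    # Toy reward (assumption): only expert 2 yields a correct prediction.
    def toy_reward(expert: int) -> float:
        return 1.0 if expert == 2 else 0.0

    for _ in range(500):
        logits, _, _ = reinforce_step(logits, toy_reward, rng, lr=0.5)
    print([round(p, 3) for p in softmax(logits)])
```

In a full conditional computation layer the gate would be a function of the layer input and the reward would come from the score distributions at each output position; this sketch keeps only the gating update itself.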
