-
Publication No.: US20230083486A1
Publication Date: 2023-03-16
Application No.: US17797886
Application Date: 2021-02-08
Applicant: DeepMind Technologies Limited
Inventor: Zhaohan Guo , Mohammad Gheshlaghi Azar , Bernardo Avila Pires , Florent Altché , Jean-Bastien François Laurent Grill , Bilal Piot , Remi Munos
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an environment representation neural network of a reinforcement learning system that controls an agent to perform a given task. In one aspect, the method includes: receiving a current observation input and a future observation input; generating, from the future observation input, a future latent representation of the future state of the environment; processing the current observation input, using the environment representation neural network, to generate a current internal representation of the current state of the environment; generating, from the current internal representation, a predicted future latent representation; evaluating an objective function measuring a difference between the future latent representation and the predicted future latent representation; and determining, based on a determined gradient of the objective function, an update to the current values of the environment representation parameters.
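Below is a minimal, hypothetical sketch of the training step the abstract describes, written in PyTorch. The network architectures, the squared-error objective, and all names (env_repr_net, future_encoder, predictor) are illustrative assumptions, not taken from the patent itself.

```python
# Minimal sketch, assuming simple linear/recurrent modules stand in for the real networks.
import torch
import torch.nn as nn

obs_dim, latent_dim, internal_dim = 32, 16, 64

env_repr_net = nn.GRUCell(obs_dim, internal_dim)   # environment representation network (assumed recurrent)
future_encoder = nn.Linear(obs_dim, latent_dim)    # produces the future latent representation
predictor = nn.Linear(internal_dim, latent_dim)    # predicts the future latent from the internal representation

optimizer = torch.optim.Adam(
    list(env_repr_net.parameters()) + list(predictor.parameters()), lr=1e-3)

def training_step(current_obs, future_obs, hidden):
    # Future latent representation of the future environment state,
    # treated here as a fixed target (no gradient through it).
    with torch.no_grad():
        future_latent = future_encoder(future_obs)

    # Current internal representation of the current environment state.
    internal = env_repr_net(current_obs, hidden)

    # Predicted future latent representation, and an objective measuring the
    # difference between the prediction and the target (MSE chosen for illustration).
    predicted = predictor(internal)
    loss = ((predicted - future_latent) ** 2).mean()

    # Gradient-based update of the environment representation (and predictor) parameters.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item(), internal.detach()

# Example call with random data.
loss, hidden = training_step(torch.randn(8, obs_dim), torch.randn(8, obs_dim),
                             torch.zeros(8, internal_dim))
```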
-
Publication No.: US20250068919A1
Publication Date: 2025-02-27
Application No.: US18238400
Application Date: 2023-08-25
Applicant: DeepMind Technologies Limited
Inventor: Daniel Jarrett , Corentin Tallec , Florent Altché , Thomas Mesnard , Remi Munos , Michal Valko
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network used to select actions to be performed by an agent interacting with an environment. Implementations of the method model unpredictable aspects of the future using hindsight. They use this information to disentangle inherently unpredictable, aleatoric variation from epistemic uncertainty that arises from lack of knowledge of the environment. They then use the epistemic uncertainty, which relates to in-principle predictable aspects of the environment, as a source of intrinsic reward to drive curiosity, i.e., exploration of the environment by the agent.
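The following is a rough, hypothetical PyTorch sketch of the idea in this abstract: a world model additionally conditioned on a "hindsight" summary of the observed future can absorb the aleatoric (inherently unpredictable) variation, so its remaining prediction error can be read as epistemic uncertainty and used as an intrinsic reward. The module names, sizes, and squared-error measure are assumptions for illustration only.

```python
# Sketch under assumed shapes and linear stand-in modules; not the patented implementation.
import torch
import torch.nn as nn

state_dim, action_dim, hindsight_dim = 16, 4, 8

hindsight_encoder = nn.Linear(state_dim, hindsight_dim)   # summarises the observed future (hindsight)
model_with_hindsight = nn.Linear(state_dim + action_dim + hindsight_dim, state_dim)

def intrinsic_reward(state, action, next_state):
    """Residual prediction error with hindsight, read as epistemic uncertainty."""
    h = hindsight_encoder(next_state)                      # hindsight: peeks at the realised future
    pred = model_with_hindsight(torch.cat([state, action, h], dim=-1))
    # With the hindsight variable explaining the aleatoric variation, the remaining
    # error is attributed to lack of knowledge of the environment and drives curiosity.
    return ((pred - next_state) ** 2).mean(dim=-1).detach()

reward = intrinsic_reward(torch.randn(8, state_dim),
                          torch.randn(8, action_dim),
                          torch.randn(8, state_dim))
```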
-
Publication No.: US20240119261A1
Publication Date: 2024-04-11
Application No.: US18374447
Application Date: 2023-09-28
Applicant: DeepMind Technologies Limited
Inventor: Robin Strudel , Rémi Leblond , Laurent Sifre , Sander Etienne Lea Dieleman , Nikolay Savinov , Will S. Grathwohl , Corentin Tallec , Florent Altché , Iaroslav Ganin , Arthur Mensch , Yilin Du
IPC: G06N3/045
CPC classification number: G06N3/045
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an output sequence of discrete tokens using a diffusion model. In one aspect, a method includes generating, by using the diffusion model, a final latent representation of the sequence of discrete tokens that includes a determined value for each of a plurality of latent variables; applying a de-embedding matrix to the final latent representation of the output sequence of discrete tokens to generate a de-embedded final latent representation that includes, for each of the plurality of latent variables, a respective numeric score for each discrete token in a vocabulary of multiple discrete tokens; selecting, for each of the plurality of latent variables, a discrete token from among the multiple discrete tokens in the vocabulary that has a highest numeric score; and generating the output sequence of discrete tokens that includes the selected discrete tokens.
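Here is a small illustrative PyTorch sketch of the decoding step the abstract describes: a de-embedding matrix maps the final latent representation produced by the diffusion model to per-position scores over the vocabulary, and the highest-scoring token is selected at each position. The diffusion sampler itself is stubbed out with random latents; the vocabulary size, dimensions, and names are assumptions.

```python
# Sketch of de-embedding + argmax token selection; the reverse-diffusion process is omitted.
import torch

vocab_size, latent_dim, seq_len = 1000, 64, 12

# Final latent representation of the output sequence: one latent per position,
# as if produced by the (omitted) diffusion model.
final_latents = torch.randn(seq_len, latent_dim)

# De-embedding matrix: maps each latent to a numeric score per vocabulary token.
de_embedding = torch.randn(latent_dim, vocab_size)

# De-embedded final latent representation: a score for every vocabulary token
# at every latent variable (sequence position).
scores = final_latents @ de_embedding        # shape (seq_len, vocab_size)

# Select, for each latent variable, the discrete token with the highest score,
# giving the output sequence of discrete tokens.
output_tokens = scores.argmax(dim=-1)        # shape (seq_len,)
print(output_tokens.tolist())
```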
-
Publication No.: US20210383225A1
Publication Date: 2021-12-09
Application No.: US17338777
Application Date: 2021-06-04
Applicant: DeepMind Technologies Limited
Inventor: Jean-Bastien François Laurent Grill , Florian Strub , Florent Altché , Corentin Tallec , Pierre Richemond , Bernardo Avila Pires , Zhaohan Guo , Mohammad Gheshlaghi Azar , Bilal Piot , Remi Munos , Michal Valko
Abstract: A computer-implemented method of training a neural network. The method comprises processing a first transformed view of a training data item, e.g. an image, with a target neural network to generate a target output, processing a second transformed view of the training data item, e.g. image, with an online neural network to generate a prediction of the target output, updating parameters of the online neural network to minimize an error between the prediction of the target output and the target output, and updating parameters of the target neural network based on the parameters of the online neural network. The method can effectively train an encoder neural network without using labelled training data items, and without using a contrastive loss, i.e. without needing “negative examples” which comprise transformed views of different data items.
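A compact, hypothetical PyTorch sketch of the training scheme in this abstract follows: an online network (plus predictor) is trained to predict the target network's output for a different transformed view of the same data item, and the target parameters are then updated from the online parameters, here with an exponential moving average, which is one common choice assumed for illustration. The encoders and the augmentation are toy stand-ins.

```python
# Sketch, assuming MLP encoders and Gaussian-noise "views"; no negative examples are used.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

dim, proj = 128, 32
online = nn.Sequential(nn.Linear(dim, proj), nn.ReLU(), nn.Linear(proj, proj))
predictor = nn.Linear(proj, proj)
target = copy.deepcopy(online)                     # target network, not trained by gradients
for p in target.parameters():
    p.requires_grad_(False)

optimizer = torch.optim.Adam(list(online.parameters()) + list(predictor.parameters()), lr=1e-3)

def augment(x):
    return x + 0.1 * torch.randn_like(x)           # stand-in for a transformed view of the data item

def training_step(batch, tau=0.99):
    view1, view2 = augment(batch), augment(batch)  # two transformed views of the same items
    with torch.no_grad():
        target_out = target(view1)                 # target output
    prediction = predictor(online(view2))          # online prediction of the target output
    # Minimise the error between the (normalised) prediction and the target output.
    loss = (F.normalize(prediction, dim=-1) - F.normalize(target_out, dim=-1)).pow(2).sum(-1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Update the target parameters based on the online parameters (EMA, assumed here).
    with torch.no_grad():
        for tp, op in zip(target.parameters(), online.parameters()):
            tp.mul_(tau).add_((1 - tau) * op)
    return loss.item()

loss = training_step(torch.randn(16, dim))
```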
-