Patent search ap:("DeepMind Technologies Limited") AND inv:"Piotr Wojciech Mirowski" Page 1

1.

发明申请
ENVIRONMENT NAVIGATION USING REINFORCEMENT LEARNING 审中-公开

公开(公告)号：US20200151515A1

公开(公告)日：2020-05-14

申请号：US16745757

申请日：2020-01-17

Applicant: DeepMind Technologies Limited

Inventor： Fabio Viola , Piotr Wojciech Mirowski , Andrea Banino , Razvan Pascanu , Hubert Josef Soyer , Andrew James Ballard , Sudarshan Kumaran , Raia Thais Hadsell , Laurent Sifre , Rostislav Goroshin , Koray Kavukcuoglu , Misha Man Ray Denil

IPC: G06K9/62 , G06N3/00 , G06N3/08 , G06N3/04 , G06K9/00

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a reinforcement learning system. In one aspect, a method of training an action selection policy neural network for use in selecting actions to be performed by an agent navigating through an environment to accomplish one or more goals comprises: receiving an observation image characterizing a current state of the environment; processing, using the action selection policy neural network, an input comprising the observation image to generate an action selection output; processing, using a geometry-prediction neural network, an intermediate output generated by the action selection policy neural network to predict a value of a feature of a geometry of the environment when in the current state; and backpropagating a gradient of a geometry-based auxiliary loss into the action selection policy neural network to determine a geometry-based auxiliary update for current values of the network parameters.

2.

发明公开
NOWCASTING USING GENERATIVE NEURAL NETWORKS 审中-公开

公开(公告)号：US20240176045A1

公开(公告)日：2024-05-30

申请号：US18277729

申请日：2022-02-16

Applicant: DeepMind Technologies Limited

Inventor： Suman Ravuri , Karel Lenc , Piotr Wojciech Mirowski , Remi Roger Alain Paul Lam , Matthew James Willson , Andrew Brock

IPC: G01W1/10 , G06N3/0475

CPC classification number: G01W1/10 , G06N3/0475

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for precipitation nowcasting using generative neural networks. One of the methods includes obtaining a context temporal sequence of a plurality of context radar fields characterizing a real-world location, each context radar field characterizing the weather in the real-world location at a corresponding preceding time point; sampling a set of one or more latent inputs by sampling values from a specified distribution; and for each sampled latent input, processing the context temporal sequence of radar fields and the sampled latent input using a generative neural network that has been configured through training to process the temporal sequence of radar fields to generate as output a predicted temporal sequence comprising a plurality of predicted radar fields, each predicted radar field in the predicted temporal sequence characterizing the predicted weather in the real-world location at a corresponding future time point.

3.

发明授权
Environment navigation using reinforcement learning 有权

公开(公告)号：US11074481B2

公开(公告)日：2021-07-27

申请号：US16745757

申请日：2020-01-17

Applicant: DeepMind Technologies Limited

Inventor： Fabio Viola , Piotr Wojciech Mirowski , Andrea Banino , Razvan Pascanu , Hubert Josef Soyer , Andrew James Ballard , Sudarshan Kumaran , Raia Thais Hadsell , Laurent Sifre , Rostislav Goroshin , Koray Kavukcuoglu , Misha Man Ray Denil

IPC: G06K9/00 , G06K9/62 , G06N3/04 , G06N3/08 , G06N3/00 , G06T7/50 , G06T7/70

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a reinforcement learning system. In one aspect, a method of training an action selection policy neural network for use in selecting actions to be performed by an agent navigating through an environment to accomplish one or more goals comprises: receiving an observation image characterizing a current state of the environment; processing, using the action selection policy neural network, an input comprising the observation image to generate an action selection output; processing, using a geometry-prediction neural network, an intermediate output generated by the action selection policy neural network to predict a value of a feature of a geometry of the environment when in the current state; and backpropagating a gradient of a geometry-based auxiliary loss into the action selection policy neural network to determine a geometry-based auxiliary update for current values of the network parameters.

4.

发明申请
ENVIRONMENT NAVIGATION USING REINFORCEMENT LEARNING 审中-公开

公开(公告)号：US20190266449A1

公开(公告)日：2019-08-29

申请号：US16403343

申请日：2019-05-03

Applicant: DeepMind Technologies Limited

Inventor： Fabio Viola , Piotr Wojciech Mirowski , Andrea Banino , Razvan Pascanu , Hubert Josef Soyer , Andrew James Ballard , Sudarshan Kumaran , Raia Thais Hadsell , Laurent Sifre , Rostislav Goroshin , Koray Kavukcuoglu , Misha Man Ray Denil

IPC: G06K9/62 , G06N3/08 , G06N3/04 , G06K9/00

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a reinforcement learning system. In one aspect, a method of training an action selection policy neural network for use in selecting actions to be performed by an agent navigating through an environment to accomplish one or more goals comprises: receiving an observation image characterizing a current state of the environment; processing, using the action selection policy neural network, an input comprising the observation image to generate an action selection output; processing, using a geometry-prediction neural network, an intermediate output generated by the action selection policy neural network to predict a value of a feature of a geometry of the environment when in the current state; and backpropagating a gradient of a geometry-based auxiliary loss into the action selection policy neural network to determine a geometry-based auxiliary update for current values of the network parameters.

5.

发明公开
HIERARCHICAL TEXT GENERATION USING LANGUAGE MODEL NEURAL NETWORKS 审中-公开

公开(公告)号：US20240184982A1

公开(公告)日：2024-06-06

申请号：US18527211

申请日：2023-12-01

Applicant: DeepMind Technologies Limited

Inventor： Kory Wallace Mathewson , Piotr Wojciech Mirowski , Richard Andrew Evans

IPC: G06F40/20

CPC classification number: G06F40/20

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating long textual works using language model neural networks. For example, the textual works can be generated hierarchically by performing a hierarchy of generation steps using the same language model neural network.

6.

发明授权
Environment navigation using reinforcement learning 有权

公开(公告)号：US10572776B2

公开(公告)日：2020-02-25

申请号：US16403343

申请日：2019-05-03

Applicant: DeepMind Technologies Limited

Inventor： Fabio Viola , Piotr Wojciech Mirowski , Andrea Banino , Razvan Pascanu , Hubert Josef Soyer , Andrew James Ballard , Sudarshan Kumaran , Raia Thais Hadsell , Laurent Sifre , Rostislav Goroshin , Koray Kavukcuoglu , Misha Man Ray Denil

IPC: G06K9/00 , G06K9/62 , G06N3/04 , G06N3/08 , G06N3/00 , G06T7/50 , G06T7/70

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a reinforcement learning system. In one aspect, a method of training an action selection policy neural network for use in selecting actions to be performed by an agent navigating through an environment to accomplish one or more goals comprises: receiving an observation image characterizing a current state of the environment; processing, using the action selection policy neural network, an input comprising the observation image to generate an action selection output; processing, using a geometry-prediction neural network, an intermediate output generated by the action selection policy neural network to predict a value of a feature of a geometry of the environment when in the current state; and backpropagating a gradient of a geometry-based auxiliary loss into the action selection policy neural network to determine a geometry-based auxiliary update for current values of the network parameters.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification