Patent search ap:("DeepMind Technologies Limited") AND inv:"Sebastien Henri Andre Racaniere" Page 1

1.

Granted invention patent
Imagination-based agent neural networks (status: in force)

Publication (announcement) No.: US11328183B2

Publication (announcement) date: 2022-05-10

Application No.: US17019919

Application date: 2020-09-14

Applicant: DeepMind Technologies Limited

Inventor: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Arthur Clement Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere

IPC: G06K9/00, G06K9/62, G06K9/68, G06N3/04, G06N3/08

Abstract: A neural network system is proposed. The neural network can be trained by model-based reinforcement learning to select actions to be performed by an agent interacting with an environment, to perform a task in an attempt to achieve a specified result. The system may comprise at least one imagination core which receives a current observation characterizing a current state of the environment, and optionally historical observations, and which includes a model of the environment. The imagination core may be configured to output trajectory data in response to the current observation, and/or historical observations. The trajectory data comprising a sequence of future features of the environment imagined by the imagination core. The system may also include a rollout encoder to encode the features, and an output stage to receive data derived from the rollout embedding and to output action policy data for identifying an action based on the current observation.
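
The data flow named in this abstract (imagination core, rollout encoder, output stage) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the 1-D toy environment, the hand-written `env_model` standing in for a learned model, the discounted-return encoder, and all function names are hypothetical.

```python
import math

def env_model(state, action):
    """Hypothetical stand-in for a learned environment model (1-D toy world)."""
    next_state = state + action
    reward = -abs(next_state)  # toy task: get close to the origin
    return next_state, reward

def imagination_core(observation, first_action, rollout_policy, depth):
    """Imagine a trajectory: a sequence of future (state, reward) features."""
    trajectory, state, action = [], observation, first_action
    for _ in range(depth):
        state, reward = env_model(state, action)
        trajectory.append((state, reward))
        action = rollout_policy(state)
    return trajectory

def rollout_encoder(trajectory, discount=0.9):
    """Encode a trajectory as one number: its discounted imagined return."""
    return sum((discount ** t) * r for t, (_, r) in enumerate(trajectory))

def output_stage(embeddings):
    """Map one rollout embedding per candidate action to softmax probabilities."""
    top = max(embeddings.values())
    exps = {a: math.exp(e - top) for a, e in embeddings.items()}
    total = sum(exps.values())
    return {a: v / total for a, v in exps.items()}

def select_action(observation, actions=(-1, 0, 1), depth=3):
    """Run one imagined rollout per candidate action and pick the likeliest action."""
    def rollout_policy(state):
        # Assumed fixed rollout policy: always step toward the origin.
        return -1 if state > 0 else (1 if state < 0 else 0)

    embeddings = {
        a: rollout_encoder(imagination_core(observation, a, rollout_policy, depth))
        for a in actions
    }
    policy = output_stage(embeddings)
    return max(policy, key=policy.get), policy
```

Calling `select_action(2)` returns the chosen action together with the action policy data (a probability per candidate action). In a trained system the model, encoder, and output stage would be neural networks rather than the fixed functions assumed here.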

2.

Invention patent application
IMAGINATION-BASED AGENT NEURAL NETWORKS (status: in force)

Publication (announcement) No.: US20210089834A1

Publication (announcement) date: 2021-03-25

Application No.: US17114324

Application date: 2020-12-07

Applicant: DeepMind Technologies Limited

Inventor: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere

IPC: G06K9/62, G06K9/68, G06N3/08, G06N5/04

Abstract: A neural network system is proposed to select actions to be performed by an agent interacting with an environment to perform a task in an attempt to achieve a specified result. The system may include a controller to receive state data and context data, and to output action data. The system may also include an imagination module to receive the state and action data, and to output consequent state data. The system may also include a manager to receive the state data and the context data, and to output route data which defines whether the system is to execute an action or to imagine. The system may also include a memory to store the context data.
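
The four components named in this abstract (controller, imagination module, manager, memory) can be wired together as in the following Python sketch. It is only an illustration under stated assumptions: the integer state, the two candidate actions, the fixed imagination budget, and every function name are hypothetical, and the patent's components would be learned networks.

```python
def manager(state, context, max_imagined=2):
    """Output route data: 'imagine' until the budget is used, then 'act'."""
    return "imagine" if len(context) < max_imagined else "act"

def controller(state, context):
    """Output action data from state and context; prefer actions not yet imagined."""
    tried = {action for action, _ in context}
    candidates = [a for a in (-1, 1) if a not in tried] or [-1, 1]
    return candidates[0]

def imagination(state, action):
    """Hypothetical environment model: output the consequent state."""
    return state + action

def step(state, goal=0):
    """Imagine until the manager routes to 'act', then return the action to execute."""
    memory = []  # context data: (action, imagined consequent state) pairs
    while manager(state, memory) == "imagine":
        action = controller(state, memory)
        memory.append((action, imagination(state, action)))
    # Assumed decision rule: execute the action whose imagined outcome is nearest the goal.
    best_action, _ = min(memory, key=lambda item: abs(item[1] - goal))
    return best_action, memory
```

Here `step(3)` imagines both candidate actions, stores each result in the memory, and then returns the action judged best along with the accumulated context.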

3.

Invention patent application
RECURRENT ENVIRONMENT PREDICTORS (status: under examination, published)

Publication (announcement) No.: US20190266475A1

Publication (announcement) date: 2019-08-29

Application No.: US16403352

Application date: 2019-05-03

Applicant: DeepMind Technologies Limited

Inventor: Daniel Pieter Wierstra, Shakir Mohamed, Silvia Chiappa, Sebastien Henri Andre Racaniere

IPC: G06N3/04, G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for environment simulation. In one aspect, a system comprises a recurrent neural network configured to, at each of a plurality of time steps, receive a preceding action for a preceding time step, update a preceding initial hidden state of the recurrent neural network from the preceding time step using the preceding action, update a preceding cell state of the recurrent neural network from the preceding time step using at least the initial hidden state for the time step, and determine a final hidden state for the time step using the cell state for the time step. The system further comprises a decoder neural network configured to receive the final hidden state for the time step and process the final hidden state to generate a predicted observation characterizing a predicted state of the environment at the time step.
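
The per-time-step order of operations in this abstract (action-conditioned hidden state, then cell state, then final hidden state, then decoding) can be sketched in Python with scalars in place of vectors. The multiplicative action conditioning, the LSTM-style gates, the weight names, and the default values are all assumptions chosen for illustration; the abstract does not specify them.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predictor_step(prev_hidden, prev_cell, prev_action, w):
    """One time step of a scalar, action-conditioned LSTM-style update."""
    # 1. Update the preceding hidden state using the preceding action
    #    (assumed here to be a multiplicative interaction).
    initial_hidden = prev_hidden * (w["action"] * prev_action)
    # 2. Update the preceding cell state using the initial hidden state.
    forget = sigmoid(w["forget"] * initial_hidden)
    write = sigmoid(w["input"] * initial_hidden)
    cell = forget * prev_cell + write * math.tanh(w["cand"] * initial_hidden)
    # 3. Determine the final hidden state using the new cell state.
    out = sigmoid(w["output"] * initial_hidden)
    final_hidden = out * math.tanh(cell)
    return final_hidden, cell

def decoder(final_hidden, w):
    """Hypothetical linear decoder: final hidden state to predicted observation."""
    return w["decode"] * final_hidden

def predict(actions, hidden=0.5, cell=0.0, w=None):
    """Roll the predictor forward over a sequence of scalar actions."""
    w = w or {"action": 1.0, "forget": 1.0, "input": 1.0,
              "cand": 1.0, "output": 1.0, "decode": 2.0}
    predictions = []
    for action in actions:
        hidden, cell = predictor_step(hidden, cell, action, w)
        predictions.append(decoder(hidden, w))
    return predictions
```

`predict([1, -1, 1])` returns one predicted observation per action. Note that the loop feeds each final hidden state back in as the next step's preceding hidden state, so predictions depend only on the action sequence and the initial state.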

4.

Granted invention patent
Recurrent environment predictors (status: in force)

Publication (announcement) No.: US11200482B2

Publication (announcement) date: 2021-12-14

Application No.: US16893565

Application date: 2020-06-05

Applicant: DeepMind Technologies Limited

Inventor: Daniel Pieter Wierstra, Shakir Mohamed, Silvia Chiappa, Sebastien Henri Andre Racaniere

IPC: G06N3/04, G06N3/08, G06N3/00

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for environment simulation. In one aspect, a system comprises a recurrent neural network configured to, at each of a plurality of time steps, receive a preceding action for a preceding time step, update a preceding initial hidden state of the recurrent neural network from the preceding time step using the preceding action, update a preceding cell state of the recurrent neural network from the preceding time step using at least the initial hidden state for the time step, and determine a final hidden state for the time step using the cell state for the time step. The system further comprises a decoder neural network configured to receive the final hidden state for the time step and process the final hidden state to generate a predicted observation characterizing a predicted state of the environment at the time step.

5.

Granted invention patent
Recurrent environment predictors (status: in force)

Publication (announcement) No.: US10713559B2

Publication (announcement) date: 2020-07-14

Application No.: US16403352

Application date: 2019-05-03

Applicant: DeepMind Technologies Limited

Inventor: Daniel Pieter Wierstra, Shakir Mohamed, Silvia Chiappa, Sebastien Henri Andre Racaniere

IPC: G06N3/04, G06N3/08, G06N3/00

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for environment simulation. In one aspect, a system comprises a recurrent neural network configured to, at each of a plurality of time steps, receive a preceding action for a preceding time step, update a preceding initial hidden state of the recurrent neural network from the preceding time step using the preceding action, update a preceding cell state of the recurrent neural network from the preceding time step using at least the initial hidden state for the time step, and determine a final hidden state for the time step using the cell state for the time step. The system further comprises a decoder neural network configured to receive the final hidden state for the time step and process the final hidden state to generate a predicted observation characterizing a predicted state of the environment at the time step.

6.

Invention patent application
IMAGINATION-BASED AGENT NEURAL NETWORKS (status: under examination, published)

Publication (announcement) No.: US20200082227A1

Publication (announcement) date: 2020-03-12

Application No.: US16689017

Application date: 2019-11-19

Applicant: DeepMind Technologies Limited

Inventor: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere

IPC: G06K9/62, G06N3/08, G06K9/68, G06N5/04

Abstract: A neural network system is proposed to select actions to be performed by an agent interacting with an environment to perform a task in an attempt to achieve a specified result. The system may include a controller to receive state data and context data, and to output action data. The system may also include an imagination module to receive the state and action data, and to output consequent state data. The system may also include a manager to receive the state data and the context data, and to output route data which defines whether the system is to execute an action or to imagine. The system may also include a memory to store the context data.

7.

Invention patent application
IMAGINATION-BASED AGENT NEURAL NETWORKS (status: in force)

Publication (announcement) No.: US20210073594A1

Publication (announcement) date: 2021-03-11

Application No.: US17019919

Application date: 2020-09-14

Applicant: DeepMind Technologies Limited

Inventor: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Arthur Clement Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere

IPC: G06K9/62, G06K9/68, G06N3/04, G06N3/08

Abstract: A neural network system is proposed. The neural network can be trained by model-based reinforcement learning to select actions to be performed by an agent interacting with an environment, to perform a task in an attempt to achieve a specified result. The system may comprise at least one imagination core which receives a current observation characterizing a current state of the environment, and optionally historical observations, and which includes a model of the environment. The imagination core may be configured to output trajectory data in response to the current observation, and/or historical observations. The trajectory data comprising a sequence of future features of the environment imagined by the imagination core. The system may also include a rollout encoder to encode the features, and an output stage to receive data derived from the rollout embedding and to output action policy data for identifying an action based on the current observation.

8.

Granted invention patent
Imagination-based agent neural networks (status: in force)

Publication (announcement) No.: US10860895B2

Publication (announcement) date: 2020-12-08

Application No.: US16689017

Application date: 2019-11-19

Applicant: DeepMind Technologies Limited

Inventor: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere

IPC: G06K9/62, G06K9/68, G06N3/08, G06N5/04

Abstract: A neural network system is proposed to select actions to be performed by an agent interacting with an environment to perform a task in an attempt to achieve a specified result. The system may include a controller to receive state data and context data, and to output action data. The system may also include an imagination module to receive the state and action data, and to output consequent state data. The system may also include a manager to receive the state data and the context data, and to output route data which defines whether the system is to execute an action or to imagine. The system may also include a memory to store the context data.

9.

Invention patent application
RECURRENT ENVIRONMENT PREDICTORS (status: under examination, published)

Publication (announcement) No.: US20200342289A1

Publication (announcement) date: 2020-10-29

Application No.: US16893565

Application date: 2020-06-05

Applicant: DeepMind Technologies Limited

Inventor: Daniel Pieter Wierstra, Shakir Mohamed, Silvia Chiappa, Sebastien Henri Andre Racaniere

IPC: G06N3/04, G06N3/08, G06N3/00

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for environment simulation. In one aspect, a system comprises a recurrent neural network configured to, at each of a plurality of time steps, receive a preceding action for a preceding time step, update a preceding initial hidden state of the recurrent neural network from the preceding time step using the preceding action, update a preceding cell state of the recurrent neural network from the preceding time step using at least the initial hidden state for the time step, and determine a final hidden state for the time step using the cell state for the time step. The system further comprises a decoder neural network configured to receive the final hidden state for the time step and process the final hidden state to generate a predicted observation characterizing a predicted state of the environment at the time step.

10.

Granted invention patent
Imagination-based agent neural networks (status: in force)

Publication (announcement) No.: US10776670B2

Publication (announcement) date: 2020-09-15

Application No.: US16689058

Application date: 2019-11-19

Applicant: DeepMind Technologies Limited

Inventor: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Arthur Clement Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere

IPC: G06K9/00, G06K9/62, G06K9/68, G06N3/04, G06N3/08

Abstract: A neural network system is proposed. The neural network can be trained by model-based reinforcement learning to select actions to be performed by an agent interacting with an environment, to perform a task in an attempt to achieve a specified result. The system may comprise at least one imagination core which receives a current observation characterizing a current state of the environment, and optionally historical observations, and which includes a model of the environment. The imagination core may be configured to output trajectory data in response to the current observation, and/or historical observations. The trajectory data comprising a sequence of future features of the environment imagined by the imagination core. The system may also include a rollout encoder to encode the features, and an output stage to receive data derived from the rollout embedding and to output action policy data for identifying an action based on the current observation.
