Patent search ap:("DEEPMIND TECHNOLOGIES LIMITED") AND inv:"Razvan Pascanu" Page 3

21.

Invention publication
TRAINING REINFORCEMENT LEARNING AGENTS USING AUGMENTED TEMPORAL DIFFERENCE LEARNING (Under examination, published)

Publication (announcement) number: US20230376780A1

Publication (announcement) date: 2023-11-23

Application number: US18029979

Application date: 2021-10-01

Applicant: DeepMind Technologies Limited

Inventor: Caglar Gulcehre, Razvan Pascanu, Sergio Gomez

IPC: G06N3/092, G06N3/0442

CPC classification number: G06N3/092, G06N3/0442

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network used to select actions performed by an agent interacting with an environment by performing actions that cause the environment to transition states. One of the methods includes maintaining a replay memory storing a plurality of transitions; selecting a plurality of transitions from the replay memory; and training the neural network on the plurality of transitions, comprising, for each transition: generating an initial Q value for the transition; determining a scaled Q value for the transition; determining a scaled temporal difference learning target for the transition; determining an error between the scaled temporal difference learning target and the scaled Q value; determining an update to the current values of the Q network parameters; and determining an update to the current value of the scaling term.
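The per-transition steps listed in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the tabular Q table `w`, the multiplicative scaling term `alpha`, the one-step max-based target, the squared-error loss, and the function names are all assumptions made for the example.

```python
import random


def scaled_td_step(w, alpha, transition, gamma=0.99, lr=0.05):
    """One update on a transition (s, a, r, s_next) for a tabular Q function.

    w[s][a] holds the initial (unscaled) Q value; alpha is the scaling term.
    Assumption: scaling is a plain multiplication by alpha.
    """
    s, a, r, s_next = transition
    q_initial = w[s][a]                      # initial Q value
    q_scaled = alpha * q_initial             # scaled Q value
    # Scaled TD target: reward plus discounted scaled value of the best next action.
    target = r + gamma * alpha * max(w[s_next])
    error = target - q_scaled                # target is treated as a constant
    # Gradient step on 0.5 * error**2 for the Q parameters and for alpha.
    w[s][a] += lr * error * alpha
    alpha_new = alpha + lr * error * q_initial
    return alpha_new, error


def train_on_batch(w, alpha, replay_memory, batch_size, rng):
    """Select transitions from the replay memory and train on each of them."""
    for transition in rng.sample(replay_memory, batch_size):
        alpha, _ = scaled_td_step(w, alpha, transition)
    return alpha
```

A call such as `train_on_batch(w, 1.0, replay_memory, 2, random.Random(0))` returns the updated scaling term and modifies `w` in place.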

22.

Invention application
MULTI-TASK NEURAL NETWORK SYSTEMS WITH TASK-SPECIFIC POLICIES AND A SHARED POLICY (In force)

Publication (announcement) number: US20220083869A1

Publication (announcement) date: 2022-03-17

Application number: US17486842

Application date: 2021-09-27

Applicant: DeepMind Technologies Limited

Inventor: Razvan Pascanu, Raia Thais Hadsell, Victor Constant Bapst, Wojciech Czarnecki, James Kirkpatrick, Yee Whye Teh, Nicolas Manfred Otto Heess

IPC: G06N3/08, G06N3/10, G06N5/04

Abstract: A method is proposed for training a multitask computer system, such as a multitask neural network system. The system comprises a set of trainable workers and a shared module. The trainable workers and shared module are trained on a plurality of different tasks, such that each worker learns to perform a corresponding one of the tasks according to a respective task policy, and said shared policy network learns a multitask policy which represents common behavior for the tasks. The coordinated training is performed by optimizing an objective function comprising, for each task: a reward term indicative of an expected reward earned by a worker in performing the corresponding task according to the task policy; and at least one entropy term which regularizes the distribution of the task policy towards the distribution of the multitask policy.
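The objective described in this abstract can be illustrated with a small sketch. It assumes a single-state setting with discrete actions, and it assumes the regularizing term is a KL divergence from each task policy to the shared policy weighted by a hypothetical coefficient `beta`; the abstract itself only says "at least one entropy term", so this is one possible reading.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def multitask_objective(task_policies, shared_policy, task_rewards, beta=0.1):
    """Sum over tasks of expected reward minus a penalty for leaving the shared policy."""
    total = 0.0
    for policy, rewards in zip(task_policies, task_rewards):
        expected_reward = sum(p * r for p, r in zip(policy, rewards))
        total += expected_reward - beta * kl_divergence(policy, shared_policy)
    return total
```

Maximizing this quantity over the task policies and the shared policy pulls each task policy toward common behavior while still rewarding task-specific performance.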

23.

Invention grant
Multi-task neural network systems with task-specific policies and a shared policy (In force)

Publication (announcement) number: US11132609B2

Publication (announcement) date: 2021-09-28

Application number: US16689020

Application date: 2019-11-19

Applicant: DeepMind Technologies Limited

Inventor: Razvan Pascanu, Raia Thais Hadsell, Victor Constant Bapst, Wojciech Czarnecki, James Kirkpatrick, Yee Whye Teh, Nicolas Manfred Otto Heess

IPC: G06N3/08, G06N3/10, G06N5/04

Abstract: A method is proposed for training a multitask computer system, such as a multitask neural network system. The system comprises a set of trainable workers and a shared module. The trainable workers and shared module are trained on a plurality of different tasks, such that each worker learns to perform a corresponding one of the tasks according to a respective task policy, and said shared policy network learns a multitask policy which represents common behavior for the tasks. The coordinated training is performed by optimizing an objective function comprising, for each task: a reward term indicative of an expected reward earned by a worker in performing the corresponding task according to the task policy; and at least one entropy term which regularizes the distribution of the task policy towards the distribution of the multitask policy.

24.

Invention application
NEURAL NETWORKS FOR SCALABLE CONTINUAL LEARNING IN DOMAINS WITH SEQUENTIALLY LEARNED TASKS (In force)

Publication (announcement) number: US20210117786A1

Publication (announcement) date: 2021-04-22

Application number: US17048023

Application date: 2019-04-18

Applicant: DEEPMIND TECHNOLOGIES LIMITED

Inventor: Jonathan Schwarz, Razvan Pascanu, Raia Thais Hadsell, Wojciech Czarnecki, Yee Whye Teh, Jelena Luketina

IPC: G06N3/08, G06N5/02, G06N20/20, G06K9/62

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scalable continual learning using neural networks. One of the methods includes receiving new training data for a new machine learning task; training an active subnetwork on the new training data to determine trained values of the active network parameters from initial values of the active network parameters while holding current values of the knowledge parameters fixed; and training a knowledge subnetwork on the new training data to determine updated values of the knowledge parameters from the current values of the knowledge parameters by training the knowledge subnetwork to generate knowledge outputs for the new training inputs that match active outputs generated by the trained active subnetwork for the new training inputs.
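The two training phases in this abstract can be sketched with one-parameter linear models. Everything below is an illustrative assumption rather than the claimed architecture: each subnetwork is a single weight, the active output is taken to be `(active_w + know_w) * x` so that the fixed knowledge parameter contributes to it, and both phases use plain gradient descent on squared error.

```python
def train_active(active_w, know_w, data, lr=0.1, steps=200):
    """Phase 1: fit the active weight on the new task; know_w is held fixed."""
    for _ in range(steps):
        for x, y in data:
            prediction = (active_w + know_w) * x
            active_w -= lr * (prediction - y) * x  # only the active weight moves
    return active_w


def distill_into_knowledge(active_w, know_w, inputs, lr=0.1, steps=200):
    """Phase 2: update the knowledge weight to match the trained active outputs."""
    frozen_know_w = know_w  # value the trained active subnetwork was using
    for _ in range(steps):
        for x in inputs:
            active_output = (active_w + frozen_know_w) * x
            know_w -= lr * (know_w * x - active_output) * x
    return know_w
```

With data drawn from `y = 3 * x` and a starting knowledge weight of 1.0, the first phase drives the active weight toward 2.0 and the second phase drives the knowledge weight toward 3.0.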

25.

Invention application
IMAGINATION-BASED AGENT NEURAL NETWORKS (Under examination, published)

Publication (announcement) number: US20200082227A1

Publication (announcement) date: 2020-03-12

Application number: US16689017

Application date: 2019-11-19

Applicant: DeepMind Technologies Limited

Inventor: Daniel Pieter Wierstra, Yujia Li, Razvan Pascanu, Peter William Battaglia, Theophane Guillaume Weber, Lars Buesing, David Paul Reichert, Oriol Vinyals, Nicolas Manfred Otto Heess, Sebastien Henri Andre Racaniere

IPC: G06K9/62, G06N3/08, G06K9/68, G06N5/04

Abstract: A neural network system is proposed to select actions to be performed by an agent interacting with an environment to perform a task in an attempt to achieve a specified result. The system may include a controller to receive state data and context data, and to output action data. The system may also include an imagination module to receive the state and action data, and to output consequent state data. The system may also include a manager to receive the state data and the context data, and to output route data which defines whether the system is to execute an action or to imagine. The system may also include a memory to store the context data.
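The interplay of controller, imagination module, manager, and memory can be sketched on a toy problem. The setting is purely hypothetical: the state is an integer, the task is to reach 0, actions are -1 or +1, and all four components are hand-written rules standing in for the neural networks the abstract describes.

```python
def controller(state, context):
    """Propose an action, skipping ones already imagined not to bring us closer to 0."""
    rejected = {a for (s, a, s_next) in context if s == state and abs(s_next) >= abs(s)}
    for action in (-1, +1):
        if action not in rejected:
            return action
    return -1


def imagination(state, action):
    """Predict the consequent state (an exact toy model instead of a learned one)."""
    return state + action


def manager(state, context):
    """Route: imagine once in every newly visited state, then act."""
    return "act" if any(s == state for (s, _, _) in context) else "imagine"


def run_episode(state, max_iterations=50):
    memory = []  # stores the context: (state, action, imagined next state)
    real_steps = imagined_steps = 0
    for _ in range(max_iterations):
        if state == 0:
            break
        route = manager(state, memory)
        action = controller(state, memory)
        if route == "imagine":
            memory.append((state, action, imagination(state, action)))
            imagined_steps += 1
        else:
            state = state + action  # execute the action in the real environment
            real_steps += 1
    return state, real_steps, imagined_steps
```

Starting from -2, the first imagined action moves away from the goal, so the stored context makes the controller choose +1 when the manager switches to acting.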

26.

Invention application
LOW-PASS RECURRENT NEURAL NETWORK SYSTEMS WITH MEMORY (Under examination, published)

Publication (announcement) number: US20190251419A1

Publication (announcement) date: 2019-08-15

Application number: US16272880

Application date: 2019-02-11

Applicant: DeepMind Technologies Limited

Inventor: Razvan Pascanu, William Clinton Dabney, Thomas Stepleton

IPC: G06N3/04, G06N3/08

CPC classification number: G06N3/04, G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing and storing inputs for use in a neural network. One of the methods includes receiving input data for storage in a memory system comprising a first set of memory blocks, the memory blocks having an associated order; passing the input data to a highest ordered memory block; for each memory block for which there is a lower ordered memory block: applying a filter function to data currently stored by the memory block to generate filtered data and passing the filtered data to a lower ordered memory block; and for each memory block: combining the data currently stored in the memory block with the data passed to the memory block to generate updated data, and storing the updated data in the memory block.
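The memory update in this abstract can be sketched as a single function. The abstract does not say which filter or combination rule is used, so the choices here are assumptions: each block stores one float, the filter is a multiplication by a hypothetical `filter_gain`, and combining is a weighted average controlled by a hypothetical `mix` coefficient.

```python
def update_memory(blocks, input_value, filter_gain=0.5, mix=0.5):
    """Return the updated block contents; blocks[0] is the highest-ordered block."""
    # Data passed to each block: the input for the highest block, and the
    # filtered (pre-update) contents of the block above for every lower block.
    passed = [input_value]
    for stored in blocks[:-1]:
        passed.append(filter_gain * stored)
    # Combine what each block currently stores with what was passed to it.
    return [(1.0 - mix) * stored + mix * incoming
            for stored, incoming in zip(blocks, passed)]
```

Feeding a constant input repeatedly shows the low-pass behavior: the highest block reacts first, and lower blocks follow with smaller, delayed changes.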

27.

Invention application
NEURAL NETWORKS FOR SELECTING ACTIONS TO BE PERFORMED BY A ROBOTIC AGENT (Under examination, published)

Publication (announcement) number: US20190232489A1

Publication (announcement) date: 2019-08-01

Application number: US16380125

Application date: 2019-04-10

Applicant: DeepMind Technologies Limited

Inventor: Razvan Pascanu, Raia Thais Hadsell, Mel Vecerik, Thomas Rothoerl, Andrei-Alexandru Rusu, Nicolas Manfred Otto Heess

IPC: B25J9/16, G06N3/04, G06N3/08, G05B13/02

CPC classification number: B25J9/163, B25J9/1671, G05B13/027, G06N3/008, G06N3/0445, G06N3/0454, G06N3/08

Abstract: A system includes a neural network system implemented by one or more computers. The neural network system is configured to receive an observation characterizing a current state of a real-world environment being interacted with by a robotic agent to perform a robotic task and to process the observation to generate a policy output that defines an action to be performed by the robotic agent in response to the observation. The neural network system includes: (i) a sequence of deep neural networks (DNNs), in which the sequence of DNNs includes a simulation-trained DNN that has been trained on interactions of a simulated version of the robotic agent with a simulated version of the real-world environment to perform a simulated version of the robotic task, and (ii) a first robot-trained DNN that is configured to receive the observation and to process the observation to generate the policy output.
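One way to picture a sequence containing a simulation-trained DNN and a robot-trained DNN is sketched below. The abstract does not state how the two networks are connected; the lateral connection from the simulation-trained hidden layer into the robot-trained hidden layer is an assumption of this example, as are the single hidden layer, the ReLU activation, and every parameter name.

```python
def relu(x):
    return max(0.0, x)


def dot(weights, values):
    return sum(w * v for w, v in zip(weights, values))


def sim_column_hidden(observation, sim_weights):
    """Hidden activations of the simulation-trained network (weights stay frozen)."""
    return [relu(dot(row, observation)) for row in sim_weights]


def robot_policy_output(observation, sim_weights, robot_weights,
                        lateral_weights, output_weights):
    """Policy output of the robot-trained network for one observation."""
    sim_hidden = sim_column_hidden(observation, sim_weights)
    # Each robot hidden unit sees the observation and, via the assumed lateral
    # connection, the hidden activations of the simulation-trained network.
    robot_hidden = [relu(dot(row, observation) + dot(lateral, sim_hidden))
                    for row, lateral in zip(robot_weights, lateral_weights)]
    return [dot(row, robot_hidden) for row in output_weights]
```

In such a setup only `robot_weights`, `lateral_weights`, and `output_weights` would be trained on real-robot data, so features learned in simulation are reused rather than overwritten.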

28.

Invention grant
Neural networks for scalable continual learning in domains with sequentially learned tasks (In force)

Publication (announcement) number: US12020164B2

Publication (announcement) date: 2024-06-25

Application number: US17048023

Application date: 2019-04-18

Applicant: DEEPMIND TECHNOLOGIES LIMITED

Inventor: Jonathan Schwarz, Razvan Pascanu, Raia Thais Hadsell, Wojciech Czarnecki, Yee Whye Teh, Jelena Luketina

IPC: G06N3/08, G06F18/22, G06N3/084, G06N5/02, G06N20/20

CPC classification number: G06N3/084, G06F18/22, G06N3/08, G06N5/02, G06N20/20

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scalable continual learning using neural networks. One of the methods includes receiving new training data for a new machine learning task; training an active subnetwork on the new training data to determine trained values of the active network parameters from initial values of the active network parameters while holding current values of the knowledge parameters fixed; and training a knowledge subnetwork on the new training data to determine updated values of the knowledge parameters from the current values of the knowledge parameters by training the knowledge subnetwork to generate knowledge outputs for the new training inputs that match active outputs generated by the trained active subnetwork for the new training inputs.

29.

Invention application
GATED ATTENTION NEURAL NETWORKS (In force)

Publication (announcement) number: US20220366218A1

Publication (announcement) date: 2022-11-17

Application number: US17763984

Application date: 2020-09-07

Applicant: DeepMind Technologies Limited

Inventor: Emilio Parisotto, Hasuk Song, Jack William Rae, Siddhant Madhu Jayakumar, Maxwell Elliot Jaderberg, Razvan Pascanu, Caglar Gulcehre

IPC: G06N3/04, G06N3/08

Abstract: A system including an attention neural network that is configured to receive an input sequence and to process the input sequence to generate an output is described. The attention neural network includes: an attention block configured to receive a query input, a key input, and a value input that are derived from an attention block input. The attention block includes an attention neural network layer configured to: receive an attention layer input derived from the query input, the key input, and the value input, and apply an attention mechanism to the query input, the key input, and the value input to generate an attention layer output for the attention neural network layer; and a gating neural network layer configured to apply a gating mechanism to the attention block input and the attention layer output of the attention neural network layer to generate a gated attention output.
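The attention block in this abstract can be sketched in plain Python. Several simplifications are assumed: query, key, and value are all taken to be the block input itself (no learned projections), attention is single-head scaled dot-product, and the gating mechanism is a single sigmoid gate set by a hypothetical scalar `gate_bias` that interpolates between the block input and the attention output. The gating described in the application may be considerably richer.

```python
import math


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs


def gated_attention_block(block_input, gate_bias=0.0):
    """Blend the block input with the attention output through a sigmoid gate."""
    attended = attention(block_input, block_input, block_input)
    gate = 1.0 / (1.0 + math.exp(-gate_bias))
    return [[(1.0 - gate) * x + gate * y for x, y in zip(x_row, y_row)]
            for x_row, y_row in zip(block_input, attended)]
```

A strongly negative `gate_bias` closes the gate so the block behaves almost like an identity map, which is one motivation sometimes given for gating around attention layers.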

30.

Invention application
NEURAL NETWORKS FOR SELECTING ACTIONS TO BE PERFORMED BY A ROBOTIC AGENT (In force)

Publication (announcement) number: US20220355472A1

Publication (announcement) date: 2022-11-10

Application number: US17872528

Application date: 2022-07-25

Applicant: DeepMind Technologies Limited

Inventor: Razvan Pascanu, Raia Thais Hadsell, Mel Vecerik, Thomas Rothoerl, Andrei-Alexandru Rusu, Nicolas Manfred Otto Heess

IPC: B25J9/16, G06N3/04, G06N3/08, G05B13/02, G06N3/00

Abstract: A system includes a neural network system implemented by one or more computers. The neural network system is configured to receive an observation characterizing a current state of a real-world environment being interacted with by a robotic agent to perform a robotic task and to process the observation to generate a policy output that defines an action to be performed by the robotic agent in response to the observation. The neural network system includes: (i) a sequence of deep neural networks (DNNs), in which the sequence of DNNs includes a simulation-trained DNN that has been trained on interactions of a simulated version of the robotic agent with a simulated version of the real-world environment to perform a simulated version of the robotic task, and (ii) a first robot-trained DNN that is configured to receive the observation and to process the observation to generate the policy output.
