-
Publication No.: US11365972B2
Publication Date: 2022-06-21
Application No.: US16794985
Filing Date: 2020-02-19
Applicant: DeepMind Technologies Limited
Inventor: Andrea Banino , Sudarshan Kumaran , Raia Thais Hadsell , Benigno Uria-Martínez
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a grid cell neural network and an action selection neural network. The grid cell network is configured to: receive an input comprising data characterizing a velocity of the agent; process the input to generate a grid cell representation; and process the grid cell representation to generate an estimate of a position of the agent in the environment; the action selection neural network is configured to: receive an input comprising a grid cell representation and an observation characterizing a state of the environment; and process the input to generate an action selection network output.
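For orientation, below is a minimal sketch of the two-network arrangement this abstract describes, assuming a PyTorch-style implementation; the module names, layer widths, and dimensions are illustrative assumptions and are not taken from the patent.

```python
# Illustrative sketch only: names, sizes, and network choices are assumptions, not from the patent.
import torch
import torch.nn as nn

class GridCellNetwork(nn.Module):
    """Maps a sequence of agent velocities to a grid cell representation and a position estimate."""
    def __init__(self, velocity_dim=3, hidden_dim=128, grid_dim=256, position_dim=2):
        super().__init__()
        self.rnn = nn.LSTM(velocity_dim, hidden_dim, batch_first=True)
        self.grid_head = nn.Linear(hidden_dim, grid_dim)        # grid cell representation
        self.position_head = nn.Linear(grid_dim, position_dim)  # estimate of the agent's position

    def forward(self, velocity_seq):
        hidden, _ = self.rnn(velocity_seq)              # [batch, time, hidden_dim]
        grid_repr = self.grid_head(hidden)              # [batch, time, grid_dim]
        position_estimate = self.position_head(grid_repr)
        return grid_repr, position_estimate

class ActionSelectionNetwork(nn.Module):
    """Combines the grid cell representation with an observation to score actions."""
    def __init__(self, grid_dim=256, obs_dim=64, num_actions=6):
        super().__init__()
        self.policy = nn.Sequential(
            nn.Linear(grid_dim + obs_dim, 256), nn.ReLU(),
            nn.Linear(256, num_actions),
        )

    def forward(self, grid_repr, observation):
        return self.policy(torch.cat([grid_repr, observation], dim=-1))  # action selection output

# Example wiring with random inputs.
grid_net, policy_net = GridCellNetwork(), ActionSelectionNetwork()
velocities = torch.randn(1, 10, 3)       # velocity of the agent over 10 steps
grid_repr, position = grid_net(velocities)
observations = torch.randn(1, 10, 64)    # observations characterizing the environment state
action_scores = policy_net(grid_repr, observations)
```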
-
Publication No.: US20240119262A1
Publication Date: 2024-04-11
Application No.: US18479775
Filing Date: 2023-10-02
Applicant: DeepMind Technologies Limited
Inventor: Neil Charles Rabinowitz , Guillaume Desjardins , Andrei-Alexandru Rusu , Koray Kavukcuoglu , Raia Thais Hadsell , Razvan Pascanu , James Kirkpatrick , Hubert Josef Soyer
Abstract: Methods and systems for performing a sequence of machine learning tasks. One system includes a sequence of deep neural networks (DNNs), including: a first DNN corresponding to a first machine learning task, wherein the first DNN comprises a first plurality of indexed layers, and each layer in the first plurality of indexed layers is configured to receive a respective layer input and process the layer input to generate a respective layer output; and one or more subsequent DNNs corresponding to one or more respective machine learning tasks, wherein each subsequent DNN comprises a respective plurality of indexed layers, and each layer in a respective plurality of indexed layers with index greater than one receives input from a preceding layer of the respective subsequent DNN, and one or more preceding layers of respective preceding DNNs, wherein a preceding layer is a layer whose index is one less than the current index.
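The lateral wiring described here, in which each layer above the first also reads from the layer whose index is one less in every preceding, already-trained DNN, can be sketched roughly as follows; this is a PyTorch-style illustration with hypothetical layer sizes, not the patent's implementation.

```python
# Illustrative sketch only: layer sizes and names are assumptions, not from the patent.
import torch
import torch.nn as nn

class ProgressiveColumn(nn.Module):
    """One DNN in the sequence. Every layer with index greater than one also receives input
    from the layer whose index is one less in each preceding (earlier-trained) column."""
    def __init__(self, layer_sizes, num_prev_columns):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Linear(layer_sizes[i], layer_sizes[i + 1]) for i in range(len(layer_sizes) - 1)
        )
        # One lateral adapter per (layer above the first, preceding column) pair.
        self.laterals = nn.ModuleList(
            nn.ModuleList(nn.Linear(layer_sizes[i], layer_sizes[i + 1]) for _ in range(num_prev_columns))
            for i in range(1, len(layer_sizes) - 1)
        )

    def forward(self, x, prev_activations=None):
        # prev_activations[c][j] is the output of layer j of preceding column c.
        activations = []
        h = torch.relu(self.layers[0](x))
        activations.append(h)
        for i, layer in enumerate(self.layers[1:], start=1):
            out = layer(h)
            if prev_activations:
                for c, prev in enumerate(prev_activations):
                    out = out + self.laterals[i - 1][c](prev[i - 1])  # lateral input from the preceding layer
            h = torch.relu(out)
            activations.append(h)
        return activations

# The first column is trained on task 1 and then frozen; the second column reads its activations.
col0 = ProgressiveColumn([16, 32, 32, 4], num_prev_columns=0)
col1 = ProgressiveColumn([16, 32, 32, 4], num_prev_columns=1)
x = torch.randn(8, 16)
task2_output = col1(x, prev_activations=[col0(x)])[-1]
```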
-
Publication No.: US20220276056A1
Publication Date: 2022-09-01
Application No.: US17747144
Filing Date: 2022-05-18
Applicant: DeepMind Technologies Limited
Inventor: Andrea Banino , Sudarshan Kumaran , Raia Thais Hadsell , Benigno Uria-Martínez
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a grid cell neural network and an action selection neural network. The grid cell network is configured to: receive an input comprising data characterizing a velocity of the agent; process the input to generate a grid cell representation; and process the grid cell representation to generate an estimate of a position of the agent in the environment; the action selection neural network is configured to: receive an input comprising a grid cell representation and an observation characterizing a state of the environment; and process the input to generate an action selection network output.
-
Publication No.: US20220083869A1
Publication Date: 2022-03-17
Application No.: US17486842
Filing Date: 2021-09-27
Applicant: DeepMind Technologies Limited
Inventor: Razvan Pascanu , Raia Thais Hadsell , Victor Constant Bapst , Wojciech Czarnecki , James Kirkpatrick , Yee Whye Teh , Nicolas Manfred Otto Heess
Abstract: A method is proposed for training a multitask computer system, such as a multitask neural network system. The system comprises a set of trainable workers and a shared module. The trainable workers and shared module are trained on a plurality of different tasks, such that each worker learns to perform a corresponding one of the tasks according to a respective task policy, and said shared policy network learns a multitask policy which represents common behavior for the tasks. The coordinated training is performed by optimizing an objective function comprising, for each task: a reward term indicative of an expected reward earned by a worker in performing the corresponding task according to the task policy; and at least one entropy term which regularizes the distribution of the task policy towards the distribution of the multitask policy.
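A hedged sketch of how the per-task objective described above could be assembled for discrete actions follows; the coefficients, function names, and the exact choice of a KL term plus an entropy bonus are illustrative assumptions, not the patent's specification.

```python
# Illustrative sketch only: the coefficients and the exact form of the regularizers are assumptions.
import torch
import torch.nn.functional as F

def per_task_objective(task_logits, shared_logits, actions, returns, c_kl=0.5, c_ent=0.1):
    """Reward term plus entropy-style terms pulling one worker's task policy toward the shared
    multitask policy, for a batch of discrete actions.

    task_logits, shared_logits: [batch, num_actions] logits from the worker and the shared policy.
    actions: [batch] sampled action indices.  returns: [batch] discounted returns for those actions."""
    task_log_probs = F.log_softmax(task_logits, dim=-1)
    shared_log_probs = F.log_softmax(shared_logits, dim=-1)
    chosen_log_probs = task_log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)

    reward_term = (returns * chosen_log_probs).mean()  # policy-gradient surrogate for expected reward
    task_probs = task_log_probs.exp()
    kl_to_shared = (task_probs * (task_log_probs - shared_log_probs)).sum(-1).mean()
    entropy = -(task_probs * task_log_probs).sum(-1).mean()

    # Maximize reward and entropy while keeping the task policy close to the multitask policy.
    return reward_term - c_kl * kl_to_shared + c_ent * entropy
```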
-
Publication No.: US11132609B2
Publication Date: 2021-09-28
Application No.: US16689020
Filing Date: 2019-11-19
Applicant: DeepMind Technologies Limited
Inventor: Razvan Pascanu , Raia Thais Hadsell , Victor Constant Bapst , Wojciech Czarnecki , James Kirkpatrick , Yee Whye Teh , Nicolas Manfred Otto Heess
Abstract: A method is proposed for training a multitask computer system, such as a multitask neural network system. The system comprises a set of trainable workers and a shared module. The trainable workers and shared module are trained on a plurality of different tasks, such that each worker learns to perform a corresponding one of the tasks according to a respective task policy, and said shared policy network learns a multitask policy which represents common behavior for the tasks. The coordinated training is performed by optimizing an objective function comprising, for each task: a reward term indicative of an expected reward earned by a worker in performing the corresponding task according to the task policy; and at least one entropy term which regularizes the distribution of the task policy towards the distribution of the multitask policy.
-
Publication No.: US20210117786A1
Publication Date: 2021-04-22
Application No.: US17048023
Filing Date: 2019-04-18
Applicant: DeepMind Technologies Limited
Inventor: Jonathan Schwarz , Razvan Pascanu , Raia Thais Hadsell , Wojciech Czarnecki , Yee Whye Teh , Jelena Luketina
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scalable continual learning using neural networks. One of the methods includes receiving new training data for a new machine learning task; training an active subnetwork on the new training data to determine trained values of the active network parameters from initial values of the active network parameters while holding current values of the knowledge parameters fixed; and training a knowledge subnetwork on the new training data to determine updated values of the knowledge parameters from the current values of the knowledge parameters by training the knowledge subnetwork to generate knowledge outputs for the new training inputs that match active outputs generated by the trained active subnetwork for the new training inputs.
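The two-phase update this abstract describes, training the active subnetwork with the knowledge parameters frozen and then updating the knowledge subnetwork to match the active subnetwork's outputs, might look roughly like the following PyTorch-style sketch; the optimizer, loss choices, and function names are assumptions for illustration.

```python
# Illustrative sketch only: optimizers, losses, and step counts are assumptions, not from the patent.
import torch
import torch.nn.functional as F

def learn_new_task(active_net, knowledge_net, new_task_data, task_loss_fn, steps=1000):
    # Phase 1: train the active subnetwork on the new task while the knowledge parameters stay fixed.
    for p in knowledge_net.parameters():
        p.requires_grad_(False)
    opt_active = torch.optim.Adam(active_net.parameters(), lr=1e-3)
    for _, (x, y) in zip(range(steps), new_task_data):
        loss = task_loss_fn(active_net(x), y)
        opt_active.zero_grad()
        loss.backward()
        opt_active.step()

    # Phase 2: update the knowledge subnetwork so its outputs on the new training inputs
    # match the outputs of the trained active subnetwork (a distillation-style objective).
    for p in knowledge_net.parameters():
        p.requires_grad_(True)
    opt_knowledge = torch.optim.Adam(knowledge_net.parameters(), lr=1e-3)
    for _, (x, _y) in zip(range(steps), new_task_data):
        with torch.no_grad():
            target = active_net(x)
        loss = F.mse_loss(knowledge_net(x), target)
        opt_knowledge.zero_grad()
        loss.backward()
        opt_knowledge.step()
```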
-
Publication No.: US20190232489A1
Publication Date: 2019-08-01
Application No.: US16380125
Filing Date: 2019-04-10
Applicant: DeepMind Technologies Limited
Inventor: Razvan Pascanu , Raia Thais Hadsell , Mel Vecerik , Thomas Rothoerl , Andrei-Alexandru Rusu , Nicolas Manfred Otto Heess
CPC classification number: B25J9/163 , B25J9/1671 , G05B13/027 , G06N3/008 , G06N3/0445 , G06N3/0454 , G06N3/08
Abstract: A system includes a neural network system implemented by one or more computers. The neural network system is configured to receive an observation characterizing a current state of a real-world environment being interacted with by a robotic agent to perform a robotic task and to process the observation to generate a policy output that defines an action to be performed by the robotic agent in response to the observation. The neural network system includes: (i) a sequence of deep neural networks (DNNs), in which the sequence of DNNs includes a simulation-trained DNN that has been trained on interactions of a simulated version of the robotic agent with a simulated version of the real-world environment to perform a simulated version of the robotic task, and (ii) a first robot-trained DNN that is configured to receive the observation and to process the observation to generate the policy output.
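Below is a minimal sketch of the described arrangement, a frozen simulation-trained DNN feeding features laterally into a robot-trained DNN that emits the policy output, assuming a PyTorch-style implementation; the layer widths, names, and the single lateral connection are illustrative assumptions.

```python
# Illustrative sketch only: layer widths, names, and the single lateral connection are assumptions.
import torch
import torch.nn as nn

class SimToRealPolicy(nn.Module):
    """A frozen simulation-trained column feeds features laterally into a robot-trained column
    that maps the observation to the policy output."""
    def __init__(self, obs_dim=32, hidden=64, num_actions=4):
        super().__init__()
        # Simulation-trained DNN: assumed pre-trained on the simulated agent and environment, kept frozen.
        self.sim_encoder = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        for p in self.sim_encoder.parameters():
            p.requires_grad_(False)
        # Robot-trained DNN with a lateral adapter reading the simulation features.
        self.robot_encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.lateral = nn.Linear(hidden, hidden)
        self.policy_head = nn.Linear(hidden, num_actions)

    def forward(self, observation):
        sim_features = self.sim_encoder(observation)        # frozen features from simulation training
        robot_features = self.robot_encoder(observation)
        combined = torch.relu(robot_features + self.lateral(sim_features))
        return self.policy_head(combined)                   # policy output defining the action
```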
-
Publication No.: US20240394540A1
Publication Date: 2024-11-28
Application No.: US18674367
Filing Date: 2024-05-24
Applicant: DeepMind Technologies Limited
Inventor: Jonathan Schwarz , Razvan Pascanu , Raia Thais Hadsell , Wojciech Czarnecki , Yee Whye Teh , Jelena Luketina
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scalable continual learning using neural networks. One of the methods includes receiving new training data for a new machine learning task; training an active subnetwork on the new training data to determine trained values of the active network parameters from initial values of the active network parameters while holding current values of the knowledge parameters fixed; and training a knowledge subnetwork on the new training data to determine updated values of the knowledge parameters from the current values of the knowledge parameters by training the knowledge subnetwork to generate knowledge outputs for the new training inputs that match active outputs generated by the trained active subnetwork for the new training inputs.
-
Publication No.: US11983634B2
Publication Date: 2024-05-14
Application No.: US17486842
Filing Date: 2021-09-27
Applicant: DeepMind Technologies Limited
Inventor: Razvan Pascanu , Raia Thais Hadsell , Victor Constant Bapst , Wojciech Czarnecki , James Kirkpatrick , Yee Whye Teh , Nicolas Manfred Otto Heess
Abstract: A method is proposed for training a multitask computer system, such as a multitask neural network system. The system comprises a set of trainable workers and a shared module. The trainable workers and shared module are trained on a plurality of different tasks, such that each worker learns to perform a corresponding one of the tasks according to a respective task policy, and said shared policy network learns a multitask policy which represents common behavior for the tasks. The coordinated training is performed by optimizing an objective function comprising, for each task: a reward term indicative of an expected reward earned by a worker in performing the corresponding task according to the task policy; and at least one entropy term which regularizes the distribution of the task policy towards the distribution of the multitask policy.
-
Publication No.: US20210201116A1
Publication Date: 2021-07-01
Application No.: US17201542
Filing Date: 2021-03-15
Applicant: DeepMind Technologies Limited
Inventor: Neil Charles Rabinowitz , Guillaume Desjardins , Andrei-Alexandru Rusu , Koray Kavukcuoglu , Raia Thais Hadsell , Razvan Pascanu , James Kirkpatrick , Hubert Josef Soyer
Abstract: Methods and systems for performing a sequence of machine learning tasks. One system includes a sequence of deep neural networks (DNNs), including: a first DNN corresponding to a first machine learning task, wherein the first DNN comprises a first plurality of indexed layers, and each layer in the first plurality of indexed layers is configured to receive a respective layer input and process the layer input to generate a respective layer output; and one or more subsequent DNNs corresponding to one or more respective machine learning tasks, wherein each subsequent DNN comprises a respective plurality of indexed layers, and each layer in a respective plurality of indexed layers with index greater than one receives input from a preceding layer of the respective subsequent DNN, and one or more preceding layers of respective preceding DNNs, wherein a preceding layer is a layer whose index is one less than the current index.