-
Publication No.: US20230196146A1
Publication date: 2023-06-22
Application No.: US18168123
Application date: 2023-02-13
Applicant: DeepMind Technologies Limited
Inventor: Yujia Li, Victor Constant Bapst, Vinicius Zambaldi, David Nunes Raposo, Adam Anthony Santoro
Abstract: A neural network system is proposed, including an input network for extracting, from state data, respective entity data for each of a plurality of entities which are present, or at least potentially present, in the environment. The entity data describes the entity. The neural network contains a relational network for parsing this data, which includes one or more attention blocks which may be stacked to perform successive actions on the entity data. The attention blocks each include a respective transform network for each of the entities. The transform network for each entity is able to transform data which the transform network receives for the entity into modified entity data for the entity, based on data for a plurality of the other entities. An output network is arranged to receive data output by the relational network, and use the received data to select a respective action.
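The stacked attention blocks described in this abstract can be illustrated with a minimal NumPy sketch. It is not the patented implementation. It assumes the entity vectors have already been extracted by an input network, uses one set of transform weights shared across entities (the abstract speaks of a respective transform network per entity), and uses max-pooling plus a linear layer as a stand-in output network. All names (`attention_block`, `relational_network`, `select_action`, the weight matrices) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_block(entities, w_q, w_k, w_v, w_out):
    """entities: (n_entities, d) -> modified entity data of the same shape."""
    q = entities @ w_q                          # one query per entity
    k = entities @ w_k                          # one key per entity
    v = entities @ w_v                          # one value per entity
    scores = q @ k.T / np.sqrt(k.shape[-1])     # pairwise entity interactions
    weights = softmax(scores, axis=-1)          # each row sums to 1
    attended = weights @ v                      # mix in data from other entities
    # Assumed transform: shared projection with a residual connection.
    return entities + np.tanh(attended @ w_out)

def relational_network(entities, blocks):
    # Stack the blocks so each one acts on the previous block's output.
    for params in blocks:
        entities = attention_block(entities, *params)
    return entities

def select_action(entities, w_policy):
    # Assumed output network: max-pool over entities, then linear logits.
    pooled = entities.max(axis=0)
    return int(np.argmax(pooled @ w_policy))

rng = np.random.default_rng(0)
n_entities, d, d_k, n_actions = 5, 8, 4, 3
blocks = [
    (rng.normal(size=(d, d_k)), rng.normal(size=(d, d_k)),
     rng.normal(size=(d, d_k)), rng.normal(size=(d_k, d)))
    for _ in range(2)
]
entities = rng.normal(size=(n_entities, d))
out = relational_network(entities, blocks)
action = select_action(out, rng.normal(size=(d, n_actions)))
```

Because every block maps `(n_entities, d)` to the same shape, any number of blocks can be stacked without changing the output network.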
-
Publication No.: US20210334655A1
Publication date: 2021-10-28
Application No.: US17240554
Application date: 2021-04-26
Applicant: DeepMind Technologies Limited
Inventor: Annette Ada Nkechinyere Obika, Tian Xie, Victor Constant Bapst, Alexander Lloyd Gaunt, James Kirkpatrick
Abstract: Methods, computer systems, and apparatus, including computer programs encoded on computer storage media, for predicting one or more properties of a material. One of the methods includes maintaining data specifying a set of known materials each having a respective known physical structure; receiving data specifying a new material; identifying a plurality of known materials in the set of known materials that are similar to the new material; determining a predicted embedding of the new material from at least respective embeddings corresponding to each of the similar known materials; and processing the predicted embedding of the new material using an experimental prediction neural network to predict one or more properties of the new material.
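As a rough illustration of the steps in this abstract (find similar known materials, combine their embeddings, feed the result to a prediction model), the sketch below uses Euclidean distance on feature vectors and an inverse-distance weighted average. Both choices are assumptions, as are the function names and the linear `property_head` standing in for the experimental prediction neural network. The abstract does not specify the similarity measure or how embeddings are combined.

```python
import numpy as np

def predict_embedding(new_features, known_features, known_embeddings, k=3):
    """Estimate an embedding for a new material from its k nearest known materials."""
    # Assumed similarity measure: Euclidean distance in feature space.
    dists = np.linalg.norm(known_features - new_features, axis=1)
    nearest = np.argsort(dists)[:k]
    # Assumed combination rule: inverse-distance weighted average.
    weights = 1.0 / (dists[nearest] + 1e-8)
    weights /= weights.sum()
    return weights @ known_embeddings[nearest], nearest

def property_head(embedding, w, b):
    # Hypothetical stand-in for the experimental prediction neural network.
    return embedding @ w + b

known_features = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
known_embeddings = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [9.0, 9.0]])
embedding, neighbours = predict_embedding(
    np.array([0.0, 0.0]), known_features, known_embeddings, k=1
)
predicted = property_head(embedding, np.array([2.0, 3.0]), 0.5)
```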
-
Publication No.: US20250094772A1
Publication date: 2025-03-20
Application No.: US18962266
Application date: 2024-11-27
Applicant: DeepMind Technologies Limited
Inventor: Ziyu Wang, Nicolas Manfred Otto Heess, Victor Constant Bapst
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network. One of the methods includes maintaining a replay memory that stores trajectories generated as a result of interaction of an agent with an environment; and training an action selection neural network having policy parameters on the trajectories in the replay memory, wherein training the action selection neural network comprises: sampling a trajectory from the replay memory; and adjusting current values of the policy parameters by training the action selection neural network on the trajectory using an off-policy actor critic reinforcement learning technique.
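The loop this abstract outlines (store trajectories, sample one, update the policy off-policy) can be sketched as follows. This is a deliberately simple tabular variant under stated assumptions: a one-step TD critic, a softmax policy over logits `theta`, and truncated importance weights to correct for the behaviour policy. It is not claimed to be the specific technique in the patent; `ReplayMemory`, `off_policy_actor_critic_step`, and the trajectory tuple layout are hypothetical.

```python
import random
from collections import deque
import numpy as np

class ReplayMemory:
    """Fixed-capacity store of whole trajectories; oldest are evicted first."""
    def __init__(self, capacity):
        self.trajectories = deque(maxlen=capacity)

    def add(self, trajectory):
        self.trajectories.append(trajectory)

    def sample(self, rng):
        return rng.choice(list(self.trajectories))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def off_policy_actor_critic_step(theta, values, trajectory,
                                 lr=0.1, gamma=0.99, clip=10.0):
    """Update policy logits theta (n_states, n_actions) and critic values in place.

    Each step is (state, action, reward, next_state, done, behaviour_prob).
    """
    for s, a, r, s_next, done, mu in trajectory:
        pi = softmax(theta[s])
        rho = min(clip, pi[a] / mu)            # truncated importance weight
        target = r + (0.0 if done else gamma * values[s_next])
        td_error = target - values[s]
        grad_log = -pi                         # gradient of log pi(a|s) w.r.t. logits
        grad_log[a] += 1.0
        theta[s] += lr * rho * td_error * grad_log   # actor update
        values[s] += lr * rho * td_error             # critic update
    return theta, values

memory = ReplayMemory(capacity=2)
# One-state toy problem: action 1 pays 1.0, action 0 pays 0.0, uniform behaviour.
memory.add([(0, 1, 1.0, 0, True, 0.5), (0, 0, 0.0, 0, True, 0.5)])
theta, values = np.zeros((1, 2)), np.zeros(1)
trajectory = memory.sample(random.Random(0))
theta, values = off_policy_actor_critic_step(theta, values, trajectory)
```

After this single update the logit for the rewarded action exceeds the other one, which is the expected direction of learning in the toy problem.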
-
Publication No.: US20190354885A1
Publication date: 2019-11-21
Application No.: US16417580
Application date: 2019-05-20
Applicant: DeepMind Technologies Limited
Inventor: Yujia Li, Victor Constant Bapst, Vinicius Zambaldi, David Nunes Raposo, Adam Anthony Santoro
Abstract: A neural network system is proposed, including an input network for extracting, from state data, respective entity data for each of a plurality of entities which are present, or at least potentially present, in the environment. The entity data describes the entity. The neural network contains a relational network for parsing this data, which includes one or more attention blocks which may be stacked to perform successive actions on the entity data. The attention blocks each include a respective transform network for each of the entities. The transform network for each entity is able to transform data which the transform network receives for the entity into modified entity data for the entity, based on data for a plurality of the other entities. An output network is arranged to receive data output by the relational network, and use the received data to select a respective action.
-
Publication No.: US11983634B2
Publication date: 2024-05-14
Application No.: US17486842
Application date: 2021-09-27
Applicant: DeepMind Technologies Limited
Inventor: Razvan Pascanu, Raia Thais Hadsell, Victor Constant Bapst, Wojciech Czarnecki, James Kirkpatrick, Yee Whye Teh, Nicolas Manfred Otto Heess
Abstract: A method is proposed for training a multitask computer system, such as a multitask neural network system. The system comprises a set of trainable workers and a shared module. The trainable workers and shared module are trained on a plurality of different tasks, such that each worker learns to perform a corresponding one of the tasks according to a respective task policy, and said shared policy network learns a multitask policy which represents common behavior for the tasks. The coordinated training is performed by optimizing an objective function comprising, for each task: a reward term indicative of an expected reward earned by a worker in performing the corresponding task according to the task policy; and at least one entropy term which regularizes the distribution of the task policy towards the distribution of the multitask policy.
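One way to write an objective of the kind this abstract describes is given below as a sketch. The notation is assumed, not taken from the patent: $\pi_i$ is the task policy of worker $i$, $\pi_0$ the multitask policy, $r_i$ the reward for task $i$, $\gamma$ a discount factor, and $\alpha > 0$ a regularization weight. The KL term is one possible form of an entropy term that pulls each task policy towards the multitask policy.

```latex
J\bigl(\pi_0, \{\pi_i\}\bigr)
  = \sum_{i} \mathbb{E}_{\pi_i}\!\left[
      \sum_{t \ge 0} \gamma^{t} \Bigl(
        r_i(s_t, a_t)
        - \alpha \, \mathrm{KL}\bigl(\pi_i(\cdot \mid s_t) \,\big\|\, \pi_0(\cdot \mid s_t)\bigr)
      \Bigr)
    \right]
```

Maximizing $J$ over both $\pi_0$ and the $\pi_i$ rewards each worker for task return while penalizing divergence from the shared policy, so $\pi_0$ comes to represent behavior common to the tasks.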
-
Publication No.: US11580429B2
Publication date: 2023-02-14
Application No.: US16417580
Application date: 2019-05-20
Applicant: DeepMind Technologies Limited
Inventor: Yujia Li, Victor Constant Bapst, Vinicius Zambaldi, David Nunes Raposo, Adam Anthony Santoro
Abstract: A neural network system is proposed, including an input network for extracting, from state data, respective entity data for each of a plurality of entities which are present, or at least potentially present, in the environment. The entity data describes the entity. The neural network contains a relational network for parsing this data, which includes one or more attention blocks which may be stacked to perform successive actions on the entity data. The attention blocks each include a respective transform network for each of the entities. The transform network for each entity is able to transform data which the transform network receives for the entity into modified entity data for the entity, based on data for a plurality of the other entities. An output network is arranged to receive data output by the relational network, and use the received data to select a respective action.
-
Publication No.: US20200293862A1
Publication date: 2020-09-17
Application No.: US16885918
Application date: 2020-05-28
Applicant: DeepMind Technologies Limited
Inventor: Ziyu Wang, Nicolas Manfred Otto Heess, Victor Constant Bapst
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network. One of the methods includes maintaining a replay memory that stores trajectories generated as a result of interaction of an agent with an environment; and training an action selection neural network having policy parameters on the trajectories in the replay memory, wherein training the action selection neural network comprises: sampling a trajectory from the replay memory; and adjusting current values of the policy parameters by training the action selection neural network on the trajectory using an off-policy actor critic reinforcement learning technique.
-
Publication No.: US12190223B2
Publication date: 2025-01-07
Application No.: US16885918
Application date: 2020-05-28
Applicant: DeepMind Technologies Limited
Inventor: Ziyu Wang, Nicolas Manfred Otto Heess, Victor Constant Bapst
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network. One of the methods includes maintaining a replay memory that stores trajectories generated as a result of interaction of an agent with an environment; and training an action selection neural network having policy parameters on the trajectories in the replay memory, wherein training the action selection neural network comprises: sampling a trajectory from the replay memory; and adjusting current values of the policy parameters by training the action selection neural network on the trajectory using an off-policy actor critic reinforcement learning technique.
-
Publication No.: US20200090048A1
Publication date: 2020-03-19
Application No.: US16689020
Application date: 2019-11-19
Applicant: DeepMind Technologies Limited
Inventor: Razvan Pascanu, Raia Thais Hadsell, Victor Constant Bapst, Wojciech Czarnecki, James Kirkpatrick, Yee Whye Teh, Nicolas Manfred Otto Heess
Abstract: A method is proposed for training a multitask computer system, such as a multitask neural network system. The system comprises a set of trainable workers and a shared module. The trainable workers and shared module are trained on a plurality of different tasks, such that each worker learns to perform a corresponding one of the tasks according to a respective task policy, and said shared policy network learns a multitask policy which represents common behavior for the tasks. The coordinated training is performed by optimizing an objective function comprising, for each task: a reward term indicative of an expected reward earned by a worker in performing the corresponding task according to the task policy; and at least one entropy term which regularizes the distribution of the task policy towards the distribution of the multitask policy.
-
Publication No.: US20190258918A1
Publication date: 2019-08-22
Application No.: US16402687
Application date: 2019-05-03
Applicant: DeepMind Technologies Limited
Inventor: Ziyu Wang, Nicolas Manfred Otto Heess, Victor Constant Bapst
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network. One of the methods includes maintaining a replay memory that stores trajectories generated as a result of interaction of an agent with an environment; and training an action selection neural network having policy parameters on the trajectories in the replay memory, wherein training the action selection neural network comprises: sampling a trajectory from the replay memory; and adjusting current values of the policy parameters by training the action selection neural network on the trajectory using an off-policy actor critic reinforcement learning technique.