-
1.
公开(公告)号:US20240220774A1
公开(公告)日:2024-07-04
申请号:US18536065
申请日:2023-12-11
Applicant: DeepMind Technologies Limited
Inventor: Iain Robert Dunning , Wojciech Czarnecki , Maxwell Elliot Jaderberg
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for reinforcement learning. One of the methods includes selecting an action to be performed by the agent using both a slow updating recurrent neural network and a fast updating recurrent neural network that receives a fast updating input that includes the hidden state of the slow updating recurrent neural network.
-
2.
公开(公告)号:US11868894B2
公开(公告)日:2024-01-09
申请号:US18149771
申请日:2023-01-04
Applicant: DeepMind Technologies Limited
Inventor: Hubert Josef Soyer , Lasse Espeholt , Karen Simonyan , Yotam Doron , Vlad Firoiu , Volodymyr Mnih , Koray Kavukcuoglu , Remi Munos , Thomas Ward , Timothy James Alexander Harley , Iain Robert Dunning
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.
-
公开(公告)号:US11842261B2
公开(公告)日:2023-12-12
申请号:US17121679
申请日:2020-12-14
Applicant: DeepMind Technologies Limited
Inventor: Iain Robert Dunning , Wojciech Czarnecki , Maxwell Elliot Jaderberg
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for reinforcement learning. One of the methods includes selecting an action to be performed by the agent using both a slow updating recurrent neural network and a fast updating recurrent neural network that receives a fast updating input that includes the hidden state of the slow updating recurrent neural network.
-
4.
公开(公告)号:US11593646B2
公开(公告)日:2023-02-28
申请号:US16767049
申请日:2019-02-05
Applicant: DeepMind Technologies Limited
Inventor: Hubert Josef Soyer , Lasse Espeholt , Karen Simonyan , Yotam Doron , Vlad Firoiu , Volodymyr Mnih , Koray Kavukcuoglu , Remi Munos , Thomas Ward , Timothy James Alexander Harley , Iain Robert Dunning
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.
-
公开(公告)号:US20210097373A1
公开(公告)日:2021-04-01
申请号:US17121679
申请日:2020-12-14
Applicant: DeepMind Technologies Limited
Inventor: Iain Robert Dunning , Wojciech Czarnecki , Maxwell Elliot Jaderberg
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for reinforcement learning. One of the methods includes selecting an action to be performed by the agent using both a slow updating recurrent neural network and a fast updating recurrent neural network that receives a fast updating input that includes the hidden state of the slow updating recurrent neural network.
-
6.
公开(公告)号:US12299574B2
公开(公告)日:2025-05-13
申请号:US18487428
申请日:2023-10-16
Applicant: DeepMind Technologies Limited
Inventor: Hubert Josef Soyer , Lasse Espeholt , Karen Simonyan , Yotam Doron , Vlad Firoiu , Volodymyr Mnih , Koray Kavukcuoglu , Remi Munos , Thomas Ward , Timothy James Alexander Harley , Iain Robert Dunning
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.
-
7.
公开(公告)号:US20240127060A1
公开(公告)日:2024-04-18
申请号:US18487428
申请日:2023-10-16
Applicant: DeepMind Technologies Limited
Inventor: Hubert Josef Soyer , Lasse Espeholt , Karen Simonyan , Yotam Doron , Vlad Firoiu , Volodymyr Mnih , Koray Kavukcuoglu , Remi Munos , Thomas Ward , Timothy James Alexander Harley , Iain Robert Dunning
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.
-
8.
公开(公告)号:US20230153617A1
公开(公告)日:2023-05-18
申请号:US18149771
申请日:2023-01-04
Applicant: DeepMind Technologies Limited
Inventor: Hubert Josef Soyer , Lasse Espeholt , Karen Simonyan , Yotam Doron , Vlad Firoiu , Volodymyr Mnih , Koray Kavukcuoglu , Remi Munos , Thomas Ward , Timothy James Alexander Harley , Iain Robert Dunning
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.
-
9.
公开(公告)号:US20210034970A1
公开(公告)日:2021-02-04
申请号:US16767049
申请日:2019-02-05
Applicant: DeepMind Technologies Limited
Inventor: Hubert Josef Soyer , Lasse Espeholt , Karen Simonyan , Yotam Doron , Vlad Firoiu , Volodymyr Mnih , Koray Kavukcuoglu , Remi Munos , Thomas Ward , Timothy James Alexander Harley , Iain Robert Dunning
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.
-
-
-
-
-
-
-
-