-
公开(公告)号:US11334792B2
公开(公告)日:2022-05-17
申请号:US16403388
申请日:2019-05-03
Applicant: DeepMind Technologies Limited
Inventor: Volodymyr Mnih , Adria Puigdomenech Badia , Alexander Benjamin Graves , Timothy James Alexander Harley , David Silver , Koray Kavukcuoglu
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for asynchronous deep reinforcement learning. One of the systems includes a plurality of workers, wherein each worker is configured to operate independently of each other worker, and wherein each worker is associated with a respective actor that interacts with a respective replica of the environment during the training of the deep neural network.
-
12.
公开(公告)号:US11868894B2
公开(公告)日:2024-01-09
申请号:US18149771
申请日:2023-01-04
Applicant: DeepMind Technologies Limited
Inventor: Hubert Josef Soyer , Lasse Espeholt , Karen Simonyan , Yotam Doron , Vlad Firoiu , Volodymyr Mnih , Koray Kavukcuoglu , Remi Munos , Thomas Ward , Timothy James Alexander Harley , Iain Robert Dunning
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.
-
公开(公告)号:US20210097443A1
公开(公告)日:2021-04-01
申请号:US16586236
申请日:2019-09-27
Applicant: DeepMind Technologies Limited
Inventor: Ang Li , Valentin Clement Dalibard , David Budden , Ola Spyra , Maxwell Elliot Jaderberg , Timothy James Alexander Harley , Sagi Perel , Chenjie Gu , Pramod Gupta
IPC: G06N20/20 , G06N5/04 , G06F16/901
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model. A method includes: maintaining a plurality of training sessions; assigning, to each worker of one or more workers, a respective training session of the plurality of training sessions; repeatedly performing operations until meeting one or more termination criteria, the operations comprising: receiving an updated training session from a respective worker of the one or more workers, selecting a second training session, selecting, based on comparing the updated training session and the second training session using a fitness evaluation function, either the updated training session or the second training session as a parent training session, generating a child training session from the selected parent training session, and assigning the child training session to an available worker, and selecting a candidate model to be a trained model for the machine learning model.
-
公开(公告)号:US10936946B2
公开(公告)日:2021-03-02
申请号:US15349950
申请日:2016-11-11
Applicant: DeepMind Technologies Limited
Inventor: Volodymyr Mnih , Adrià Puigdomènech Badia , Alexander Benjamin Graves , Timothy James Alexander Harley , David Silver , Koray Kavukcuoglu
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for asynchronous deep reinforcement learning. One of the systems includes a plurality of workers, wherein each worker is configured to operate independently of each other worker, and wherein each worker is associated with a respective actor that interacts with a respective replica of the environment during the training of the deep neural network.
-
公开(公告)号:US20190258929A1
公开(公告)日:2019-08-22
申请号:US16403388
申请日:2019-05-03
Applicant: DeepMind Technologies Limited
Inventor: Volodymyr Mnih , Adria Puigdomenech Badia , Alexander Benjamin Graves , Timothy James Alexander Harley , David Silver , Koray Kavukcuoglu
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for asynchronous deep reinforcement learning. One of the systems includes a plurality of workers, wherein each worker is configured to operate independently of each other worker, and wherein each worker is associated with a respective actor that interacts with a respective replica of the environment during the training of the deep neural network.
-
16.
公开(公告)号:US12299574B2
公开(公告)日:2025-05-13
申请号:US18487428
申请日:2023-10-16
Applicant: DeepMind Technologies Limited
Inventor: Hubert Josef Soyer , Lasse Espeholt , Karen Simonyan , Yotam Doron , Vlad Firoiu , Volodymyr Mnih , Koray Kavukcuoglu , Remi Munos , Thomas Ward , Timothy James Alexander Harley , Iain Robert Dunning
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.
-
公开(公告)号:US20240362481A1
公开(公告)日:2024-10-31
申请号:US18662481
申请日:2024-05-13
Applicant: DeepMind Technologies Limited
Inventor: Volodymyr Mnih , Adrià Puigdomènech Badia , Alexander Benjamin Graves , Timothy James Alexander Harley , David Silver , Koray Kavukcuoglu
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for asynchronous deep reinforcement learning. One of the systems includes a plurality of workers, wherein each worker is configured to operate independently of each other worker, and wherein each worker is associated with a respective actor that interacts with a respective replica of the environment during the training of the deep neural network.
-
18.
公开(公告)号:US20240127060A1
公开(公告)日:2024-04-18
申请号:US18487428
申请日:2023-10-16
Applicant: DeepMind Technologies Limited
Inventor: Hubert Josef Soyer , Lasse Espeholt , Karen Simonyan , Yotam Doron , Vlad Firoiu , Volodymyr Mnih , Koray Kavukcuoglu , Remi Munos , Thomas Ward , Timothy James Alexander Harley , Iain Robert Dunning
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.
-
19.
公开(公告)号:US20230153617A1
公开(公告)日:2023-05-18
申请号:US18149771
申请日:2023-01-04
Applicant: DeepMind Technologies Limited
Inventor: Hubert Josef Soyer , Lasse Espeholt , Karen Simonyan , Yotam Doron , Vlad Firoiu , Volodymyr Mnih , Koray Kavukcuoglu , Remi Munos , Thomas Ward , Timothy James Alexander Harley , Iain Robert Dunning
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.
-
公开(公告)号:US20210166127A1
公开(公告)日:2021-06-03
申请号:US17170316
申请日:2021-02-08
Applicant: DeepMind Technologies Limited
Inventor: Volodymyr Mnih , Adrià Puigdomènech Badia , Alexander Benjamin Graves , Timothy James Alexander Harley , David Silver , Koray Kavukcuoglu
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for asynchronous deep reinforcement learning. One of the systems includes a plurality of workers, wherein each worker is configured to operate independently of each other worker, and wherein each worker is associated with a respective actor that interacts with a respective replica of the environment during the training of the deep neural network.
-
-
-
-
-
-
-
-
-