-
Publication No.: US20230079338A1
Publication date: 2023-03-16
Application No.: US17766854
Application date: 2020-10-08
Applicant: DeepMind Technologies Limited
Inventor: Eren Sezener, Joel William Veness, Marcus Hutter, Jianan Wang, David Budden
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable storage media, for training a neural network to control a real-world agent interacting with a real-world environment to cause the real-world agent to perform a particular task. One of the methods includes training the neural network to determine first values of the parameters by optimizing a first task-specific objective that measures a performance of the policy neural network in controlling a simulated version of the real-world agent; obtaining real-world data generated from interactions of the real-world agent with the real-world environment; and training the neural network to determine trained values of the parameters from the first values of the parameters by jointly optimizing (i) a self-supervised objective that measures at least a performance of internal representations generated by the neural network on a self-supervised task performed on the real-world data and (ii) a second task-specific objective.
-
Publication No.: US20210097443A1
Publication date: 2021-04-01
Application No.: US16586236
Application date: 2019-09-27
Applicant: DeepMind Technologies Limited
Inventor: Ang Li, Valentin Clement Dalibard, David Budden, Ola Spyra, Maxwell Elliot Jaderberg, Timothy James Alexander Harley, Sagi Perel, Chenjie Gu, Pramod Gupta
IPC: G06N20/20, G06N5/04, G06F16/901
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model. A method includes: maintaining a plurality of training sessions; assigning, to each worker of one or more workers, a respective training session of the plurality of training sessions; repeatedly performing operations until meeting one or more termination criteria, the operations comprising: receiving an updated training session from a respective worker of the one or more workers, selecting a second training session, selecting, based on comparing the updated training session and the second training session using a fitness evaluation function, either the updated training session or the second training session as a parent training session, generating a child training session from the selected parent training session, and assigning the child training session to an available worker, and selecting a candidate model to be a trained model for the machine learning model.
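The loop described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and does not come from the patent: the `Session` class, the single `learning_rate` hyperparameter, the toy fitness function, the mutation factors, and the fixed round count used as the termination criterion.

```python
import random
from dataclasses import dataclass


@dataclass
class Session:
    # Hypothetical stand-in for a "training session": one hyperparameter and a score.
    learning_rate: float
    fitness: float = 0.0


def train_step(session):
    # Toy "worker": fitness is highest when learning_rate is near 0.1 (invented objective).
    return Session(session.learning_rate, -abs(session.learning_rate - 0.1))


def make_child(parent, rng):
    # The child copies the parent and perturbs its hyperparameter.
    return Session(parent.learning_rate * rng.choice([0.8, 1.25]))


def evolve(num_sessions=8, num_rounds=200, seed=0):
    rng = random.Random(seed)
    # Maintain a plurality of training sessions, each evaluated once up front.
    population = [train_step(Session(rng.uniform(0.001, 1.0)))
                  for _ in range(num_sessions)]
    for _ in range(num_rounds):  # termination criterion: fixed number of rounds
        # A worker reports an updated training session.
        idx = rng.randrange(len(population))
        updated = train_step(population[idx])
        # Select a second session and compare the two with the fitness function.
        second = rng.choice(population)
        parent = updated if updated.fitness >= second.fitness else second
        # Generate a child from the parent and hand it to the now-free worker slot.
        population[idx] = train_step(make_child(parent, rng))
    # Select the candidate with the best fitness as the trained model.
    return max(population, key=lambda s: s.fitness)


best = evolve()
```

In this sketch the worker is simulated synchronously; the abstract leaves open how workers are scheduled and how the second session is chosen.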
-
Publication No.: US20230252288A1
Publication date: 2023-08-10
Application No.: US18131753
Application date: 2023-04-06
Applicant: DeepMind Technologies Limited
Inventor: David Budden, Gabriel Barth-Maron, John Quan, Daniel George Horgan
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. One of the systems includes (i) a plurality of actor computing units, in which each of the actor computing units is configured to maintain a respective replica of the action selection neural network and to perform a plurality of actor operations, and (ii) one or more learner computing units, in which each of the one or more learner computing units is configured to perform a plurality of learner operations.
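The actor/learner split described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say what the actor or learner operations are, so the shared replay buffer, the periodic parameter sync, the epsilon-greedy rule, and the two-action toy environment are all hypothetical, and the "network" is reduced to a list of two action values.

```python
import random
from collections import deque


class Learner:
    def __init__(self):
        self.q = [0.0, 0.0]  # toy "network": one value estimate per action

    def update(self, batch, lr=0.1):
        # Learner operation (assumed): move each action value toward the observed reward.
        for action, reward in batch:
            self.q[action] += lr * (reward - self.q[action])


class Actor:
    def __init__(self, learner, epsilon, rng):
        self.q = list(learner.q)  # local replica of the learner's parameters
        self.epsilon = epsilon
        self.rng = rng

    def sync(self, learner):
        self.q = list(learner.q)

    def act(self):
        # Actor operation (assumed): epsilon-greedy choice using the local replica.
        if self.rng.random() < self.epsilon:
            action = self.rng.randrange(2)
        else:
            action = max(range(2), key=lambda a: self.q[a])
        reward = 1.0 if action == 1 else 0.0  # invented environment
        return action, reward


def run(num_actors=4, steps=200, seed=0):
    rng = random.Random(seed)
    learner = Learner()
    replay = deque(maxlen=1000)  # shared experience buffer (assumption)
    actors = [Actor(learner, 0.3, rng) for _ in range(num_actors)]
    for step in range(steps):
        for actor in actors:
            replay.append(actor.act())
            if step % 10 == 0:
                actor.sync(learner)  # refresh the replica periodically
        learner.update(rng.sample(list(replay), k=min(8, len(replay))))
    return learner.q


q = run()
```

The actors run in a plain loop here for readability; in a distributed setting each would be a separate computing unit, which this sketch does not attempt to model.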
-
Publication No.: US11625604B2
Publication date: 2023-04-11
Application No.: US16641751
Application date: 2018-10-29
Applicant: DeepMind Technologies Limited
Inventor: David Budden, Gabriel Barth-Maron, John Quan, Daniel George Horgan
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. One of the systems includes (i) a plurality of actor computing units, in which each of the actor computing units is configured to maintain a respective replica of the action selection neural network and to perform a plurality of actor operations, and (ii) one or more learner computing units, in which each of the one or more learner computing units is configured to perform a plurality of learner operations.
-
Publication No.: US20200265305A1
Publication date: 2020-08-20
Application No.: US16641751
Application date: 2018-10-29
Applicant: DeepMind Technologies Limited
Inventor: David Budden, Gabriel Barth-Maron, John Quan, Daniel George Horgan
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. One of the systems includes (i) a plurality of actor computing units, in which each of the actor computing units is configured to maintain a respective replica of the action selection neural network and to perform a plurality of actor operations, and (ii) one or more learner computing units, in which each of the one or more learner computing units is configured to perform a plurality of learner operations.
-
Publication No.: US11907821B2
Publication date: 2024-02-20
Application No.: US16586236
Application date: 2019-09-27
Applicant: DeepMind Technologies Limited
Inventor: Ang Li, Valentin Clement Dalibard, David Budden, Ola Spyra, Maxwell Elliot Jaderberg, Timothy James Alexander Harley, Sagi Perel, Chenjie Gu, Pramod Gupta
IPC: G06N20/20, G06F16/901, G06N5/04
CPC classification number: G06N20/20, G06F16/9024, G06N5/04
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model. A method includes: maintaining a plurality of training sessions; assigning, to each worker of one or more workers, a respective training session of the plurality of training sessions; repeatedly performing operations until meeting one or more termination criteria, the operations comprising: receiving an updated training session from a respective worker of the one or more workers, selecting a second training session, selecting, based on comparing the updated training session and the second training session using a fitness evaluation function, either the updated training session or the second training session as a parent training session, generating a child training session from the selected parent training session, and assigning the child training session to an available worker, and selecting a candidate model to be a trained model for the machine learning model.
-
Publication No.: US11481629B2
Publication date: 2022-10-25
Application No.: US16759519
Application date: 2018-10-29
Applicant: DeepMind Technologies Limited
Inventor: David Budden, Matthew William Hoffman, Gabriel Barth-Maron
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by a reinforcement learning agent interacting with an environment. In particular, the actions are selected from a continuous action space and the system trains the action selection neural network jointly with a distribution Q network that is used to update the parameters of the action selection neural network.
-
Publication No.: US11948085B2
Publication date: 2024-04-02
Application No.: US18303117
Application date: 2023-04-19
Applicant: DeepMind Technologies Limited
Inventor: David Budden, Matthew William Hoffman, Gabriel Barth-Maron
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by a reinforcement learning agent interacting with an environment. In particular, the actions are selected from a continuous action space and the system trains the action selection neural network jointly with a distribution Q network that is used to update the parameters of the action selection neural network.
-
Publication No.: US20230409907A1
Publication date: 2023-12-21
Application No.: US18303117
Application date: 2023-04-19
Applicant: DeepMind Technologies Limited
Inventor: David Budden, Matthew William Hoffman, Gabriel Barth-Maron
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by a reinforcement learning agent interacting with an environment. In particular, the actions are selected from a continuous action space and the system trains the action selection neural network jointly with a distribution Q network that is used to update the parameters of the action selection neural network.
-
Publication No.: US20230020071A1
Publication date: 2023-01-19
Application No.: US17945622
Application date: 2022-09-15
Applicant: DeepMind Technologies Limited
Inventor: David Budden, Matthew William Hoffman, Gabriel Barth-Maron
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by a reinforcement learning agent interacting with an environment. In particular, the actions are selected from a continuous action space and the system trains the action selection neural network jointly with a distribution Q network that is used to update the parameters of the action selection neural network.