Patent search ap:("Deepmind Technologies Limited") AND inv:"David Budden" Page 1

1.

Invention application
GATED LINEAR CONTEXTUAL BANDITS (in force)

Publication (announcement) number: US20230079338A1

Publication (announcement) date: 2023-03-16

Application number: US17766854

Application date: 2020-10-08

Applicant: DeepMind Technologies Limited

Inventor: Eren Sezener, Joel William Veness, Marcus Hutter, Jianan Wang, David Budden

IPC: G06N3/00, G06N3/063

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable storage media, for training a neural network to control a real-world agent interacting with a real-world environment to cause the real-world agent to perform a particular task. One of the methods includes training the neural network to determine first values of the parameters by optimizing a first task-specific objective that measures a performance of the policy neural network in controlling a simulated version of the real-world agent; obtaining real-world data generated from interactions of the real-world agent with the real-world environment; and training the neural network to determine trained values of the parameters from the first values of the parameters by jointly optimizing (i) a self-supervised objective that measures at least a performance of internal representations generated by the neural network on a self-supervised task performed on the real-world data and (ii) a second task-specific objective.

2.

Invention application
POPULATION-BASED TRAINING OF MACHINE LEARNING MODELS (in force)

Publication (announcement) number: US20210097443A1

Publication (announcement) date: 2021-04-01

Application number: US16586236

Application date: 2019-09-27

Applicant: DeepMind Technologies Limited

Inventor: Ang Li, Valentin Clement Dalibard, David Budden, Ola Spyra, Maxwell Elliot Jaderberg, Timothy James Alexander Harley, Sagi Perel, Chenjie Gu, Pramod Gupta

IPC: G06N20/20, G06N5/04, G06F16/901

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model. A method includes: maintaining a plurality of training sessions; assigning, to each worker of one or more workers, a respective training session of the plurality of training sessions; repeatedly performing operations until meeting one or more termination criteria, the operations comprising: receiving an updated training session from a respective worker of the one or more workers, selecting a second training session, selecting, based on comparing the updated training session and the second training session using a fitness evaluation function, either the updated training session or the second training session as a parent training session, generating a child training session from the selected parent training session, and assigning the child training session to an available worker, and selecting a candidate model to be a trained model for the machine learning model.
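The loop described in this abstract can be illustrated with a minimal, single-process sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the `Session` class, the toy quadratic objective, the learning-rate perturbation in `make_child`, and the random choice of which worker reports back.

```python
import random
from dataclasses import dataclass


@dataclass
class Session:
    """Hypothetical training session: one hyperparameter plus one model weight."""
    lr: float
    weight: float = 0.0


def train_step(s: Session) -> Session:
    # Toy "worker": one gradient step on f(w) = (w - 3)^2.
    grad = 2.0 * (s.weight - 3.0)
    return Session(lr=s.lr, weight=s.weight - s.lr * grad)


def fitness(s: Session) -> float:
    # Fitness evaluation function: higher is better.
    return -(s.weight - 3.0) ** 2


def make_child(parent: Session, rng: random.Random) -> Session:
    # Child copies the parent's weight and perturbs its learning rate,
    # clamped to [1e-4, 0.5] so the toy gradient step never diverges.
    lr = min(0.5, max(1e-4, parent.lr * rng.choice([0.8, 1.25])))
    return Session(lr=lr, weight=parent.weight)


def population_based_training(num_sessions=4, steps=200, seed=0) -> Session:
    rng = random.Random(seed)
    population = [Session(lr=rng.uniform(0.001, 0.1)) for _ in range(num_sessions)]
    for _ in range(steps):
        i = rng.randrange(num_sessions)            # a worker reports back
        updated = train_step(population[i])        # its updated training session
        second = population[rng.randrange(num_sessions)]  # second session
        parent = updated if fitness(updated) >= fitness(second) else second
        population[i] = make_child(parent, rng)    # child goes to the free worker
    return max(population, key=fitness)            # candidate model


best = population_based_training()
print(best)
```

In this sketch the workers are simulated sequentially; a real deployment would presumably run them in parallel and receive updated sessions asynchronously.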

3.

Invention publication
REINFORCEMENT LEARNING USING DISTRIBUTED PRIORITIZED REPLAY (under examination, published)

Publication (announcement) number: US20230252288A1

Publication (announcement) date: 2023-08-10

Application number: US18131753

Application date: 2023-04-06

Applicant: DeepMind Technologies Limited

Inventor: David Budden, Gabriel Barth-Maron, John Quan, Daniel George Horgan

IPC: G06N3/08, G06N20/00, G06N3/088, G06N3/04

CPC classification number: G06N3/08, G06N20/00, G06N3/088, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. One of the systems includes (i) a plurality of actor computing units, in which each of the actor computing units is configured to maintain a respective replica of the action selection neural network and to perform a plurality of actor operations, and (ii) one or more learner computing units, in which each of the one or more learner computing units is configured to perform a plurality of learner operations.
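The abstract does not spell out the actor and learner operations, so the following is only a rough single-process sketch of one way an actor/learner split around a shared prioritized replay memory could look. The `PrioritizedReplay` class, the proportional sampling rule, the scalar "network" `w`, the toy target `2 * x`, and the refresh interval are all assumptions made for illustration.

```python
import random


class PrioritizedReplay:
    """Hypothetical shared replay memory; sampling is proportional to priority."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []        # stored transitions
        self.priorities = []   # one positive priority per transition
        self.rng = random.Random(seed)

    def add(self, transition, priority):
        if len(self.items) >= self.capacity:   # drop the oldest entry when full
            self.items.pop(0)
            self.priorities.pop(0)
        self.items.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size):
        idx = self.rng.choices(range(len(self.items)),
                               weights=self.priorities, k=batch_size)
        return idx, [self.items[i] for i in idx]

    def update(self, indices, new_priorities):
        for i, p in zip(indices, new_priorities):
            self.priorities[i] = p


def run(num_actors=3, steps=50, seed=0):
    rng = random.Random(seed)
    replay = PrioritizedReplay(capacity=100, seed=seed)
    learner_w = 0.0                         # learner's copy of the parameter
    actor_w = [learner_w] * num_actors      # each actor keeps its own replica
    for step in range(steps):
        # Actor operations: generate a transition and an initial priority
        # using the actor's (possibly stale) replica.
        for a in range(num_actors):
            x = rng.uniform(-1.0, 1.0)
            target = 2.0 * x                # toy environment signal
            error = target - actor_w[a] * x
            replay.add((x, target), abs(error) + 1e-3)
        # Learner operations: sample by priority, update, refresh priorities.
        idx, batch = replay.sample(batch_size=8)
        new_priorities = []
        for x, target in batch:
            learner_w += 0.1 * (target - learner_w * x) * x
            new_priorities.append(abs(target - learner_w * x) + 1e-3)
        replay.update(idx, new_priorities)
        if step % 5 == 0:                   # actors periodically sync replicas
            actor_w = [learner_w] * num_actors
    return learner_w, replay


w, replay = run()
print(w, len(replay.items))
```

Here the actors and the learner take turns in one loop; the distributed setting named in the title would run them as separate computing units.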

4.

Invention grant
Reinforcement learning using distributed prioritized replay (in force)

Publication (announcement) number: US11625604B2

Publication (announcement) date: 2023-04-11

Application number: US16641751

Application date: 2018-10-29

Applicant: DeepMind Technologies Limited

Inventor: David Budden, Gabriel Barth-Maron, John Quan, Daniel George Horgan

IPC: G06N3/08, G06N3/04, G06N20/00, G06N3/088

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. One of the systems includes (i) a plurality of actor computing units, in which each of the actor computing units is configured to maintain a respective replica of the action selection neural network and to perform a plurality of actor operations, and (ii) one or more learner computing units, in which each of the one or more learner computing units is configured to perform a plurality of learner operations.

5.

Invention application
REINFORCEMENT LEARNING USING DISTRIBUTED PRIORITIZED REPLAY (under examination, published)

Publication (announcement) number: US20200265305A1

Publication (announcement) date: 2020-08-20

Application number: US16641751

Application date: 2018-10-29

Applicant: DeepMind Technologies Limited

Inventor: David Budden, Gabriel Barth-Maron, John Quan, Daniel George Horgan

IPC: G06N3/08, G06N20/00

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. One of the systems includes (i) a plurality of actor computing units, in which each of the actor computing units is configured to maintain a respective replica of the action selection neural network and to perform a plurality of actor operations, and (ii) one or more learner computing units, in which each of the one or more learner computing units is configured to perform a plurality of learner operations.

6.

Invention grant
Population-based training of machine learning models (in force)

Publication (announcement) number: US11907821B2

Publication (announcement) date: 2024-02-20

Application number: US16586236

Application date: 2019-09-27

Applicant: DeepMind Technologies Limited

Inventor: Ang Li, Valentin Clement Dalibard, David Budden, Ola Spyra, Maxwell Elliot Jaderberg, Timothy James Alexander Harley, Sagi Perel, Chenjie Gu, Pramod Gupta

IPC: G06N20/20, G06F16/901, G06N5/04

CPC classification number: G06N20/20, G06F16/9024, G06N5/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model. A method includes: maintaining a plurality of training sessions; assigning, to each worker of one or more workers, a respective training session of the plurality of training sessions; repeatedly performing operations until meeting one or more termination criteria, the operations comprising: receiving an updated training session from a respective worker of the one or more workers, selecting a second training session, selecting, based on comparing the updated training session and the second training session using a fitness evaluation function, either the updated training session or the second training session as a parent training session, generating a child training session from the selected parent training session, and assigning the child training session to an available worker, and selecting a candidate model to be a trained model for the machine learning model.

7.

Invention grant
Distributional reinforcement learning for continuous control tasks (in force)

Publication (announcement) number: US11481629B2

Publication (announcement) date: 2022-10-25

Application number: US16759519

Application date: 2018-10-29

Applicant: DeepMind Technologies Limited

Inventor: David Budden, Matthew William Hoffman, Gabriel Barth-Maron

IPC: G06N3/08, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by a reinforcement learning agent interacting with an environment. In particular, the actions are selected from a continuous action space and the system trains the action selection neural network jointly with a distribution Q network that is used to update the parameters of the action selection neural network.
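As a rough illustration of how a return distribution can drive updates to a continuous-action policy, the sketch below uses a fixed toy critic in place of a trained distributional Q network, so it shows only the actor side and omits the joint training the abstract mentions. The atom support `ATOMS`, the softmax critic, the scalar actor `theta * state`, and the finite-difference gradient are all assumptions made for illustration.

```python
import math

# Assumed fixed support (atoms) of the return distribution.
ATOMS = [-1.0, -0.5, 0.0, 0.5, 1.0]


def critic_distribution(state, action):
    """Toy stand-in for a distributional Q network: a softmax over ATOMS.

    Atoms close to the made-up return 1 - (action - state)^2 get more mass.
    """
    mean = 1.0 - (action - state) ** 2
    logits = [-10.0 * (z - mean) ** 2 for z in ATOMS]
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_return(state, action):
    # Expectation of the return distribution predicted by the critic.
    probs = critic_distribution(state, action)
    return sum(p * z for p, z in zip(probs, ATOMS))


def actor(theta, state):
    # Deterministic policy over a continuous action space.
    return theta * state


def actor_update(theta, state, lr=0.5, eps=1e-4):
    # Gradient ascent on the critic's expected return, using a central
    # finite difference in place of backpropagation through the critic.
    up = expected_return(state, actor(theta + eps, state))
    down = expected_return(state, actor(theta - eps, state))
    return theta + lr * (up - down) / (2.0 * eps)


theta = 0.0
for _ in range(20):
    theta = actor_update(theta, state=1.0)
print(theta, expected_return(1.0, actor(theta, 1.0)))
```

With this toy critic the best action for `state = 1.0` is `1.0`, so `theta` should drift toward 1 as the updates are applied.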

8.

Invention grant
Distributional reinforcement learning for continuous control tasks (in force)

Publication (announcement) number: US11948085B2

Publication (announcement) date: 2024-04-02

Application number: US18303117

Application date: 2023-04-19

Applicant: DeepMind Technologies Limited

Inventor: David Budden, Matthew William Hoffman, Gabriel Barth-Maron

IPC: G06N3/08, G06N3/045

CPC classification number: G06N3/08, G06N3/045

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by a reinforcement learning agent interacting with an environment. In particular, the actions are selected from a continuous action space and the system trains the action selection neural network jointly with a distribution Q network that is used to update the parameters of the action selection neural network.

9.

Invention publication
DISTRIBUTIONAL REINFORCEMENT LEARNING FOR CONTINUOUS CONTROL TASKS (under examination, published)

Publication (announcement) number: US20230409907A1

Publication (announcement) date: 2023-12-21

Application number: US18303117

Application date: 2023-04-19

Applicant: DeepMind Technologies Limited

Inventor: David Budden, Matthew William Hoffman, Gabriel Barth-Maron

IPC: G06N3/08, G06N3/045

CPC classification number: G06N3/08, G06N3/045

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by a reinforcement learning agent interacting with an environment. In particular, the actions are selected from a continuous action space and the system trains the action selection neural network jointly with a distribution Q network that is used to update the parameters of the action selection neural network.

10.

Invention application
DISTRIBUTIONAL REINFORCEMENT LEARNING FOR CONTINUOUS CONTROL TASKS (in force)

Publication (announcement) number: US20230020071A1

Publication (announcement) date: 2023-01-19

Application number: US17945622

Application date: 2022-09-15

Applicant: DeepMind Technologies Limited

Inventor: David Budden, Matthew William Hoffman, Gabriel Barth-Maron

IPC: G06N3/08, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by a reinforcement learning agent interacting with an environment. In particular, the actions are selected from a continuous action space and the system trains the action selection neural network jointly with a distribution Q network that is used to update the parameters of the action selection neural network.
