-
Publication No.: US20240042600A1
Publication Date: 2024-02-08
Application No.: US18331632
Filing Date: 2023-06-08
Applicant: DeepMind Technologies Limited
Inventor: Serkan Cabi , Ziyu Wang , Alexander Novikov , Ksenia Konyushkova , Sergio Gomez Colmenarejo , Scott Ellison Reed , Misha Man Ray Denil , Jonathan Karl Scholz , Oleg O. Sushkov , Rae Chan Jeong , David Barker , David Budden , Mel Vecerik , Yusuf Aytar , Joao Ferdinando Gomes de Freitas
IPC: B25J9/16
CPC classification number: B25J9/161 , B25J9/163 , B25J9/1661
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-driven robotic control. One of the methods includes maintaining robot experience data; obtaining annotation data; training, on the annotation data, a reward model; generating task-specific training data for the particular task, comprising, for each experience in a second subset of the experiences in the robot experience data: processing the observation in the experience using the trained reward model to generate a reward prediction, and associating the reward prediction with the experience; and training a policy neural network on the task-specific training data for the particular task, wherein the policy neural network is configured to receive a network input comprising an observation and to generate a policy output that defines a control policy for a robot performing the particular task.
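The pipeline the abstract claims (annotate a first subset of experiences, train a reward model on it, relabel a second subset with predicted rewards, then train a policy on the relabeled data) can be sketched minimally as follows. This is an illustrative toy, not the patented implementation: `train_reward_model`, `relabel`, and `train_policy` are hypothetical names, and the linear reward model and tabular policy stand in for the neural networks described in the claims.

```python
# Minimal sketch of the claimed data flow (all names and models are
# illustrative assumptions, not the patent's implementation):
# 1) annotation data -> reward model, 2) reward model relabels a second
# subset of experiences, 3) policy is trained on the relabeled data.

def train_reward_model(annotated):
    """Fit a 1-D least-squares reward model: reward ~ w * observation."""
    num = sum(obs * r for obs, r in annotated)
    den = sum(obs * obs for obs, _ in annotated) or 1.0
    w = num / den
    return lambda obs: w * obs

def relabel(experiences, reward_model):
    """Associate a predicted reward with each unannotated experience."""
    return [(obs, act, reward_model(obs)) for obs, act in experiences]

def train_policy(task_data):
    """Toy policy: always pick the action with the highest mean predicted reward."""
    totals, counts = {}, {}
    for obs, act, r in task_data:
        totals[act] = totals.get(act, 0.0) + r
        counts[act] = counts.get(act, 0) + 1
    best = max(totals, key=lambda a: totals[a] / counts[a])
    return lambda obs: best

annotated = [(1.0, 2.0), (2.0, 4.0)]          # (observation, human-annotated reward)
unlabeled = [(1.5, "grasp"), (0.2, "idle")]   # (observation, action) experiences
reward_model = train_reward_model(annotated)
policy = train_policy(relabel(unlabeled, reward_model))
```

In the patent's setting both the reward model and the policy are neural networks and the policy output defines a full control policy for the robot; the sketch only shows how predicted rewards, rather than hand-labels on every experience, supply the training signal.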
-
Publication No.: US11712799B2
Publication Date: 2023-08-01
Application No.: US17020294
Filing Date: 2020-09-14
Applicant: DeepMind Technologies Limited
Inventor: Serkan Cabi , Ziyu Wang , Alexander Novikov , Ksenia Konyushkova , Sergio Gomez Colmenarejo , Scott Ellison Reed , Misha Man Ray Denil , Jonathan Karl Scholz , Oleg O. Sushkov , Rae Chan Jeong , David Barker , David Budden , Mel Vecerik , Yusuf Aytar , Joao Ferdinando Gomes de Freitas
IPC: B25J9/16
CPC classification number: B25J9/161 , B25J9/163 , B25J9/1661
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-driven robotic control. One of the methods includes maintaining robot experience data; obtaining annotation data; training, on the annotation data, a reward model; generating task-specific training data for the particular task, comprising, for each experience in a second subset of the experiences in the robot experience data: processing the observation in the experience using the trained reward model to generate a reward prediction, and associating the reward prediction with the experience; and training a policy neural network on the task-specific training data for the particular task, wherein the policy neural network is configured to receive a network input comprising an observation and to generate a policy output that defines a control policy for a robot performing the particular task.
-
Publication No.: US11663441B2
Publication Date: 2023-05-30
Application No.: US16586437
Filing Date: 2019-09-27
Applicant: DeepMind Technologies Limited
Inventor: Scott Ellison Reed , Yusuf Aytar , Ziyu Wang , Tom Paine , Sergio Gomez Colmenarejo , David Budden , Tobias Pfaff , Aaron Gerard Antonius van den Oord , Oriol Vinyals , Alexander Novikov
IPC: G06N3/006 , G06F17/16 , G06N3/08 , G06F18/22 , G06N3/045 , G06N3/048 , G06V10/764 , G06V10/77 , G06V10/82
CPC classification number: G06N3/006 , G06F17/16 , G06F18/22 , G06N3/045 , G06N3/048 , G06N3/08 , G06V10/764 , G06V10/7715 , G06V10/82
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection policy neural network, wherein the action selection policy neural network is configured to process an observation characterizing a state of an environment to generate an action selection policy output, wherein the action selection policy output is used to select an action to be performed by an agent interacting with an environment. In one aspect, a method comprises: obtaining an observation characterizing a state of the environment subsequent to the agent performing a selected action; generating a latent representation of the observation; processing the latent representation of the observation using a discriminator neural network to generate an imitation score; determining a reward from the imitation score; and adjusting the current values of the action selection policy neural network parameters based on the reward using a reinforcement learning training technique.
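The reward computation this abstract describes (latent representation, discriminator score, reward) can be sketched in a few lines. This is a hedged illustration under assumptions: `encode`, `discriminator`, and `reward_from_score` are invented names, the encoder and discriminator are toy scalar functions standing in for neural networks, and `log(score)` is one common reward choice in adversarial imitation learning rather than the specific mapping the patent claims.

```python
import math

# Illustrative sketch (all names are assumptions): after the agent acts,
# encode the resulting observation into a latent, score it with a
# discriminator, and convert the imitation score into an RL reward.

def encode(observation):
    """Toy latent representation: a single summary statistic."""
    return sum(observation) / len(observation)

def discriminator(latent, w=1.0, b=0.0):
    """Toy discriminator: sigmoid score of how 'expert-like' the latent looks."""
    return 1.0 / (1.0 + math.exp(-(w * latent + b)))

def reward_from_score(score, eps=1e-8):
    """One common adversarial-imitation choice: r = log(score)."""
    return math.log(score + eps)

obs = [0.5, 1.5]                     # observation after the selected action
score = discriminator(encode(obs))   # imitation score in (0, 1)
reward = reward_from_score(score)
# `reward` would then drive a reinforcement learning update of the action
# selection policy network's parameters (not shown).
```

Scores near 1 (trajectories the discriminator finds expert-like) yield rewards near 0, while scores near 0 yield strongly negative rewards, so maximizing return pushes the policy toward imitating the demonstrations.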
-
Publication No.: US20210078169A1
Publication Date: 2021-03-18
Application No.: US17020294
Filing Date: 2020-09-14
Applicant: DeepMind Technologies Limited
Inventor: Serkan Cabi , Ziyu Wang , Alexander Novikov , Ksenia Konyushkova , Sergio Gomez Colmenarejo , Scott Ellison Reed , Misha Man Ray Denil , Jonathan Karl Scholz , Oleg O. Sushkov , Rae Chan Jeong , David Barker , David Budden , Mel Vecerik , Yusuf Aytar , Joao Ferdinando Gomes de Freitas
IPC: B25J9/16
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-driven robotic control. One of the methods includes maintaining robot experience data; obtaining annotation data; training, on the annotation data, a reward model; generating task-specific training data for the particular task, comprising, for each experience in a second subset of the experiences in the robot experience data: processing the observation in the experience using the trained reward model to generate a reward prediction, and associating the reward prediction with the experience; and training a policy neural network on the task-specific training data for the particular task, wherein the policy neural network is configured to receive a network input comprising an observation and to generate a policy output that defines a control policy for a robot performing the particular task.