-
Publication No.: US20240403652A1
Publication Date: 2024-12-05
Application No.: US18699012
Application Date: 2022-10-05
Applicant: DeepMind Technologies Limited
Inventor: Dushyant Rao , Fereshteh Sadeghi , Leonard Hasenclever , Markus Wulfmeier , Martina Zambelli , Giulia Vezzani , Dhruva Tirumala Bukkapatnam , Yusuf Aytar , Joshua Merel , Nicolas Manfred Otto Heess , Raia Thais Hadsell
IPC: G06N3/092
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling agents. In particular, an agent can be controlled using a hierarchical controller that includes a high-level controller neural network, a mid-level controller neural network, and a low-level controller neural network.
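The abstract describes a three-level control hierarchy. Below is a minimal sketch of how such a controller could be wired together (PyTorch assumed); the module sizes, interfaces, and names are illustrative assumptions, not the patented architecture.

```python
import torch
import torch.nn as nn

class HierarchicalController(nn.Module):
    def __init__(self, obs_dim: int, goal_dim: int, skill_dim: int, action_dim: int):
        super().__init__()
        # High-level controller: maps an observation to an abstract goal.
        self.high = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, goal_dim))
        # Mid-level controller: maps observation + goal to a skill/latent command.
        self.mid = nn.Sequential(nn.Linear(obs_dim + goal_dim, 128), nn.ReLU(), nn.Linear(128, skill_dim))
        # Low-level controller: maps observation + skill to a primitive action.
        self.low = nn.Sequential(nn.Linear(obs_dim + skill_dim, 128), nn.ReLU(), nn.Linear(128, action_dim))

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        goal = self.high(obs)
        skill = self.mid(torch.cat([obs, goal], dim=-1))
        return self.low(torch.cat([obs, skill], dim=-1))

controller = HierarchicalController(obs_dim=32, goal_dim=8, skill_dim=16, action_dim=7)
action = controller(torch.randn(1, 32))  # one action for a batch of one observation
```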
-
Publication No.: US20230330846A1
Publication Date: 2023-10-19
Application No.: US18028966
Application Date: 2021-10-01
Applicant: DeepMind Technologies Limited
Inventor: Yuxiang Zhou , Yusuf Aytar , Konstantinos Bousmalis
IPC: B25J9/16
CPC classification number: B25J9/163
Abstract: A system, implemented as computer programs on one or more computers in one or more locations, trains a policy neural network that is used to control a robot, i.e., to select actions to be performed by the robot while the robot is interacting with an environment, through imitation learning in order to cause the robot to perform particular tasks in the environment.
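The abstract refers to training the policy network through imitation learning. The sketch below shows one common form, behavioural cloning, purely as an illustration (PyTorch assumed); the demonstration tensors, network shape, and loss are placeholders and the patent's specific procedure may differ.

```python
import torch
import torch.nn as nn

# Simple policy network regressing demonstrated actions from observations.
policy = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 7))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

demo_obs = torch.randn(256, 32)      # placeholder demonstration observations
demo_actions = torch.randn(256, 7)   # placeholder demonstrated actions

for _ in range(100):
    pred = policy(demo_obs)
    loss = nn.functional.mse_loss(pred, demo_actions)  # imitate the demos
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```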
-
Publication No.: US20220004883A1
Publication Date: 2022-01-06
Application No.: US17295286
Application Date: 2019-11-21
Applicant: DeepMind Technologies Limited
Inventor: Yusuf Aytar , Debidatta Dwibedi , Andrew Zisserman , Jonathan Tompson , Pierre Sermanet
Abstract: An encoder neural network is described which can encode a data item, such as a frame of a video, to form a respective encoded data item. Data items of a first data sequence are associated with respective data items of a second sequence, by determining which of the encoded data items of the second sequence is closest to the encoded data item produced from each data item of the first sequence. Thus, the two data sequences are aligned. The encoder neural network is trained automatically using a training set of data sequences, by an iterative process of successively increasing cycle consistency between pairs of the data sequences.
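The alignment step can be illustrated with a short sketch (PyTorch assumed): encode both sequences, match each frame of the first sequence to its nearest encoded neighbour in the second, and check cycle consistency by mapping back. The encoder and data below are illustrative stand-ins for the trained network.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 32))

seq_a = torch.randn(20, 64)  # 20 frames of sequence A (placeholder features)
seq_b = torch.randn(25, 64)  # 25 frames of sequence B

emb_a, emb_b = encoder(seq_a), encoder(seq_b)
dists = torch.cdist(emb_a, emb_b)   # pairwise distances between encoded frames
a_to_b = dists.argmin(dim=1)        # nearest B frame for each A frame
b_to_a = dists.argmin(dim=0)        # nearest A frame for each B frame

# Cycle consistency: frame i of A should come back to i after A -> B -> A.
cycle_back = b_to_a[a_to_b]
fraction_consistent = (cycle_back == torch.arange(len(seq_a))).float().mean().item()
print(f"fraction of cycle-consistent A frames: {fraction_consistent:.2f}")
```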
-
Publication No.: US20240303897A1
Publication Date: 2024-09-12
Application No.: US18600552
Application Date: 2024-03-08
Applicant: DeepMind Technologies Limited
Inventor: Carl Doersch , Yi Yang , Mel Vecerik , Dilara Gokay , Ankush Gupta , Yusuf Aytar , Joao Carreira , Andrew Zisserman
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for animating images using point trajectories.
-
Publication No.: US20240042600A1
Publication Date: 2024-02-08
Application No.: US18331632
Application Date: 2023-06-08
Applicant: DeepMind Technologies Limited
Inventor: Serkan Cabi , Ziyu Wang , Alexander Novikov , Ksenia Konyushkova , Sergio Gomez Colmenarejo , Scott Ellison Reed , Misha Man Ray Denil , Jonathan Karl Scholz , Oleg O. Sushkov , Rae Chan Jeong , David Barker , David Budden , Mel Vecerik , Yusuf Aytar , Joao Ferdinando Gomes de Freitas
IPC: B25J9/16
CPC classification number: B25J9/161 , B25J9/163 , B25J9/1661
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-driven robotic control. One of the methods includes maintaining robot experience data; obtaining annotation data; training, on the annotation data, a reward model; generating task-specific training data for the particular task, comprising, for each experience in a second subset of the experiences in the robot experience data: processing the observation in the experience using the trained reward model to generate a reward prediction, and associating the reward prediction with the experience; and training a policy neural network on the task-specific training data for the particular task, wherein the policy neural network is configured to receive a network input comprising an observation and to generate a policy output that defines a control policy for a robot performing the particular task.
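The abstract outlines a three-stage pipeline: train a reward model on annotated experience, relabel further experience with predicted rewards, and train a policy on the relabelled data. A hedged sketch of that structure follows (PyTorch assumed); the losses, shapes, and the simplified policy update are illustrative, not the claimed method.

```python
import torch
import torch.nn as nn

reward_model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))
policy = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 7))

# (1) Train the reward model on the annotation data.
annotated_obs = torch.randn(128, 32)     # observations from annotated experiences
annotated_rewards = torch.rand(128, 1)   # placeholder annotation-derived rewards
opt_r = torch.optim.Adam(reward_model.parameters(), lr=1e-3)
for _ in range(200):
    loss = nn.functional.mse_loss(reward_model(annotated_obs), annotated_rewards)
    opt_r.zero_grad()
    loss.backward()
    opt_r.step()

# (2) Relabel a second subset of experience with predicted rewards.
unlabelled_obs = torch.randn(512, 32)
stored_actions = torch.randn(512, 7)     # actions stored with those experiences
with torch.no_grad():
    predicted_rewards = reward_model(unlabelled_obs)

# (3) Train the policy on the task-specific data; a real system would use an
# off-policy RL algorithm here rather than this reward-weighted regression stub.
opt_p = torch.optim.Adam(policy.parameters(), lr=1e-3)
for _ in range(200):
    weights = predicted_rewards.clamp(min=0.0)
    loss = (weights * (policy(unlabelled_obs) - stored_actions).pow(2)).mean()
    opt_p.zero_grad()
    loss.backward()
    opt_p.step()
```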
-
Publication No.: US11712799B2
Publication Date: 2023-08-01
Application No.: US17020294
Application Date: 2020-09-14
Applicant: DeepMind Technologies Limited
Inventor: Serkan Cabi , Ziyu Wang , Alexander Novikov , Ksenia Konyushkova , Sergio Gomez Colmenarejo , Scott Ellison Reed , Misha Man Ray Denil , Jonathan Karl Scholz , Oleg O. Sushkov , Rae Chan Jeong , David Barker , David Budden , Mel Vecerik , Yusuf Aytar , Joao Ferdinando Gomes de Freitas
IPC: B25J9/16
CPC classification number: B25J9/161 , B25J9/163 , B25J9/1661
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-driven robotic control. One of the methods includes maintaining robot experience data; obtaining annotation data; training, on the annotation data, a reward model; generating task-specific training data for the particular task, comprising, for each experience in a second subset of the experiences in the robot experience data: processing the observation in the experience using the trained reward model to generate a reward prediction, and associating the reward prediction with the experience; and training a policy neural network on the task-specific training data for the particular task, wherein the policy neural network is configured to receive a network input comprising an observation and to generate a policy output that defines a control policy for a robot performing the particular task.
-
Publication No.: US11663441B2
Publication Date: 2023-05-30
Application No.: US16586437
Application Date: 2019-09-27
Applicant: DeepMind Technologies Limited
Inventor: Scott Ellison Reed , Yusuf Aytar , Ziyu Wang , Tom Paine , Sergio Gomez Colmenarejo , David Budden , Tobias Pfaff , Aaron Gerard Antonius van den Oord , Oriol Vinyals , Alexander Novikov
IPC: G06N3/006 , G06F17/16 , G06N3/08 , G06F18/22 , G06N3/045 , G06N3/048 , G06V10/764 , G06V10/77 , G06V10/82
CPC classification number: G06N3/006 , G06F17/16 , G06F18/22 , G06N3/045 , G06N3/048 , G06N3/08 , G06V10/764 , G06V10/7715 , G06V10/82
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection policy neural network, wherein the action selection policy neural network is configured to process an observation characterizing a state of an environment to generate an action selection policy output, wherein the action selection policy output is used to select an action to be performed by an agent interacting with an environment. In one aspect, a method comprises: obtaining an observation characterizing a state of the environment subsequent to the agent performing a selected action; generating a latent representation of the observation; processing the latent representation of the observation using a discriminator neural network to generate an imitation score; determining a reward from the imitation score; and adjusting the current values of the action selection policy neural network parameters based on the reward using a reinforcement learning training technique.
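The reward computation in the abstract can be sketched as follows (PyTorch assumed): encode an observation, score the latent with a discriminator, and map the imitation score to a reward that a reinforcement-learning update could then optimise. The encoder, discriminator, and GAIL-style reward transform are assumptions for illustration.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
discriminator = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))

def imitation_reward(observation: torch.Tensor) -> torch.Tensor:
    latent = encoder(observation)          # latent representation of the observation
    score = discriminator(latent)          # imitation score (logit)
    # GAIL-style transform: reward is high when the discriminator believes the
    # behaviour resembles the demonstrations; equals log(sigmoid(score)).
    return -nn.functional.softplus(-score)

obs = torch.randn(1, 32)
reward = imitation_reward(obs)  # fed to the RL update of the policy parameters
```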
-
Publication No.: US20210103815A1
Publication Date: 2021-04-08
Application No.: US17065489
Application Date: 2020-10-07
Applicant: DeepMind Technologies Limited
Inventor: Rae Chan Jeong , Yusuf Aytar , David Khosid , Yuxiang Zhou , Jacqueline Ok-chan Kay , Thomas Lampe , Konstantinos Bousmalis , Francesco Nori
IPC: G06N3/08 , G05B19/4155
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a policy neural network for use in controlling a real-world agent in a real-world environment. One of the methods includes training the policy neural network by optimizing a first task-specific objective that measures a performance of the policy neural network in controlling a simulated version of the real-world agent; and then training the policy neural network by jointly optimizing (i) a self-supervised objective that measures at least a performance of internal representations generated by the policy neural network on a self-supervised task performed on real-world data and (ii) a second task-specific objective that measures the performance of the policy neural network in controlling the simulated version of the real-world agent.
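The abstract describes two training phases: first a task-specific objective against the simulated agent, then joint optimisation of a self-supervised objective on real-world data and the task objective. A minimal sketch of that schedule is below (PyTorch assumed); the particular losses used, an MSE task loss and an observation-reconstruction self-supervised loss, are illustrative assumptions.

```python
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
policy_head = nn.Linear(64, 7)
recon_head = nn.Linear(64, 32)   # self-supervised head over internal representations
params = list(backbone.parameters()) + list(policy_head.parameters()) + list(recon_head.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)

sim_obs, sim_actions = torch.randn(256, 32), torch.randn(256, 7)   # simulated rollout data
real_obs = torch.randn(256, 32)                                    # real-world observations

# Phase 1: task-specific objective on the simulated version of the agent.
for _ in range(100):
    task_loss = nn.functional.mse_loss(policy_head(backbone(sim_obs)), sim_actions)
    optimizer.zero_grad()
    task_loss.backward()
    optimizer.step()

# Phase 2: jointly optimise the self-supervised and task-specific objectives.
for _ in range(100):
    features = backbone(real_obs)
    self_sup_loss = nn.functional.mse_loss(recon_head(features), real_obs)
    task_loss = nn.functional.mse_loss(policy_head(backbone(sim_obs)), sim_actions)
    loss = self_sup_loss + task_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```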
-
Publication No.: US20210078169A1
Publication Date: 2021-03-18
Application No.: US17020294
Application Date: 2020-09-14
Applicant: DeepMind Technologies Limited
Inventor: Serkan Cabi , Ziyu Wang , Alexander Novikov , Ksenia Konyushkova , Sergio Gomez Colmenarejo , Scott Ellison Reed , Misha Man Ray Denil , Jonathan Karl Scholz , Oleg O. Sushkov , Rae Chan Jeong , David Barker , David Budden , Mel Vecerik , Yusuf Aytar , Joao Ferdinando Gomes de Freitas
IPC: B25J9/16
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-driven robotic control. One of the methods includes maintaining robot experience data; obtaining annotation data; training, on the annotation data, a reward model; generating task-specific training data for the particular task, comprising, for each experience in a second subset of the experiences in the robot experience data: processing the observation in the experience using the trained reward model to generate a reward prediction, and associating the reward prediction with the experience; and training a policy neural network on the task-specific training data for the particular task, wherein the policy neural network is configured to receive a network input comprising an observation and to generate a policy output that defines a control policy for a robot performing the particular task.