Patent search ap:("DeepMind Technologies Limited") AND inv:"Yuval Tassa" Page 1

1.

Granted invention patent
Continuous control with deep reinforcement learning (in force)

Publication No.: US11803750B2

Publication date: 2023-10-31

Application No.: US17019927

Filing date: 2020-09-14

Applicant: DeepMind Technologies Limited

Inventors: Timothy Paul Lillicrap, Jonathan James Hunt, Alexander Pritzel, Nicolas Manfred Otto Heess, Tom Erez, Yuval Tassa, David Silver, Daniel Pieter Wierstra

IPC: G06N3/08, G06N3/006, G06N3/084, G06N3/045

CPC classification number: G06N3/08, G06N3/006, G06N3/045, G06N3/084

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an actor neural network used to select actions to be performed by an agent interacting with an environment. One of the methods includes obtaining a minibatch of experience tuples; and updating current values of the parameters of the actor neural network, comprising: for each experience tuple in the minibatch: processing the training observation and the training action in the experience tuple using a critic neural network to determine a neural network output for the experience tuple, and determining a target neural network output for the experience tuple; updating current values of the parameters of the critic neural network using errors between the target neural network outputs and the neural network outputs; and updating the current values of the parameters of the actor neural network using the critic neural network.
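The update described in this abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the linear `critic` and `actor` functions are hypothetical stand-ins for the neural networks, observations and actions are scalars, and the target form (reward plus discounted critic value of the actor's next action), the discount `gamma`, and the learning rates are assumptions that the abstract does not specify.

```python
def critic(w, obs, act):
    """Hypothetical linear stand-in for the critic: Q(s, a) = w0*s + w1*a + w2."""
    return w[0] * obs + w[1] * act + w[2]


def actor(theta, obs):
    """Hypothetical linear stand-in for the actor: a = theta * s."""
    return theta * obs


def update(theta, w, minibatch, gamma=0.99, lr_critic=0.01, lr_actor=0.001):
    """One update over a minibatch of (obs, action, reward, next_obs) tuples."""
    n = len(minibatch)
    grad_w = [0.0, 0.0, 0.0]
    targets = []
    for obs, act, rew, next_obs in minibatch:
        # Critic output for the training observation and training action.
        q = critic(w, obs, act)
        # Target output (assumed form): reward plus the discounted critic
        # value of the action the actor would take in the next observation.
        target = rew + gamma * critic(w, next_obs, actor(theta, next_obs))
        targets.append(target)
        # Error between target and output, treating the target as a constant.
        err = target - q
        for i, feature in enumerate((obs, act, 1.0)):
            grad_w[i] += err * feature / n
    # Critic update: step that reduces the squared error against the targets.
    new_w = [wi + lr_critic * gi for wi, gi in zip(w, grad_w)]
    # Actor update using the critic: ascend Q(s, actor(s)) in theta.
    # For these linear stand-ins, dQ/dtheta = dQ/da * da/dtheta = w1 * s.
    grad_theta = sum(new_w[1] * obs for obs, _, _, _ in minibatch) / n
    new_theta = theta + lr_actor * grad_theta
    return new_theta, new_w, targets
```

Calling `update(0.0, [0.0, 0.0, 0.0], minibatch)` with a list of four-element tuples returns the new actor parameter, the new critic weights, and the per-tuple targets. A real system would replace the two linear functions with networks and compute the gradients by backpropagation.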

2.

Granted invention patent
Selecting reinforcement learning actions using a low-level controller (in force)

Publication No.: US11875258B1

Publication date: 2024-01-16

Application No.: US17541186

Filing date: 2021-12-02

Applicant: DeepMind Technologies Limited

Inventors: Nicolas Manfred Otto Heess, Timothy Paul Lillicrap, Gregory Duncan Wayne, Yuval Tassa

IPC: G06N3/08, G06N3/006, G06N3/044, G06N3/045

CPC classification number: G06N3/08, G06N3/006, G06N3/044, G06N3/045

Abstract: Methods, systems, and apparatus for selecting actions to be performed by an agent interacting with an environment. One system includes a high-level controller neural network, low-level controller network, and subsystem. The high-level controller neural network receives an input observation and processes the input observation to generate a high-level output defining a control signal for the low-level controller. The low-level controller neural network receives a designated component of an input observation and processes the designated component and an input control signal to generate a low-level output that defines an action to be performed by the agent in response to the input observation. The subsystem receives a current observation characterizing a current state of the environment, determines whether criteria are satisfied for generating a new control signal, and based on the determination, provides appropriate inputs to the high-level and low-level controllers for selecting an action to be performed by the agent.
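The interplay of the three components in this abstract can be sketched as follows. The class and parameter names are hypothetical, the controllers are plain callables rather than neural networks, and the criterion used here (generate a new control signal every `refresh_every` steps) is only one assumed possibility; the abstract does not say what the criteria are.

```python
class HierarchicalPolicy:
    """Hypothetical subsystem that routes inputs to the two controllers."""

    def __init__(self, high_level, low_level, select_component, refresh_every=3):
        self.high_level = high_level              # observation -> control signal
        self.low_level = low_level                # (component, control) -> action
        self.select_component = select_component  # observation -> designated component
        self.refresh_every = refresh_every        # assumed refresh criterion
        self._control = None
        self._steps_since_refresh = 0

    def act(self, observation):
        # Decide whether the criteria for a new control signal are satisfied
        # (assumed here: no signal yet, or a fixed number of steps has elapsed).
        if self._control is None or self._steps_since_refresh >= self.refresh_every:
            self._control = self.high_level(observation)
            self._steps_since_refresh = 0
        self._steps_since_refresh += 1
        # The low-level controller sees only the designated component of the
        # observation, together with the current control signal.
        component = self.select_component(observation)
        return self.low_level(component, self._control)
```

With this arrangement the high-level controller runs on a slower timescale than the low-level controller, which selects an action at every step while reusing the most recent control signal.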

3.

Published invention application
CONTINUOUS CONTROL WITH DEEP REINFORCEMENT LEARNING (under examination, published)

Publication No.: US20240177002A1

Publication date: 2024-05-30

Application No.: US18497931

Filing date: 2023-10-30

Applicant: DeepMind Technologies Limited

Inventors: Timothy Paul Lillicrap, Jonathan James Hunt, Alexander Pritzel, Nicolas Manfred Otto Heess, Tom Erez, Yuval Tassa, David Silver, Daniel Pieter Wierstra

IPC: G06N3/08, G06N3/006, G06N3/045, G06N3/084

CPC classification number: G06N3/08, G06N3/006, G06N3/045, G06N3/084

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an actor neural network used to select actions to be performed by an agent interacting with an environment. One of the methods includes obtaining a minibatch of experience tuples; and updating current values of the parameters of the actor neural network, comprising: for each experience tuple in the minibatch: processing the training observation and the training action in the experience tuple using a critic neural network to determine a neural network output for the experience tuple, and determining a target neural network output for the experience tuple; updating current values of the parameters of the critic neural network using errors between the target neural network outputs and the neural network outputs; and updating the current values of the parameters of the actor neural network using the critic neural network.

4.

Granted invention patent
Selecting reinforcement learning actions using a low-level controller (in force)

Publication No.: US11210585B1

Publication date: 2021-12-28

Application No.: US15594228

Filing date: 2017-05-12

Applicant: DeepMind Technologies Limited

Inventors: Nicolas Manfred Otto Heess, Timothy Paul Lillicrap, Gregory Duncan Wayne, Yuval Tassa

IPC: G06N3/08, G06N3/00

Abstract: Methods, systems, and apparatus for selecting actions to be performed by an agent interacting with an environment. One system includes a high-level controller neural network, low-level controller network, and subsystem. The high-level controller neural network receives an input observation and processes the input observation to generate a high-level output defining a control signal for the low-level controller. The low-level controller neural network receives a designated component of an input observation and processes the designated component and an input control signal to generate a low-level output that defines an action to be performed by the agent in response to the input observation. The subsystem receives a current observation characterizing a current state of the environment, determines whether criteria are satisfied for generating a new control signal, and based on the determination, provides appropriate inputs to the high-level and low-level controllers for selecting an action to be performed by the agent.

5.

Granted invention patent
Continuous control with deep reinforcement learning (in force)

Publication No.: US10776692B2

Publication date: 2020-09-15

Application No.: US15217758

Filing date: 2016-07-22

Applicant: DeepMind Technologies Limited

Inventors: Timothy Paul Lillicrap, Jonathan James Hunt, Alexander Pritzel, Nicolas Manfred Otto Heess, Tom Erez, Yuval Tassa, David Silver, Daniel Pieter Wierstra

IPC: G06N3/08, G06N3/00, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an actor neural network used to select actions to be performed by an agent interacting with an environment. One of the methods includes obtaining a minibatch of experience tuples; and updating current values of the parameters of the actor neural network, comprising: for each experience tuple in the minibatch: processing the training observation and the training action in the experience tuple using a critic neural network to determine a neural network output for the experience tuple, and determining a target neural network output for the experience tuple; updating current values of the parameters of the critic neural network using errors between the target neural network outputs and the neural network outputs; and updating the current values of the parameters of the actor neural network using the critic neural network.
