Patent search ap:("DeepMind Technologies Limited") AND inv:"Timothy Paul Lillicrap" Page 3

21.

Invention application
DATA-EFFICIENT REINFORCEMENT LEARNING FOR CONTINUOUS CONTROL TASKS (status: under examination, published)

Publication No.: US20200285909A1

Publication date: 2020-09-10

Application No.: US16882373

Filing date: 2020-05-22

Applicant: DeepMind Technologies Limited

Inventors: Martin Riedmiller, Roland Hafner, Mel Vecerik, Timothy Paul Lillicrap, Thomas Lampe, Ivaylo Popov, Gabriel Barth-Maron, Nicolas Manfred Otto Heess

IPC: G06K9/62, G06N3/04, G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-efficient reinforcement learning. One of the systems is a system for training an actor neural network used to select actions to be performed by an agent that interacts with an environment by receiving observations characterizing states of the environment and, in response to each observation, performing an action selected from a continuous space of possible actions, wherein the actor neural network maps observations to next actions in accordance with values of parameters of the actor neural network, and wherein the system comprises: a plurality of workers, wherein each worker is configured to operate independently of each other worker, wherein each worker is associated with a respective agent replica that interacts with a respective replica of the environment during the training of the actor neural network.

22.

Granted invention patent
Data-efficient reinforcement learning for continuous control tasks (status: in force)

Publication No.: US10664725B2

Publication date: 2020-05-26

Application No.: US16528260

Filing date: 2019-07-31

Applicant: DeepMind Technologies Limited

Inventors: Martin Riedmiller, Roland Hafner, Mel Vecerik, Timothy Paul Lillicrap, Thomas Lampe, Ivaylo Popov, Gabriel Barth-Maron, Nicolas Manfred Otto Heess

IPC: G06N3/08, G06K9/62, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-efficient reinforcement learning. One of the systems is a system for training an actor neural network used to select actions to be performed by an agent that interacts with an environment by receiving observations characterizing states of the environment and, in response to each observation, performing an action selected from a continuous space of possible actions, wherein the actor neural network maps observations to next actions in accordance with values of parameters of the actor neural network, and wherein the system comprises: a plurality of workers, wherein each worker is configured to operate independently of each other worker, wherein each worker is associated with a respective agent replica that interacts with a respective replica of the environment during the training of the actor neural network.

23.

Invention application
SCALABLE AND COMPRESSIVE NEURAL NETWORK DATA STORAGE SYSTEM (status: in force)

Publication No.: US20250053780A1

Publication date: 2025-02-13

Application No.: US18662972

Filing date: 2024-05-13

Applicant: DeepMind Technologies Limited

Inventors: Jack William Rae, Timothy Paul Lillicrap, Sergey Bartunov

IPC: G06N3/045, G06F16/22, G06N3/08

Abstract: A system for compressed data storage using a neural network. The system comprises a memory comprising a plurality of memory locations configured to store data; a query neural network configured to process a representation of an input data item to generate a query; an immutable key data store comprising key data for indexing the plurality of memory locations; an addressing system configured to process the key data and the query to generate a weighting associated with the plurality of memory locations; a memory read system configured to generate output memory data from the memory based upon the generated weighting associated with the plurality of memory locations and the data stored at the plurality of memory locations; and a memory write system configured to write received write data to the memory based upon the generated weighting associated with the plurality of memory locations.

24.

Granted invention patent
Learned computer control using pointing device and keyboard actions (status: in force)

Publication No.: US12189870B2

Publication date: 2025-01-07

Application No.: US18103309

Filing date: 2023-01-30

Applicant: DeepMind Technologies Limited

Inventors: Peter Conway Humphreys, Timothy Paul Lillicrap, Tobias Markus Pohlen, Adam Anthony Santoro

IPC: G06F3/023, G06F3/033, G06F40/284

Abstract: A computer-implemented method for controlling a particular computer to execute a task is described. The method includes receiving a control input comprising a visual input, the visual input including one or more screen frames of a computer display that represent at least a current state of the particular computer; processing the control input using a neural network to generate one or more control outputs that are used to control the particular computer to execute the task, in which the one or more control outputs include an action type output that specifies at least one of a pointing device action or a keyboard action to be performed to control the particular computer; determining one or more actions from the one or more control outputs; and executing the one or more actions to control the particular computer.

25.

Invention publication
CONTINUOUS CONTROL WITH DEEP REINFORCEMENT LEARNING (status: under examination, published)

Publication No.: US20240177002A1

Publication date: 2024-05-30

Application No.: US18497931

Filing date: 2023-10-30

Applicant: DeepMind Technologies Limited

Inventors: Timothy Paul Lillicrap, Jonathan James Hunt, Alexander Pritzel, Nicolas Manfred Otto Heess, Tom Erez, Yuval Tassa, David Silver, Daniel Pieter Wierstra

IPC: G06N3/08, G06N3/006, G06N3/045, G06N3/084

CPC classification number: G06N3/08, G06N3/006, G06N3/045, G06N3/084

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an actor neural network used to select actions to be performed by an agent interacting with an environment. One of the methods includes obtaining a minibatch of experience tuples; and updating current values of the parameters of the actor neural network, comprising: for each experience tuple in the minibatch: processing the training observation and the training action in the experience tuple using a critic neural network to determine a neural network output for the experience tuple, and determining a target neural network output for the experience tuple; updating current values of the parameters of the critic neural network using errors between the target neural network outputs and the neural network outputs; and updating the current values of the parameters of the actor neural network using the critic neural network.
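The critic step in this abstract can be illustrated with a minimal sketch. The abstract does not say how the target output is computed; the version below assumes the common bootstrapped form (reward plus discounted output of a target critic evaluated at the action chosen by a target actor). The tuple layout, the function names, and the `discount` parameter are illustrative assumptions, not the patent's wording.

```python
from typing import Callable, List, Sequence, Tuple

# Assumed tuple layout: (observation, action, reward, next_observation).
Experience = Tuple[Sequence[float], Sequence[float], float, Sequence[float]]
Critic = Callable[[Sequence[float], Sequence[float]], float]
Actor = Callable[[Sequence[float]], Sequence[float]]


def critic_errors(
    minibatch: List[Experience],
    critic: Critic,
    target_critic: Critic,
    target_actor: Actor,
    discount: float = 0.99,
) -> List[float]:
    """Return (target output - critic output) for each experience tuple."""
    errors = []
    for obs, action, reward, next_obs in minibatch:
        # Critic output for the stored observation and action.
        output = critic(obs, action)
        # Assumed target: reward plus discounted target-critic output.
        next_action = target_actor(next_obs)
        target = reward + discount * target_critic(next_obs, next_action)
        errors.append(target - output)
    return errors


# Toy stand-ins for the neural networks (hypothetical).
def toy_critic(obs, action):
    return sum(obs) + sum(action)


def toy_actor(obs):
    return [0.5 * x for x in obs]


batch = [([1.0], [0.0], 1.0, [2.0])]
errors = critic_errors(batch, toy_critic, toy_critic, toy_actor, discount=0.5)
print(errors)
```

In a full training loop these errors would drive a gradient step on the critic parameters, and the actor would then be updated using the critic; both steps are omitted here.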

26.

Granted invention patent
Controlling agents over long time scales using temporal value transport (status: in force)

Publication No.: US11769049B2

Publication date: 2023-09-26

Application No.: US17035546

Filing date: 2020-09-28

Applicant: DeepMind Technologies Limited

Inventors: Gregory Duncan Wayne, Timothy Paul Lillicrap, Chia-Chun Hung, Joshua Simon Abramson

IPC: G06K9/62, G06F11/30, G06N3/08, G06F18/21, G06V10/764, G06V10/774, G06V10/778, G06V10/82

CPC classification number: G06N3/08, G06F11/3037, G06F11/3072, G06F18/2193, G06V10/764, G06V10/774, G06V10/7796, G06V10/82

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network system used to control an agent interacting with an environment to perform a specified task. One of the methods includes causing the agent to perform a task episode in which the agent attempts to perform the specified task; for each of one or more particular time steps in the sequence: generating a modified reward for the particular time step from (i) the actual reward at the time step and (ii) value predictions at one or more time steps that are more than a threshold number of time steps after the particular time step in the sequence; and training, through reinforcement learning, the neural network system using at least the modified rewards for the particular time steps.
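The reward modification described in this abstract can be sketched as follows. The abstract states only that the modified reward combines the actual reward with value predictions from sufficiently later time steps; how those later steps are chosen and how the value is scaled are not given, so the `links` mapping and the `scale` factor below are hypothetical.

```python
from typing import Dict, List, Sequence


def modified_rewards(
    rewards: Sequence[float],
    values: Sequence[float],
    links: Dict[int, List[int]],
    threshold: int,
    scale: float = 1.0,
) -> List[float]:
    """Add later value predictions to the rewards of selected time steps.

    links maps a particular time step to candidate later time steps
    (hypothetical structure). Only candidates more than `threshold`
    steps later contribute.
    """
    modified = list(rewards)
    for step, later_steps in links.items():
        for later in later_steps:
            if later - step > threshold:
                modified[step] += scale * values[later]
    return modified


rewards = [0.0, 0.0, 0.0, 1.0]
values = [0.1, 0.2, 0.5, 1.0]
# Step 0 is linked to steps 1 and 3; only step 3 is beyond the threshold.
result = modified_rewards(rewards, values, {0: [1, 3]}, threshold=2, scale=0.5)
print(result)
```

The modified rewards would then replace the actual rewards in whatever reinforcement learning update is used to train the system.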

27.

Invention publication
LEARNED COMPUTER CONTROL USING POINTING DEVICE AND KEYBOARD ACTIONS (status: under examination, published)

Publication No.: US20230244325A1

Publication date: 2023-08-03

Application No.: US18103309

Filing date: 2023-01-30

Applicant: DeepMind Technologies Limited

Inventors: Peter Conway Humphreys, Timothy Paul Lillicrap, Tobias Markus Pohlen, Adam Anthony Santoro

IPC: G06F3/033, G06F3/023, G06F40/284

CPC classification number: G06F3/033, G06F3/023, G06F40/284

Abstract: A computer-implemented method for controlling a particular computer to execute a task is described. The method includes receiving a control input comprising a visual input, the visual input including one or more screen frames of a computer display that represent at least a current state of the particular computer; processing the control input using a neural network to generate one or more control outputs that are used to control the particular computer to execute the task, in which the one or more control outputs include an action type output that specifies at least one of a pointing device action or a keyboard action to be performed to control the particular computer; determining one or more actions from the one or more control outputs; and executing the one or more actions to control the particular computer.

28.

Invention publication
CONTROLLING INTERACTIVE AGENTS USING MULTI-MODAL INPUTS (status: under examination, published)

Publication No.: US20230178076A1

Publication date: 2023-06-08

Application No.: US18077194

Filing date: 2022-12-07

Applicant: DeepMind Technologies Limited

Inventors: Joshua Simon Abramson, Arun Ahuja, Federico Javier Carnevale, Petko Ivanov Georgiev, Chia-Chun Hung, Timothy Paul Lillicrap, Alistair Michael Muldal, Adam Anthony Santoro, Tamara Louise von Glehn, Jessica Paige Landon, Gregory Duncan Wayne, Chen Yan, Rui Zhu

IPC: G10L15/22, G10L15/16, G10L13/02, G06V10/82, G06V20/50, G06F40/284, G06F40/40, G06V10/774, G10L15/06

CPC classification number: G10L15/22, G10L15/16, G10L13/02, G06V10/82, G06V20/50, G06F40/284, G06F40/40, G06V10/774, G10L15/063, G10L2015/223

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling agents. In particular, an interactive agent can be controlled based on multi-modal inputs that include both an observation image and a natural language text sequence.

29.

Granted invention patent
Selecting reinforcement learning actions using a low-level controller (status: in force)

Publication No.: US11210585B1

Publication date: 2021-12-28

Application No.: US15594228

Filing date: 2017-05-12

Applicant: DeepMind Technologies Limited

Inventors: Nicolas Manfred Otto Heess, Timothy Paul Lillicrap, Gregory Duncan Wayne, Yuval Tassa

IPC: G06N3/08, G06N3/00

Abstract: Methods, systems, and apparatus for selecting actions to be performed by an agent interacting with an environment. One system includes a high-level controller neural network, low-level controller network, and subsystem. The high-level controller neural network receives an input observation and processes the input observation to generate a high-level output defining a control signal for the low-level controller. The low-level controller neural network receives a designated component of an input observation and processes the designated component and an input control signal to generate a low-level output that defines an action to be performed by the agent in response to the input observation. The subsystem receives a current observation characterizing a current state of the environment, determines whether criteria are satisfied for generating a new control signal, and based on the determination, provides appropriate inputs to the high-level and low-level controllers for selecting an action to be performed by the agent.
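The interplay of the two controllers and the subsystem can be sketched roughly as below. The abstract does not state the criteria for generating a new control signal; this sketch assumes a fixed refresh interval (`refresh_every`), and the class name, the `designated` extractor, and the toy controllers are all hypothetical.

```python
from typing import Callable, List, Optional, Sequence

Observation = Sequence[float]


class HierarchicalSelector:
    """Subsystem that routes observations to the two controllers."""

    def __init__(
        self,
        high_level: Callable[[Observation], float],
        low_level: Callable[[Observation, float], float],
        designated: Callable[[Observation], Observation],
        refresh_every: int,
    ) -> None:
        self.high_level = high_level
        self.low_level = low_level
        self.designated = designated
        self.refresh_every = refresh_every  # assumed refresh criterion
        self._signal: Optional[float] = None
        self._steps_since_refresh = 0

    def select_action(self, observation: Observation) -> float:
        # Ask the high-level controller for a new control signal only
        # when the (assumed) criterion is satisfied.
        if self._signal is None or self._steps_since_refresh >= self.refresh_every:
            self._signal = self.high_level(observation)
            self._steps_since_refresh = 0
        self._steps_since_refresh += 1
        # The low-level controller sees only the designated component.
        return self.low_level(self.designated(observation), self._signal)


selector = HierarchicalSelector(
    high_level=lambda obs: sum(obs),
    low_level=lambda part, signal: part[0] + signal,
    designated=lambda obs: obs[:1],
    refresh_every=2,
)
actions: List[float] = [
    selector.select_action(obs) for obs in ([1.0, 2.0], [2.0, 5.0], [0.0, 1.0])
]
print(actions)
```

In this toy run the control signal is computed on the first observation, reused on the second, and refreshed on the third.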

30.

Granted invention patent
Augmenting neural networks with sparsely-accessed external memory (status: in force)

Publication No.: US11151443B2

Publication date: 2021-10-19

Application No.: US15424685

Filing date: 2017-02-03

Applicant: DeepMind Technologies Limited

Inventors: Ivo Danihelka, Gregory Duncan Wayne, Fu-min Wang, Edward Thomas Grefenstette, Jack William Rae, Alexander Benjamin Graves, Timothy Paul Lillicrap, Timothy James Alexander Harley, Jonathan James Hunt

IPC: G06N3/063, G06N3/04, G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting neural networks with an external memory. One of the systems includes a sparse memory access subsystem that is configured to perform operations comprising generating a sparse set of reading weights that includes a respective reading weight for each of the plurality of locations in the external memory using the read key, reading data from the plurality of locations in the external memory in accordance with the sparse set of reading weights, generating a set of writing weights that includes a respective writing weight for each of the plurality of locations in the external memory, and writing the write vector to the plurality of locations in the external memory in accordance with the writing weights.
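A sparse read of the kind this abstract describes might look like the sketch below. The abstract does not specify how sparsity is obtained; the version here assumes dot-product similarity between the read key and each location, keeps the top `k` locations, and normalizes them with a softmax, so every location gets a weight but most weights are exactly zero. Function names and the choice of `k` are illustrative.

```python
import math
from typing import List, Sequence


def sparse_read_weights(
    read_key: Sequence[float], memory: Sequence[Sequence[float]], k: int
) -> List[float]:
    """One weight per memory location; all but the top-k are zero."""
    scores = [sum(a * b for a, b in zip(read_key, row)) for row in memory]
    top = sorted(range(len(memory)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the selected locations only (shifted for stability).
    peak = max(scores[i] for i in top)
    exps = {i: math.exp(scores[i] - peak) for i in top}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(memory))]


def read(memory: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Weighted sum of memory rows under the given reading weights."""
    width = len(memory[0])
    return [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]


memory = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights = sparse_read_weights([2.0, 0.0], memory, k=2)
read_vector = read(memory, weights)
print(weights, read_vector)
```

With `k=2` the second location, whose similarity to the key is lowest, receives a weight of exactly zero and does not contribute to the read vector.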
