Patent search ap:("DeepMind Technologies Limited") AND inv:"Jost Tobias Springenberg" Page 1

1.

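Invention application
ROBUST REINFORCEMENT LEARNING FOR CONTINUOUS CONTROL WITH MODEL MISSPECIFICATION (status: in force)

Publication (announcement) number: US20220343157A1

Publication (announcement) date: 2022-10-27

Application number: US17620164

Application date: 2020-06-17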

Applicant: DEEPMIND TECHNOLOGIES LIMITED

Inventor: Daniel J. Mankowitz, Nir Levine, Rae Chan Jeong, Abbas Abdolmaleki, Jost Tobias Springenberg, Todd Andrew Hester, Timothy Arthur Mann, Martin Riedmiller

IPC: G06N3/08, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a policy neural network having policy parameters. One of the methods includes sampling a mini-batch comprising one or more observation-action-reward tuples generated as a result of interactions of a first agent with a first environment; determining an update to current values of the Q network parameters by minimizing a robust entropy-regularized temporal difference (TD) error that accounts for possible perturbations of the states of the first environment represented by the observations in the observation-action-reward tuples; and determining, using the Q-value neural network, an update to the policy network parameters using the sampled mini-batch of observation-action-reward tuples.
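The robust temporal difference error named in this abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so everything here is an assumption for illustration: the function names are hypothetical, the entropy-regularized value is taken to be a log-mean-exp over Q-values of sampled actions, and robustness is modelled as a worst case (minimum) over a finite set of perturbed next states.

```python
import math


def soft_value(q_values, temperature):
    """Entropy-regularized (log-mean-exp) value of a list of Q-values.

    Assumed form: temperature * log(mean(exp(q / temperature))),
    computed with the maximum subtracted for numerical stability.
    """
    m = max(q_values)
    mean_exp = sum(math.exp((q - m) / temperature) for q in q_values) / len(q_values)
    return m + temperature * math.log(mean_exp)


def robust_td_error(q_sa, reward, next_q_per_perturbation, gamma=0.99, temperature=1.0):
    """TD error against the worst-case soft value over perturbed next states.

    next_q_per_perturbation holds one list of Q-values (over sampled actions)
    for each hypothetical perturbation of the next state.
    """
    worst_case = min(soft_value(qs, temperature) for qs in next_q_per_perturbation)
    return reward + gamma * worst_case - q_sa
```

In a training loop, the Q network parameters would be updated to reduce the square of this error averaged over the sampled mini-batch; that step is omitted here.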

2.

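Invention publication
PLANNING USING A JUMPY TRAJECTORY DECODER NEURAL NETWORK (status: under examination, published)

Publication (announcement) number: US20240220795A1

Publication (announcement) date: 2024-07-04

Application number: US18401226

Application date: 2023-12-29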

Applicant: DeepMind Technologies Limited

Inventor: Jingwei Zhang, Arunkumar Byravan, Jost Tobias Springenberg, Martin Riedmiller, Nicolas Manfred Otto Heess, Leonard Hasenclever, Abbas Abdolmaleki, Dushyant Rao

IPC: G06N3/08

CPC classification number: G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling agents using jumpy trajectory decoder neural networks.

3.

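Invention publication
TRAINING AN ACTION SELECTION SYSTEM USING RELATIVE ENTROPY Q-LEARNING (status: under examination, published)

Publication (announcement) number: US20230214649A1

Publication (announcement) date: 2023-07-06

Application number: US18008838

Application date: 2021-07-27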

Applicant: DeepMind Technologies Limited

Inventor: Rae Chan Jeong, Jost Tobias Springenberg, Jacqueline Ok-chan Kay, Daniel Hai Huan Zheng, Alexandre Galashov, Nicolas Manfred Otto Heess, Francesco Nori

IPC: G06N3/08

CPC classification number: G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection system using reinforcement learning techniques. In one aspect, a method comprises at each of multiple iterations: obtaining a batch of experience, each experience tuple comprising: a first observation, an action, a second observation, and a reward; for each experience tuple, determining a state value for the second observation, comprising: processing the first observation using a policy neural network to generate an action score for each action in a set of possible actions; sampling multiple actions from the set of possible actions in accordance with the action scores; processing the second observation using a Q neural network to generate a Q value for each sampled action; and determining the state value for the second observation; and determining an update to current values of the Q neural network parameters using the state values.
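The state-value step in this abstract (sample actions according to the policy's action scores, score them with the Q network, reduce to one value) might look roughly like the sketch below. The abstract does not say how the sampled Q-values are combined, so the temperature-weighted softmax average used here is an assumption, as are the function names; the action scores and Q-values are passed in as plain lists rather than produced by neural networks.

```python
import math
import random


def sample_actions(action_scores, num_samples, rng):
    """Sample action indices with probability proportional to exp(score)."""
    m = max(action_scores)
    weights = [math.exp(s - m) for s in action_scores]
    return rng.choices(range(len(action_scores)), weights=weights, k=num_samples)


def state_value(action_scores, q_values, num_samples, rng, temperature=1.0):
    """Softmax-weighted average of Q-values over the sampled actions.

    This reduction is an assumed choice: a small temperature approaches the
    maximum sampled Q-value, a large one approaches the plain average.
    """
    sampled = sample_actions(action_scores, num_samples, rng)
    qs = [q_values[a] for a in sampled]
    m = max(qs)
    weights = [math.exp((q - m) / temperature) for q in qs]
    return sum(w * q for w, q in zip(weights, qs)) / sum(weights)
```

A Q-learning target for an experience tuple would then be the reward plus a discounted `state_value` for the second observation.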

4.

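Invention application
HIERARCHICAL POLICIES FOR MULTITASK TRANSFER (status: in force)

Publication (announcement) number: US20220237488A1

Publication (announcement) date: 2022-07-28

Application number: US17613687

Application date: 2020-05-22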

Applicant: DeepMind Technologies Limited

Inventor: Markus Wulfmeier, Abbas Abdolmaleki, Roland Hafner, Jost Tobias Springenberg, Nicolas Manfred Otto Heess, Martin Riedmiller

IPC: G06N7/00, G06N3/04, G06N20/20

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for controlling an agent. One of the methods includes obtaining an observation characterizing a current state of the environment and data identifying a task currently being performed by the agent; processing the observation and the data identifying the task using a high-level controller to generate a high-level probability distribution that assigns a respective probability to each of a plurality of low-level controllers; processing the observation using each of the plurality of low-level controllers to generate, for each of the plurality of low-level controllers, a respective low-level probability distribution; generating a combined probability distribution; and selecting, using the combined probability distribution, an action from the space of possible actions to be performed by the agent in response to the observation.
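One natural reading of the combination step in this abstract is a mixture: each low-level distribution is weighted by the probability the high-level controller assigns to it. The sketch below assumes that reading and a discrete action space; neither is stated in the abstract, and the function names are hypothetical.

```python
import random


def combine(high_level_probs, low_level_dists):
    """Mix low-level action distributions using high-level weights.

    high_level_probs[k] is the probability assigned to low-level controller k;
    low_level_dists[k][a] is that controller's probability of action a.
    """
    num_actions = len(low_level_dists[0])
    return [
        sum(w * dist[a] for w, dist in zip(high_level_probs, low_level_dists))
        for a in range(num_actions)
    ]


def select_action(combined, rng):
    """Draw one action index from the combined distribution."""
    return rng.choices(range(len(combined)), weights=combined, k=1)[0]
```

For example, `combine([0.25, 0.75], [[1.0, 0.0], [0.0, 1.0]])` gives `[0.25, 0.75]`: the second controller always picks action 1 and carries three quarters of the weight.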

5.

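Invention application
GRAPH NEURAL NETWORKS REPRESENTING PHYSICAL SYSTEMS (status: in force)

Publication (announcement) number: US20210049467A1

Publication (announcement) date: 2021-02-18

Application number: US17046963

Application date: 2019-04-12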

Applicant: DeepMind Technologies Limited

Inventor: Martin Riedmiller, Raia Thais Hadsell, Peter William Battaglia, Joshua Merel, Jost Tobias Springenberg, Alvaro Sanchez, Nicolas Manfred Otto Heess

IPC: G06N3/08

Abstract: A graph neural network system implementing a learnable physics engine for understanding and controlling a physical system. The physical system is considered to be composed of bodies coupled by joints and is represented by static and dynamic graphs. A graph processing neural network processes an input graph, e.g. the static and dynamic graphs, to provide an output graph, e.g. a predicted dynamic graph. The graph processing neural network is differentiable and may be used for control and/or reinforcement learning. The trained graph neural network system can be applied to physical systems with similar but new graph structures (zero-shot learning).
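As a rough illustration of processing a bodies-and-joints graph, the sketch below runs one message-passing step. It is not taken from the patent: node states are reduced to single floats, the mapping of dynamic information to node states and static information to edge features is an assumption, and `edge_fn` and `node_fn` are hypothetical stand-ins for learned networks.

```python
def message_passing_step(node_states, edges, edge_fn, node_fn):
    """Run one round of message passing over a directed graph.

    node_states: one float per body (assumed dynamic state).
    edges: list of (sender, receiver, edge_feature) tuples, one per joint
           direction (edge_feature is an assumed static joint parameter).
    edge_fn(sender_state, receiver_state, edge_feature) -> message
    node_fn(node_state, summed_incoming_messages) -> new node state
    """
    incoming = [0.0] * len(node_states)
    for sender, receiver, feature in edges:
        incoming[receiver] += edge_fn(node_states[sender], node_states[receiver], feature)
    return [node_fn(state, msg) for state, msg in zip(node_states, incoming)]
```

Because the step only iterates over whatever nodes and edges it is given, the same functions apply unchanged to a graph with a different number of bodies or joints.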
