Invention Grant
- Patent Title: Controlling agents over long time scales using temporal value transport
- Application No.: US16601324
- Application Date: 2019-10-14
- Publication No.: US10789511B2
- Publication Date: 2020-09-29
- Inventor: Gregory Duncan Wayne, Timothy Paul Lillicrap, Chia-Chun Hung, Joshua Simon Abramson
- Applicant: DeepMind Technologies Limited
- Applicant Address: GB London
- Assignee: DeepMind Technologies Limited
- Current Assignee: DeepMind Technologies Limited
- Current Assignee Address: GB London
- Agency: Fish & Richardson P.C.
- Main IPC: G06K9/62
- IPC: G06K9/62; G06F11/30; G06N3/08

Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network system used to control an agent interacting with an environment to perform a specified task. One of the methods includes causing the agent to perform a task episode in which the agent attempts to perform the specified task; for each of one or more particular time steps in the sequence: generating a modified reward for the particular time step from (i) the actual reward at the time step and (ii) value predictions at one or more time steps that are more than a threshold number of time steps after the particular time step in the sequence; and training, through reinforcement learning, the neural network system using at least the modified rewards for the particular time steps.
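The reward modification described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the actual reward and the later value predictions are combined, nor how later time steps are matched to earlier ones, so everything specific here is an assumption made for illustration: the function name `modified_rewards`, the additive combination, the scaling factor `alpha`, and the `links` argument, a hypothetical list of (earlier step, later step) pairs. Only pairs separated by more than `threshold` time steps contribute, mirroring the "more than a threshold number of time steps after" condition.

```python
from typing import List, Sequence, Tuple


def modified_rewards(
    rewards: Sequence[float],
    values: Sequence[float],
    links: Sequence[Tuple[int, int]],
    threshold: int,
    alpha: float = 0.9,
) -> List[float]:
    """Add scaled value predictions from distant later steps to earlier rewards.

    rewards:   actual reward at each time step of one task episode.
    values:    value prediction at each time step of the same episode.
    links:     hypothetical (earlier_step, later_step) pairs saying which later
               value prediction should be credited to which earlier step.
    threshold: a pair contributes only if later_step - earlier_step > threshold.
    alpha:     assumed scaling factor applied to the transported value.
    """
    if len(rewards) != len(values):
        raise ValueError("rewards and values must cover the same time steps")

    # Start from the actual rewards; steps without a qualifying link are unchanged.
    out = [float(r) for r in rewards]
    for earlier, later in links:
        if later - earlier > threshold:
            out[earlier] += alpha * values[later]
    return out


if __name__ == "__main__":
    rewards = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
    values = [0.1, 0.1, 0.2, 0.3, 0.8, 1.0]
    # Step 0 is linked to step 4 (gap 4, kept); step 3 to step 4 (gap 1, ignored).
    links = [(0, 4), (3, 4)]
    print(modified_rewards(rewards, values, links, threshold=2, alpha=0.5))
```

In this sketch the modified rewards would then stand in for the actual rewards in whatever reinforcement learning update trains the neural network system; that update is outside the scope of the example.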
Public/Granted literature
- US20200117956A1 CONTROLLING AGENTS OVER LONG TIME SCALES USING TEMPORAL VALUE TRANSPORT (Public/Granted day: 2020-04-16)