REINFORCEMENT LEARNING USING ADVANTAGE ESTIMATES

    Publication No.: US20170228662A1

    Publication Date: 2017-08-10

    Application No.: US15429088

    Application Date: 2017-02-09

    Applicant: Google Inc.

    CPC classification number: G06N3/0427 G06N3/08

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for computing Q values for actions to be performed by an agent interacting with an environment from a continuous action space of actions. In one aspect, a system includes a value subnetwork configured to receive an observation characterizing a current state of the environment and process the observation to generate a value estimate; a policy subnetwork configured to receive the observation and process the observation to generate an ideal point in the continuous action space; and a subsystem configured to receive a particular point in the continuous action space representing a particular action; generate an advantage estimate for the particular action; and generate a Q value for the particular action that is an estimate of an expected return resulting from the agent performing the particular action when the environment is in the current state.
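
    As a sketch only: the abstract describes a value subnetwork producing a value estimate, a policy subnetwork producing an ideal point in the continuous action space, and a subsystem combining them so that the Q value for a particular action is the value estimate plus an advantage estimate centred on the ideal point. The quadratic advantage below, the layer sizes, and all names are illustrative assumptions, not the claimed implementation.

    import torch
    import torch.nn as nn

    class QValueModel(nn.Module):
        """Illustrative sketch: Q(s, a) = V(s) + A(s, a), where A is zero at the
        policy subnetwork's ideal point and negative elsewhere (assumed quadratic)."""

        def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
            super().__init__()
            # Value subnetwork: observation -> scalar value estimate V(s).
            self.value = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                       nn.Linear(hidden, 1))
            # Policy subnetwork: observation -> ideal point mu(s) in action space.
            self.policy = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                        nn.Linear(hidden, act_dim))

        def forward(self, obs: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
            v = self.value(obs)        # value estimate for the current state
            mu = self.policy(obs)      # ideal action for the current state
            # Advantage estimate for the particular action (assumed form).
            advantage = -((action - mu) ** 2).sum(dim=-1, keepdim=True)
            return v + advantage       # expected return estimate (Q value)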

    Augmenting Neural Networks with External Memory

    Publication No.: US20170228637A1

    Publication Date: 2017-08-10

    Application No.: US15396289

    Application Date: 2016-12-30

    Applicant: Google Inc.

    CPC classification number: G06N3/063 G06F12/123 G06N3/04 G06N3/0445 G06N3/08

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting neural networks with an external memory. One of the systems includes a controller neural network that includes a Least Recently Used Access (LRUA) subsystem configured to: maintain a respective usage weight for each of a plurality of locations in the external memory, and for each of the plurality of time steps: generate a respective reading weight for each location using a read key, read data from the locations in accordance with the reading weights, generate a respective writing weight for each of the locations from a respective reading weight from a preceding time step and the respective usage weight for the location, write a write vector to the locations in accordance with the writing weights, and update the respective usage weight from the respective reading weight and the respective writing weight.
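
    As a sketch only: one LRUA time step, assuming cosine-similarity read addressing and writing weights formed by blending the previous step's reading weights with a one-hot vector on the least-used location. The decay constant gamma, the mixing coefficient alpha, and the function name are assumptions for illustration.

    import numpy as np

    def lrua_step(memory, usage, prev_read_w, read_key, write_vec,
                  gamma=0.95, alpha=0.5):
        """One illustrative LRUA step over an external memory of shape (N, D)."""
        # Reading weights: softmax over cosine similarity with the read key.
        sims = memory @ read_key / (
            np.linalg.norm(memory, axis=1) * np.linalg.norm(read_key) + 1e-8)
        read_w = np.exp(sims - sims.max())
        read_w /= read_w.sum()
        read_out = read_w @ memory                      # data read from memory

        # Writing weights: previous reading weights blended with the
        # least-used location.
        least_used = np.zeros_like(usage)
        least_used[np.argmin(usage)] = 1.0
        write_w = alpha * prev_read_w + (1.0 - alpha) * least_used
        memory = memory + np.outer(write_w, write_vec)  # write to memory

        # Usage weights updated from the reading and writing weights.
        usage = gamma * usage + read_w + write_w
        return memory, usage, read_w, read_out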

    CONTINUOUS CONTROL WITH DEEP REINFORCEMENT LEARNING

    Publication No.: US20170024643A1

    Publication Date: 2017-01-26

    Application No.: US15217758

    Application Date: 2016-07-22

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an actor neural network used to select actions to be performed by an agent interacting with an environment. One of the methods includes obtaining a minibatch of experience tuples; and updating current values of the parameters of the actor neural network, comprising: for each experience tuple in the minibatch: processing the training observation and the training action in the experience tuple using a critic neural network to determine a neural network output for the experience tuple, and determining a target neural network output for the experience tuple; updating current values of the parameters of the critic neural network using errors between the target neural network outputs and the neural network outputs; and updating the current values of the parameters of the actor neural network using the critic neural network.
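
    As a sketch only: the update in the abstract is, in outline, a deterministic actor-critic step over a minibatch of experience tuples. The version below assumes target copies of both networks, a discount factor, mean-squared error for the critic, and that the critic is a callable taking (observation, action); those choices and all names are illustrative, not the claimed method.

    import torch
    import torch.nn.functional as F

    def update_from_minibatch(batch, actor, critic, target_actor, target_critic,
                              actor_opt, critic_opt, discount=0.99):
        """One illustrative actor/critic update from (obs, action, reward, next_obs)."""
        obs, act, rew, next_obs = batch

        # Target network output for each experience tuple.
        with torch.no_grad():
            next_act = target_actor(next_obs)
            target_q = rew + discount * target_critic(next_obs, next_act)

        # Update the critic using errors between targets and critic outputs.
        critic_loss = F.mse_loss(critic(obs, act), target_q)
        critic_opt.zero_grad()
        critic_loss.backward()
        critic_opt.step()

        # Update the actor using the critic: raise the critic's Q value
        # at the actions the actor currently selects.
        actor_loss = -critic(obs, actor(obs)).mean()
        actor_opt.zero_grad()
        actor_loss.backward()
        actor_opt.step()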
