Patent search ap:("DEEPMIND TECHNOLOGIES LIMITED") AND inv:"David William Saxton"

1.

Patent application
PROGRAMMABLE REINFORCEMENT LEARNING SYSTEMS (Status: under examination, published)

Publication number: US20200167633A1

Publication date: 2020-05-28

Application number: US16615061

Filing date: 2018-05-22

Applicant: DEEPMIND TECHNOLOGIES LIMITED

Inventors: Misha Man Ray Denil, Sergio Gomez Colmenarejo, Serkan Cabi, David William Saxton, Joao Ferdinando Gomes de Freitas

IPC: G06N3/04, G06N3/08, G06K9/62

Abstract: A reinforcement learning system is proposed comprising a plurality of property detector neural networks. Each property detector neural network is arranged to receive data representing an object within an environment, and to generate property data associated with a property of the object. A processor is arranged to receive an instruction indicating a task associated with an object having an associated property, and process the output of the plurality of property detector neural networks based upon the instruction to generate a relevance data item. The relevance data item indicates objects within the environment associated with the task. The processor also generates a plurality of weights based upon the relevance data item, and, based on the weights, generates modified data representing the plurality of objects within the environment. A neural network is arranged to receive the modified data and to output an action associated with the task.
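The data flow in this abstract (property detectors, relevance, weights, modified object data, action) can be illustrated with a minimal sketch. Everything below is a stand-in chosen for illustration and is not taken from the patent: the detectors are plain functions rather than neural networks, the instruction is a dict with a hypothetical `property` key, the weights come from a softmax (the abstract does not say how weights are derived), and the "action network" simply sums the weighted positions into a target point.

```python
import math

# Hypothetical property detectors: each maps one object to a score in [0, 1].
# In the abstract these are neural networks; plain functions stand in here.
def detect_red(obj):
    return 1.0 if obj["color"] == "red" else 0.0

def detect_small(obj):
    return 1.0 if obj["size"] < 1.0 else 0.0

DETECTORS = {"red": detect_red, "small": detect_small}

def relevance(objects, instruction):
    """Relevance data item: one score per object for the instructed property."""
    detector = DETECTORS[instruction["property"]]
    return [detector(obj) for obj in objects]

def softmax(scores, temperature=0.1):
    """Turn relevance scores into weights that sum to 1 (an assumed choice)."""
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def modify(objects, weights):
    """Modified data: each object's position vector scaled by its weight."""
    return [[w * x for x in obj["position"]] for obj, w in zip(objects, weights)]

def act(modified):
    """Toy stand-in for the action network: sum weighted positions into a target."""
    return [sum(axis) for axis in zip(*modified)]

def run(objects, instruction):
    weights = softmax(relevance(objects, instruction))
    return weights, act(modify(objects, weights))
```

With a red object at `[2.0, 0.0]` and a small blue object at `[0.0, 5.0]`, the instruction `{"property": "red"}` puts nearly all weight on the first object, so the target lands close to `[2.0, 0.0]`; changing the instruction to `{"property": "small"}` shifts the weight to the second object without retraining anything, which is the "programmable" aspect the title refers to.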

2.

Patent application
PROGRAMMABLE REINFORCEMENT LEARNING SYSTEMS (Status: granted)

Publication number: US20240394504A1

Publication date: 2024-11-28

Application number: US18637279

Filing date: 2024-04-16

Applicant: DeepMind Technologies Limited

Inventors: Misha Man Ray Denil, Sergio Gomez Colmenarejo, Serkan Cabi, David William Saxton, Joao Ferdinando Gomes de Freitas

IPC: G06N3/006, G06F18/21, G06F18/2451, G06N3/045, G06N3/047, G06N3/084

Abstract: A reinforcement learning system is proposed comprising a plurality of property detector neural networks. Each property detector neural network is arranged to receive data representing an object within an environment, and to generate property data associated with a property of the object. A processor is arranged to receive an instruction indicating a task associated with an object having an associated property, and process the output of the plurality of property detector neural networks based upon the instruction to generate a relevance data item. The relevance data item indicates objects within the environment associated with the task. The processor also generates a plurality of weights based upon the relevance data item, and, based on the weights, generates modified data representing the plurality of objects within the environment. A neural network is arranged to receive the modified data and to output an action associated with the task.

3.

Patent application
NEURAL NETWORK OPTIMIZATION USING CURVATURE ESTIMATES BASED ON RECENT GRADIENTS (Status: granted)

Publication number: US20210383222A1

Publication date: 2021-12-09

Application number: US17337820

Filing date: 2021-06-03

Applicant: DeepMind Technologies Limited

Inventors: David William Saxton, Eshaan Nichani

IPC: G06N3/08, G06F17/18

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network by estimating the objective function curvature based on current and previous gradients. In one aspect, a method comprises: sampling a batch of training data; and for each neural network parameter: determining, based on the current batch of training data, a respective current gradient of the objective function at the current iteration with respect to the current neural network parameter; estimating an objective function curvature with respect to the current neural network parameter based on (i) the current gradient of the objective function at the current iteration, and (ii) a respective previous gradient of the objective function at each of a plurality of previous iterations; and updating a current value of the neural network parameter based on the estimate of the curvature of the objective function.
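The per-parameter loop in this abstract (current gradient, curvature estimate from current and previous gradients, curvature-based update) can be sketched as follows. The abstract does not state which estimator is used, so this sketch assumes a simple secant (finite-difference) estimate, which also needs the previous parameter values; that choice, along with the names `secant_curvature`, `train`, `window`, `fallback_lr`, and `min_curv`, is hypothetical. `grad_fn` stands in for computing the gradient on a sampled batch.

```python
from collections import deque

def secant_curvature(theta_i, grad_i, history_i, eps=1e-12):
    """Average |delta grad| / |delta theta| over stored (theta, grad) pairs.

    Returns None when no stored pair has a usable parameter difference.
    """
    ratios = [abs(grad_i - g_old) / abs(theta_i - t_old)
              for t_old, g_old in history_i
              if abs(theta_i - t_old) > eps]
    return sum(ratios) / len(ratios) if ratios else None

def train(grad_fn, theta, steps=5, window=3, fallback_lr=1e-3, min_curv=1e-8):
    """Update each parameter by grad / curvature, using recent gradients."""
    history = deque(maxlen=window)  # recent (theta, grad) snapshots
    for _ in range(steps):
        grad = grad_fn(theta)  # stand-in for a gradient on a sampled batch
        new_theta = []
        for i, (t, g) in enumerate(zip(theta, grad)):
            pairs = [(old_t[i], old_g[i]) for old_t, old_g in history]
            curv = secant_curvature(t, g, pairs)
            if curv is None or curv < min_curv:
                # No usable estimate yet: take a small plain gradient step.
                new_theta.append(t - fallback_lr * g)
            else:
                # Curvature-scaled step (assumed form of the update).
                new_theta.append(t - g / curv)
        history.append((theta, grad))
        theta = new_theta
    return theta

def quadratic_grad(theta, scales=(1.0, 100.0)):
    """Gradient of 0.5 * sum(a_i * theta_i ** 2), a badly conditioned toy objective."""
    return [a * t for a, t in zip(scales, theta)]
```

On the toy quadratic, `train(quadratic_grad, [1.0, 1.0])` takes one small fallback step and then recovers each parameter's curvature from the gradient change, so both coordinates move to roughly zero despite their curvatures differing by a factor of 100. On a noisy, non-quadratic objective a secant estimate like this would be far less reliable; the sketch only shows the shape of the loop.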
