-
Publication No.: US20190266449A1
Publication Date: 2019-08-29
Application No.: US16403343
Filing Date: 2019-05-03
Applicant: DeepMind Technologies Limited
Inventor: Fabio Viola , Piotr Wojciech Mirowski , Andrea Banino , Razvan Pascanu , Hubert Josef Soyer , Andrew James Ballard , Sudarshan Kumaran , Raia Thais Hadsell , Laurent Sifre , Rostislav Goroshin , Koray Kavukcuoglu , Misha Man Ray Denil
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a reinforcement learning system. In one aspect, a method of training an action selection policy neural network for use in selecting actions to be performed by an agent navigating through an environment to accomplish one or more goals comprises: receiving an observation image characterizing a current state of the environment; processing, using the action selection policy neural network, an input comprising the observation image to generate an action selection output; processing, using a geometry-prediction neural network, an intermediate output generated by the action selection policy neural network to predict a value of a feature of a geometry of the environment when in the current state; and backpropagating a gradient of a geometry-based auxiliary loss into the action selection policy neural network to determine a geometry-based auxiliary update for current values of the network parameters.
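The geometry-based auxiliary update described above can be illustrated with a deliberately tiny model. This is a scalar toy sketch, not the patented network: a single shared trunk weight `w` feeds both a policy head and a geometry-prediction head (here a fixed weight of 2.0 standing in for a learned head), and the gradient of the auxiliary loss is backpropagated into the shared weight.

```python
# Toy sketch (assumption: scalar model with a made-up fixed geometry
# head; the patent describes full neural networks). The auxiliary
# geometry loss gradient flows back into the shared trunk parameter.

def auxiliary_update(w, x, target_depth, lr=0.1):
    """One geometry-based auxiliary update on the shared trunk weight w."""
    h = w * x                        # intermediate output of the trunk
    pred = 2.0 * h                   # geometry head predicts e.g. depth
    loss = (pred - target_depth) ** 2
    # dLoss/dw = 2*(pred - target) * d(pred)/dh * dh/dw
    grad_w = 2.0 * (pred - target_depth) * 2.0 * x
    return w - lr * grad_w, loss

w = 0.5
w, loss = auxiliary_update(w, x=1.0, target_depth=3.0)
```

The key structural point is that only the trunk parameter is touched by this update; the action-selection head is trained separately by the reinforcement learning loss.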
-
Publication No.: US12020155B2
Publication Date: 2024-06-25
Application No.: US17733594
Filing Date: 2022-04-29
Applicant: DeepMind Technologies Limited
Inventor: Volodymyr Mnih , Adrià Puigdomènech Badia , Alexander Benjamin Graves , Timothy James Alexander Harley , David Silver , Koray Kavukcuoglu
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for asynchronous deep reinforcement learning. One of the systems includes a plurality of workers, wherein each worker is configured to operate independently of each other worker, and wherein each worker is associated with a respective actor that interacts with a respective replica of the environment during the training of the deep neural network.
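The worker arrangement in this abstract can be sketched with threads. This is a minimal toy (a bandit-like stand-in for the environment, invented here, and a single shared scalar instead of deep network parameters): each worker operates independently on its own environment replica and applies gradient updates to shared parameters without coordination.

```python
import random
import threading

# Toy sketch (assumption: each "environment replica" just emits random
# observations; the shared "network" is one scalar). Workers update the
# shared parameter asynchronously, Hogwild-style, with no locking.

shared = {"w": 0.0}

def worker(seed, steps=100):
    rng = random.Random(seed)          # this worker's environment replica
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)     # observation from the replica
        grad = 2.0 * (shared["w"] * x - x) * x   # pushes w toward 1.0
        shared["w"] -= 0.05 * grad

threads = [threading.Thread(target=worker, args=(s,)) for s in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Even with unsynchronized updates, the shared parameter converges because each worker's gradients point toward the same optimum.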
-
Publication No.: US11907853B2
Publication Date: 2024-02-20
Application No.: US16759567
Filing Date: 2018-10-26
Applicant: DeepMind Technologies Limited
Inventor: Chrisantha Thomas Fernando , Karen Simonyan , Koray Kavukcuoglu , Hanxiao Liu , Oriol Vinyals
IPC: G06N3/086 , G06N3/045 , G06F17/15 , G06F16/901
CPC classification number: G06N3/086 , G06F16/9024 , G06N3/045 , G06F17/15
Abstract: A computer-implemented method for automatically determining a neural network architecture represents a neural network architecture as a data structure defining a hierarchical set of directed acyclic graphs in multiple levels. Each graph has an input, an output, and a plurality of nodes between the input and the output. At each level, a corresponding set of the nodes are connected pairwise by directed edges which indicate operations performed on outputs of one node to generate an input to another node. Each level is associated with a corresponding set of operations. At a lowest level, the operations associated with each edge are selected from a set of primitive operations. The method includes repeatedly generating new sample neural network architectures by modifying existing ones, and evaluating their fitness. A modification is performed by selecting a level, selecting two nodes at that level, and modifying, removing or adding an edge between those nodes according to operations associated with lower levels of the hierarchy.
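The mutation step can be sketched on a tiny data structure. This is an illustrative toy (the primitive operation names and the two-level hierarchy are made up for the example, not taken from the patent): a motif is a DAG whose edges are labelled with operations from the level below, and a mutation picks two nodes and rewrites or adds the edge between them.

```python
import random

# Illustrative sketch (assumption: invented primitive names; the
# patented search space is larger and multi-level). A motif is a DAG:
# a node set plus edges (a, b) with a < b, labelled with operations.

PRIMITIVES = ["identity", "conv3x3", "maxpool"]   # lowest-level ops

def mutate(graph, ops, rng):
    """Select two nodes and modify or add the edge between them."""
    a, b = sorted(rng.sample(sorted(graph["nodes"]), 2))
    graph["edges"][(a, b)] = rng.choice(ops)      # relabel or create
    return graph

rng = random.Random(0)
motif = {"nodes": {0, 1, 2},
         "edges": {(0, 1): "conv3x3", (1, 2): "identity"}}
mutate(motif, PRIMITIVES, rng)
```

Ordering nodes as `a < b` keeps every edge forward-pointing, so the graph stays acyclic under any sequence of mutations.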
-
Publication No.: US11734572B2
Publication Date: 2023-08-22
Application No.: US16995307
Filing Date: 2020-08-17
Applicant: DeepMind Technologies Limited
Inventor: Maxwell Elliot Jaderberg , Karen Simonyan , Andrew Zisserman , Koray Kavukcuoglu
CPC classification number: G06N3/084 , G06N3/045 , G06N3/088 , G06V10/454
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing inputs using an image processing neural network system that includes a spatial transformer module. One of the methods includes receiving an input feature map derived from one or more input images, and applying a spatial transformation to the input feature map to generate a transformed feature map, comprising: processing the input feature map to generate spatial transformation parameters for the spatial transformation, and sampling from the input feature map in accordance with the spatial transformation parameters to generate the transformed feature map.
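The two-step structure of the abstract (predict transformation parameters, then sample at transformed coordinates) can be shown on a 1-D toy. This sketch is not the patented module: the real spatial transformer is differentiable, 2-D, and learned, while here "localisation" just finds the peak and the transform is an integer circular shift with nearest-neighbour sampling.

```python
# Toy sketch (assumptions: 1-D feature map, pure-translation transform,
# nearest-neighbour sampling; the patent's module is differentiable
# and operates on 2-D feature maps with learned parameters).

def localise(fmap):
    """Predict the spatial transformation parameter (here: a shift)."""
    return fmap.index(max(fmap))       # shift that moves the peak to 0

def sample(fmap, shift):
    """Sample the input map at the transformed coordinates."""
    n = len(fmap)
    return [fmap[(i + shift) % n] for i in range(n)]

fmap = [0.0, 0.1, 0.9, 0.2]
shift = localise(fmap)
out = sample(fmap, shift)              # peak re-centred at index 0
```

Separating parameter prediction from sampling is what lets the real module be trained end-to-end: both steps are differentiable, so gradients flow through the sampling grid back to the localisation network.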
-
Publication No.: US11593646B2
Publication Date: 2023-02-28
Application No.: US16767049
Filing Date: 2019-02-05
Applicant: DeepMind Technologies Limited
Inventor: Hubert Josef Soyer , Lasse Espeholt , Karen Simonyan , Yotam Doron , Vlad Firoiu , Volodymyr Mnih , Koray Kavukcuoglu , Remi Munos , Thomas Ward , Timothy James Alexander Harley , Iain Robert Dunning
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.
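The actor/learner split described above can be sketched with a queue. This is a minimal toy (a scalar "policy", a trivial reward, and no off-policy correction; the patent's system uses an off-policy actor-critic technique such as V-trace, which is omitted here): actors roll out experience-tuple trajectories with a copy of the learner parameters and enqueue them, and the learner consumes trajectories to update its own parameters.

```python
from collections import deque

# Toy sketch (assumptions: scalar policy parameter, reward equal to the
# observation, no importance-weighted correction). Actors produce
# trajectories of (observation, action, reward) tuples for the learner.

queue = deque()
learner_w = 0.0

def actor(actor_w, env_xs):
    """Generate one experience-tuple trajectory with stale parameters."""
    traj = [(x, actor_w * x, x) for x in env_xs]   # (obs, action, reward)
    queue.append(traj)

def learner_step(w, lr=0.1):
    """Consume one trajectory and update the learner parameters."""
    traj = queue.popleft()
    for obs, act, rew in traj:
        w += lr * (rew - w * obs) * obs            # regress toward reward
    return w

actor(learner_w, [1.0, 0.5])
learner_w = learner_step(learner_w)
```

Because actors act on stale parameter copies, the data is off-policy with respect to the learner, which is exactly why the real technique needs an off-policy correction.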
-
Publication No.: US11507827B2
Publication Date: 2022-11-22
Application No.: US16601455
Filing Date: 2019-10-14
Applicant: DeepMind Technologies Limited
Inventor: Praveen Deepak Srinivasan , Rory Fearon , Cagdas Alcicek , Arun Sarath Nair , Samuel Blackwell , Vedavyas Panneershelvam , Alessandro De Maria , Volodymyr Mnih , Koray Kavukcuoglu , David Silver , Mustafa Suleyman
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for distributed training of reinforcement learning systems. One of the methods includes receiving, by a learner, current values of the parameters of the Q network from a parameter server, wherein each learner maintains a respective learner Q network replica and a respective target Q network replica; updating, by the learner, the parameters of the learner Q network replica maintained by the learner using the current values; selecting, by the learner, an experience tuple from a respective replay memory; computing, by the learner, a gradient from the experience tuple using the learner Q network replica maintained by the learner and the target Q network replica maintained by the learner; and providing, by the learner, the computed gradient to the parameter server.
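The learner's step in this abstract can be traced with a one-parameter stand-in for the Q network. This is a toy sketch (Q(s) = w·s with a single replay tuple, both invented here; the patented system shards real networks across machines): the learner pulls current parameters, computes a TD gradient using its learner replica and its target replica, and hands the gradient back to the parameter server.

```python
# Toy sketch (assumptions: one-parameter "Q network" Q(s) = w * s and a
# single hand-made replay tuple; the real system distributes full
# networks). The target replica supplies the bootstrap value.

server_w = 0.5               # current parameters on the parameter server
target_w = 0.5               # target Q network replica (refreshed rarely)
replay = [(1.0, 1.0, 0.5)]   # (state, reward, next_state)
GAMMA = 0.9

def learner_gradient(w, w_target, tup):
    """TD gradient from one experience tuple, using the target replica."""
    s, r, s2 = tup
    td_target = r + GAMMA * w_target * s2   # computed with target replica
    return 2.0 * (w * s - td_target) * s    # d/dw of (Q(s) - target)^2

grad = learner_gradient(server_w, target_w, replay[0])
server_w -= 0.1 * grad       # parameter server applies the gradient
```

Keeping the bootstrap target on a separate, slowly updated replica is what stabilizes the regression target while many learners push gradients concurrently.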
-
Publication No.: US20210201116A1
Publication Date: 2021-07-01
Application No.: US17201542
Filing Date: 2021-03-15
Applicant: DeepMind Technologies Limited
Inventor: Neil Charles Rabinowitz , Guillaume Desjardins , Andrei-Alexandru Rusu , Koray Kavukcuoglu , Raia Thais Hadsell , Razvan Pascanu , James Kirkpatrick , Hubert Josef Soyer
Abstract: Methods and systems for performing a sequence of machine learning tasks. One system includes a sequence of deep neural networks (DNNs), including: a first DNN corresponding to a first machine learning task, wherein the first DNN comprises a first plurality of indexed layers, and each layer in the first plurality of indexed layers is configured to receive a respective layer input and process the layer input to generate a respective layer output; and one or more subsequent DNNs corresponding to one or more respective machine learning tasks, wherein each subsequent DNN comprises a respective plurality of indexed layers, and each layer in a respective plurality of indexed layers with index greater than one receives input from a preceding layer of the respective subsequent DNN, and one or more preceding layers of respective preceding DNNs, wherein a preceding layer is a layer whose index is one less than the current index.
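The lateral-connection wiring in this abstract can be sketched with linear "columns". This is a minimal toy (two-layer columns with made-up weights; the patent describes full DNNs with learned adapters): layer i of a later column takes input from its own layer i−1 and from layer i−1 of every earlier column.

```python
# Toy sketch (assumptions: scalar linear layers with invented weights).
# Each new column receives lateral input from the preceding layers of
# all earlier (frozen) columns, as in the abstract.

def run_column(x, weights, earlier_activations):
    """Return this column's per-layer activations."""
    acts, h = [], x
    for i, w in enumerate(weights):
        # Layers with index > 0 also read layer i-1 of earlier columns.
        lateral = sum(col[i - 1] for col in earlier_activations) if i > 0 else 0.0
        h = w * (h + lateral)
        acts.append(h)
    return acts

col1 = run_column(1.0, [2.0, 3.0], [])        # first task: no laterals
col2 = run_column(1.0, [1.0, 1.0], [col1])    # second task: laterals from col1
```

Since earlier columns are frozen, the new task can reuse their features through the lateral connections without overwriting what was learned before.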
-
Publication No.: US11049008B2
Publication Date: 2021-06-29
Application No.: US15619393
Filing Date: 2017-06-09
Applicant: DeepMind Technologies Limited
Inventor: Volodymyr Mnih , Koray Kavukcuoglu
Abstract: We describe a method of reinforcement learning for a subject system having multiple states, and actions that move it from one state to the next. Training data is generated by operating on the system with a succession of actions and used to train a second neural network. Target values for training the second neural network are derived from a first neural network which is generated by copying weights of the second neural network at intervals.
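The periodic-copy mechanism in this abstract can be traced with one-weight "networks". This is a toy sketch (invented scalar update rule and copy interval; the patent covers full Q-networks): the second network is trained toward targets computed from the first network, and the first network's weights are refreshed by copying the second's at fixed intervals.

```python
# Toy sketch (assumptions: one-weight networks, a made-up scalar
# update, and a copy interval of 3 steps). Targets come from the
# frozen first network; training happens on the second.

COPY_EVERY = 3
second_w = 0.0               # trained network
first_w = 0.0                # frozen copy used only for targets

for step in range(1, 7):
    target = 1.0 + 0.9 * first_w                # bootstrap from the copy
    second_w += 0.5 * (target - second_w)       # train toward the target
    if step % COPY_EVERY == 0:
        first_w = second_w                      # copy weights at intervals
```

Freezing the target network between copies keeps the regression target fixed for several steps, which damps the feedback loop of chasing a moving target.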
-
Publication No.: US20210034909A1
Publication Date: 2021-02-04
Application No.: US16995307
Filing Date: 2020-08-17
Applicant: DeepMind Technologies Limited
Inventor: Maxwell Elliot Jaderberg , Karen Simonyan , Andrew Zisserman , Koray Kavukcuoglu
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing inputs using an image processing neural network system that includes a spatial transformer module. One of the methods includes receiving an input feature map derived from one or more input images, and applying a spatial transformation to the input feature map to generate a transformed feature map, comprising: processing the input feature map to generate spatial transformation parameters for the spatial transformation, and sampling from the input feature map in accordance with the spatial transformation parameters to generate the transformed feature map.
-
Publication No.: US10748041B1
Publication Date: 2020-08-18
Application No.: US16250320
Filing Date: 2019-01-17
Applicant: DeepMind Technologies Limited
Inventor: Volodymyr Mnih , Koray Kavukcuoglu
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images using recurrent attention. One of the methods includes determining a location in the first image; extracting a glimpse from the first image using the location; generating a glimpse representation of the extracted glimpse; processing the glimpse representation using a recurrent neural network to update a current internal state of the recurrent neural network to generate a new internal state; processing the new internal state to select a location in a next image in the image sequence after the first image; and processing the new internal state to select an action from a predetermined set of possible actions.
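The glimpse loop in this abstract can be sketched on a 1-D toy. This is not the patented model: the real system uses a learned recurrent neural network and a learned location policy, while here "images" are short lists, the glimpse is a fixed-width slice, the state update is a running sum, and the next location is derived from the state by a made-up rule.

```python
# Toy sketch (assumptions: 1-D "images", width-2 glimpses, a running
# sum as the recurrent state, and an invented next-location rule; the
# patent uses a learned RNN and policy).

def glimpse(image, loc, width=2):
    """Extract a glimpse from the image at the chosen location."""
    return image[loc:loc + width]

def step(image, loc, state):
    """One attention step: glimpse, update state, pick next location."""
    g = glimpse(image, loc)
    new_state = state + sum(g)                    # recurrent state update
    next_loc = int(new_state) % (len(image) - 1)  # next-location "policy"
    return new_state, next_loc

image = [1, 0, 2, 0, 3]
state, loc = 0, 0
for _ in range(3):
    state, loc = step(image, loc, state)
```

The essential loop structure matches the abstract: each glimpse updates the internal state, and the new state drives both where to look next and (in the real model) what action to take.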
-