-
Publication No.: US20210256072A1
Publication Date: 2021-08-19
Application No.: US17177097
Filing Date: 2021-02-16
Applicant: DeepMind Technologies Limited
Inventor: Timothy Arthur Mann , Ivan Lobov , Anton Zhernov , Krishnamurthy Dvijotham , Xiaohong Gong , Dan-Andrei Calian
IPC: G06F16/903 , G06F17/16 , G06F17/11
Abstract: Methods and systems for low-latency multi-constraint ranking of content items. One of the methods includes receiving a request to rank a plurality of content items for presentation to a user to maximize a primary objective subject to a plurality of constraints; initializing a dual variable vector; updating the dual variable vector, comprising: determining an overall objective score for the dual variable vector; identifying a plurality of candidate dual variable vectors that includes one or more neighboring node dual variable vectors; determining respective overall objective scores for each of the one or more candidate dual variable vectors; identifying the candidate with the best overall objective score; and determining whether to update the dual variable vector based on whether the identified candidate has a better overall objective score than the dual variable vector; and determining a final ranking for the content items based on the dual variable vector.
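The claimed update loop amounts to a greedy local search over a grid of dual variable vectors. Below is a minimal Python sketch of that idea, assuming Lagrangian-style item scoring, a fixed grid step, and a simple penalty-based overall objective; the function names, the top-k objective, and the non-negativity clipping are illustrative assumptions, not the patented method.

```python
import numpy as np

def rank_items(primary, constraint_scores, lam):
    """Rank items by a Lagrangian-combined score: primary + lam . constraints."""
    combined = primary + constraint_scores @ lam
    return np.argsort(-combined)  # best first

def overall_objective(primary, constraint_scores, thresholds, lam, top_k):
    """Score a dual vector: primary objective of the induced top-k ranking,
    minus penalties for violated constraints (illustrative scoring only)."""
    order = rank_items(primary, constraint_scores, lam)[:top_k]
    obj = primary[order].sum()
    violation = np.maximum(thresholds - constraint_scores[order].sum(axis=0), 0.0)
    return obj - violation.sum()

def local_search(primary, constraint_scores, thresholds, top_k,
                 grid_step=0.5, num_steps=20):
    """Greedy local search over a lattice of dual vectors: at each step,
    move to the neighboring grid point with the best overall objective."""
    num_constraints = constraint_scores.shape[1]
    lam = np.zeros(num_constraints)          # initialize the dual variable vector
    best = overall_objective(primary, constraint_scores, thresholds, lam, top_k)
    for _ in range(num_steps):
        # Candidate dual vectors: +/- one grid step along each coordinate (clipped at 0).
        candidates = []
        for j in range(num_constraints):
            for delta in (-grid_step, grid_step):
                cand = lam.copy()
                cand[j] = max(0.0, cand[j] + delta)
                candidates.append(cand)
        scores = [overall_objective(primary, constraint_scores, thresholds, c, top_k)
                  for c in candidates]
        i = int(np.argmax(scores))
        if scores[i] <= best:                # no neighbor improves: stop
            break
        lam, best = candidates[i], scores[i]
    # Final ranking is determined by the resulting dual variable vector.
    return rank_items(primary, constraint_scores, lam)
```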
-
Publication No.: US20220343157A1
Publication Date: 2022-10-27
Application No.: US17620164
Filing Date: 2020-06-17
Applicant: DEEPMIND TECHNOLOGIES LIMITED
Inventor: Daniel J. Mankowitz , Nir Levine , Rae Chan Jeong , Abbas Abdolmaleki , Jost Tobias Springenberg , Todd Andrew Hester , Timothy Arthur Mann , Martin Riedmiller
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a policy neural network having policy parameters. One of the methods includes sampling a mini-batch comprising one or more observation-action-reward tuples generated as a result of interactions of a first agent with a first environment; determining an update to current values of the Q network parameters by minimizing a robust entropy-regularized temporal difference (TD) error that accounts for possible perturbations of the states of the first environment represented by the observations in the observation-action-reward tuples; and determining, using the Q-value neural network, an update to the policy network parameters using the sampled mini-batch of observation-action-reward tuples.
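As a rough illustration of a robust, entropy-regularized TD error, the sketch below takes the worst-case soft value over random perturbations of the next observation. It assumes the sampled tuples also carry a next observation, that the policy network outputs action probabilities, and that perturbations are drawn uniformly from an epsilon-ball; the actual perturbation model and value backup in the described method may differ.

```python
import torch

def robust_soft_td_error(q_net, target_q_net, policy, batch,
                         gamma=0.99, alpha=0.1, epsilon=0.05, num_perturbations=8):
    """Robust entropy-regularized TD error on a mini-batch (illustrative sketch).

    Robustness is approximated by taking the worst (lowest) soft value over
    random perturbations of the next observation within an epsilon-ball.
    """
    obs, action, reward, next_obs = batch
    q_pred = q_net(obs).gather(1, action.unsqueeze(1)).squeeze(1)

    with torch.no_grad():
        worst_value = None
        for _ in range(num_perturbations):
            noise = (torch.rand_like(next_obs) * 2 - 1) * epsilon
            probs = policy(next_obs + noise)           # assumed: action probabilities
            q_next = target_q_net(next_obs + noise)
            # Entropy-regularized (soft) state value under the current policy.
            entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)
            v = (probs * q_next).sum(dim=1) + alpha * entropy
            worst_value = v if worst_value is None else torch.minimum(worst_value, v)
        target = reward + gamma * worst_value          # worst-case bootstrapped target

    return torch.nn.functional.mse_loss(q_pred, target)
```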
-
Publication No.: US20210158196A1
Publication Date: 2021-05-27
Application No.: US17103843
Filing Date: 2020-11-24
Applicant: DeepMind Technologies Limited
Inventor: Claire Vernade , András György , Timothy Arthur Mann
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, of selecting actions from a set of actions to be performed in an environment. One of the methods includes, at each time step: maintaining count data; determining, for each action, a respective current transition probability distribution that includes a respective current transition probability for each of the intermediate signals that represents an estimate of a current likelihood that the intermediate signal will be observed if the action is performed; determining, for each intermediate signal, a respective reward estimate that is an estimate of a reward that will be received as a result of the intermediate signal being observed; determining, from the respective current transition probability distributions and the respective reward estimates, a respective action score for each action; and selecting an action to be performed based on the respective action scores.
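A compressed way to read the claim: keep counts of which intermediate signal each action produces, estimate a per-signal reward, and score each action by its expected reward under the estimated transition model. The Python sketch below implements that reading with simple count-based estimators; the class name, the Laplace-style priors, and the update rule are illustrative assumptions rather than the claimed procedure.

```python
import numpy as np

class IntermediateSignalBandit:
    """Action selection via estimated transition probabilities over
    intermediate signals and per-signal reward estimates (simplified sketch)."""

    def __init__(self, num_actions, num_signals):
        # Count data: how often each (action, signal) pair has been observed.
        self.counts = np.ones((num_actions, num_signals))   # Laplace-style prior
        self.reward_sums = np.zeros(num_signals)
        self.signal_counts = np.ones(num_signals)

    def action_scores(self):
        # Current transition probability distribution for each action.
        trans_probs = self.counts / self.counts.sum(axis=1, keepdims=True)
        # Reward estimate for each intermediate signal.
        reward_est = self.reward_sums / self.signal_counts
        # Action score = expected reward under the estimated transition model.
        return trans_probs @ reward_est

    def select_action(self):
        return int(np.argmax(self.action_scores()))

    def update(self, action, signal, reward=None):
        self.counts[action, signal] += 1
        if reward is not None:            # the reward may arrive later than the signal
            self.reward_sums[signal] += reward
            self.signal_counts[signal] += 1
```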
-
Publication No.: US12254678B2
Publication Date: 2025-03-18
Application No.: US17711951
Filing Date: 2022-04-01
Applicant: DeepMind Technologies Limited
Inventor: Dan-Andrei Calian , Sven Adrian Gowal , Timothy Arthur Mann , András György
IPC: G06V10/774 , G06V10/776 , G06V10/82
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for processing a network input using a trained neural network with network parameters to generate an output for a machine learning task. The training includes: receiving a set of training examples each including a training network input and a reference output; for each training iteration, generating a corrupted network input for each training network input using a corruption neural network; updating perturbation parameters of the corruption neural network using a first objective function based on the corrupted network inputs; generating an updated corrupted network input for each training network input based on the updated perturbation parameters; and generating a network output for each updated corrupted network input using the neural network; for each training example, updating the network parameters using a second objective function based on the network output and the reference output.
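The training described alternates between updating the corruption network and the main network. A hedged PyTorch-style sketch of one such iteration is below, assuming a classification task, a cross-entropy loss, and an adversarial first objective (the corruption network is rewarded for increasing the task loss); the actual objective functions in the patent may differ.

```python
import torch

def train_step(model, corruption_net, batch, model_opt, corr_opt):
    """One training iteration with a learned corruption network (sketch)."""
    inputs, targets = batch
    task_loss = torch.nn.functional.cross_entropy

    # 1) Update the perturbation parameters of the corruption network
    #    using a first objective based on the corrupted inputs.
    corrupted = corruption_net(inputs)
    first_objective = -task_loss(model(corrupted), targets)   # adversarial: raise the task loss
    corr_opt.zero_grad()
    first_objective.backward()
    corr_opt.step()

    # 2) Regenerate corrupted inputs with the updated perturbation parameters,
    #    then update the main network with a second objective against the references.
    with torch.no_grad():
        corrupted = corruption_net(inputs)
    second_objective = task_loss(model(corrupted), targets)
    model_opt.zero_grad()
    second_objective.backward()
    model_opt.step()
    return second_objective.item()
```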
-
Publication No.: US20230244912A1
Publication Date: 2023-08-03
Application No.: US18131580
Filing Date: 2023-04-06
Applicant: DeepMind Technologies Limited
Inventor: Huiyi Hu , Ray Jiang , Timothy Arthur Mann , Sven Adrian Gowal , Balaji Lakshminarayanan , András György
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for learning from delayed outcomes using neural networks. One of the methods includes receiving an input observation; generating, from the input observation, an output label distribution over possible labels for the input observation at a final time, comprising: processing the input observation using a first neural network configured to process the input observation to generate a distribution over possible values for an intermediate indicator at a first time earlier than the final time; generating, from the distribution, an input value for the intermediate indicator; and processing the input value for the intermediate indicator using a second neural network configured to process the input value for the intermediate indicator to determine the output label distribution over possible values for the input observation at the final time; and providing an output derived from the output label distribution.
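In outline, the described pipeline chains two networks: one that predicts a distribution over an intermediate indicator observed before the final outcome, and one that maps that indicator to the final label distribution. The sketch below is a minimal PyTorch rendering of that structure; the layer sizes, the softmax heads, and the choice to feed the indicator distribution directly into the second network are assumptions made for illustration only.

```python
import torch

class DelayedOutcomePredictor(torch.nn.Module):
    """Two-stage predictor: observation -> intermediate indicator distribution
    (first network), then indicator -> final label distribution (second network)."""

    def __init__(self, obs_dim, num_indicator_values, num_labels, hidden=64):
        super().__init__()
        self.first_net = torch.nn.Sequential(
            torch.nn.Linear(obs_dim, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, num_indicator_values))
        self.second_net = torch.nn.Sequential(
            torch.nn.Linear(num_indicator_values, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, num_labels))

    def forward(self, observation):
        # Distribution over possible values of the intermediate indicator
        # at a time earlier than the final outcome.
        indicator_dist = torch.softmax(self.first_net(observation), dim=-1)
        # One choice of "input value" for the indicator: feed the full
        # distribution into the second network (an assumption of this sketch).
        label_logits = self.second_net(indicator_dist)
        # Output label distribution for the observation at the final time.
        return torch.softmax(label_logits, dim=-1)
```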
-
Publication No.: US12124938B2
Publication Date: 2024-10-22
Application No.: US18131580
Filing Date: 2023-04-06
Applicant: DeepMind Technologies Limited
Inventor: Huiyi Hu , Ray Jiang , Timothy Arthur Mann , Sven Adrian Gowal , Balaji Lakshminarayanan , András György
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for learning from delayed outcomes using neural networks. One of the methods includes receiving an input observation; generating, from the input observation, an output label distribution over possible labels for the input observation at a final time, comprising: processing the input observation using a first neural network configured to process the input observation to generate a distribution over possible values for an intermediate indicator at a first time earlier than the final time; generating, from the distribution, an input value for the intermediate indicator; and processing the input value for the intermediate indicator using a second neural network configured to process the input value for the intermediate indicator to determine the output label distribution over possible values for the input observation at the final time; and providing an output derived from the output label distribution.
-
Publication No.: US11714994B2
Publication Date: 2023-08-01
Application No.: US16298448
Filing Date: 2019-03-11
Applicant: DeepMind Technologies Limited
Inventor: Huiyi Hu , Ray Jiang , Timothy Arthur Mann , Sven Adrian Gowal , Balaji Lakshminarayanan , András György
CPC classification number: G06N3/0454 , G06N3/0472 , G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for learning from delayed outcomes using neural networks. One of the methods includes receiving an input observation; generating, from the input observation, an output label distribution over possible labels for the input observation at a final time, comprising: processing the input observation using a first neural network configured to process the input observation to generate a distribution over possible values for an intermediate indicator at a first time earlier than the final time; generating, from the distribution, an input value for the intermediate indicator; and processing the input value for the intermediate indicator using a second neural network configured to process the input value for the intermediate indicator to determine the output label distribution over possible values for the input observation at the final time; and providing an output derived from the output label distribution.
-
Publication No.: US12001484B2
Publication Date: 2024-06-04
Application No.: US17177097
Filing Date: 2021-02-16
Applicant: DeepMind Technologies Limited
Inventor: Timothy Arthur Mann , Ivan Lobov , Anton Zhernov , Krishnamurthy Dvijotham , Xiaohong Gong , Dan-Andrei Calian
IPC: G06F16/95 , G06F16/903 , G06F17/11 , G06F17/16
CPC classification number: G06F16/90335 , G06F17/11 , G06F17/16
Abstract: Methods and systems for low-latency multi-constraint ranking of content items. One of the methods includes receiving a request to rank a plurality of content items for presentation to a user to maximize a primary objective subject to a plurality of constraints; initializing a dual variable vector; updating the dual variable vector, comprising: determining an overall objective score for the dual variable vector; identifying a plurality of candidate dual variable vectors that includes one or more neighboring node dual variable vectors; determining respective overall objective scores for each of the one or more candidate dual variable vectors; identifying the candidate with the best overall objective score; and determining whether to update the dual variable vector based on whether the identified candidate has a better overall objective score than the dual variable vector; and determining a final ranking for the content items based on the dual variable vector.
-
Publication No.: US20230316729A1
Publication Date: 2023-10-05
Application No.: US17711951
Filing Date: 2022-04-01
Applicant: DeepMind Technologies Limited
Inventor: Dan-Andrei Calian , Sven Adrian Gowal , Timothy Arthur Mann , András György
IPC: G06V10/774 , G06V10/82 , G06V10/776
CPC classification number: G06V10/7747 , G06V10/82 , G06V10/776
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for processing a network input using a trained neural network with network parameters to generate an output for a machine learning task. The training includes: receiving a set of training examples each including a training network input and a reference output; for each training iteration, generating a corrupted network input for each training network input using a corruption neural network; updating perturbation parameters of the corruption neural network using a first objective function based on the corrupted network inputs; generating an updated corrupted network input for each training network input based on the updated perturbation parameters; and generating a network output for each updated corrupted network input using the neural network; for each training example, updating the network parameters using a second objective function based on the network output and the reference output.
-
Publication No.: US20190279076A1
Publication Date: 2019-09-12
Application No.: US16298448
Filing Date: 2019-03-11
Applicant: DeepMind Technologies Limited
Inventor: Huiyi Hu , Ray Jiang , Timothy Arthur Mann , Sven Adrian Gowal , Balaji Lakshminarayanan , Andras Gyorgy
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for learning from delayed outcomes using neural networks. One of the methods includes receiving an input observation; generating, from the input observation, an output label distribution over possible labels for the input observation at a final time, comprising: processing the input observation using a first neural network configured to process the input observation to generate a distribution over possible values for an intermediate indicator at a first time earlier than the final time; generating, from the distribution, an input value for the intermediate indicator; and processing the input value for the intermediate indicator using a second neural network configured to process the input value for the intermediate indicator to determine the output label distribution over possible values for the input observation at the final time; and providing an output derived from the output label distribution.