Patent search ap:("DeepMind Technologies Limited") AND inv:"Marcus Hutter"

1.

Invention application
GATED LINEAR CONTEXTUAL BANDITS (status: granted)

Publication (announcement) number: US20230079338A1

Publication (announcement) date: 2023-03-16

Application number: US17766854

Filing date: 2020-10-08

Applicant: DeepMind Technologies Limited

Inventor: Eren Sezener, Joel William Veness, Marcus Hutter, Jianan Wang, David Budden

IPC: G06N3/00, G06N3/063

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable storage media, for training a neural network to control a real-world agent interacting with a real-world environment to cause the real-world agent to perform a particular task. One of the methods includes training the neural network to determine first values of the parameters by optimizing a first task-specific objective that measures a performance of the policy neural network in controlling a simulated version of the real-world agent; obtaining real-world data generated from interactions of the real-world agent with the real-world environment; and training the neural network to determine trained values of the parameters from the first values of the parameters by jointly optimizing (i) a self-supervised objective that measures at least a performance of internal representations generated by the neural network on a self-supervised task performed on the real-world data and (ii) a second task-specific objective.

2.

Invention publication
DETERMINING STATIONARY POINTS OF A LOSS FUNCTION USING CLIPPED AND UNBIASED GRADIENTS (status: under examination, published)

Publication (announcement) number: US20240256861A1

Publication (announcement) date: 2024-08-01

Application number: US18424545

Filing date: 2024-01-26

Applicant: DeepMind Technologies Limited

Inventor: Marcus Hutter, Bryn Hayeder Khalid Elesedy

IPC: G06N3/08

CPC classification number: G06N3/08

Abstract: A method of optimizing a loss function defined by one or more numerical parameters is provided. The method comprises determining initial values of the parameters, and performing a plurality of training iterations. Each training iteration except the first comprises (i) determining a gradient of the loss function associated with the parameters, (ii) obtaining a clipped value generated in a previous training iteration, (iii) additively combining the gradient and the clipped value to generate a modified gradient, (iv) processing, using a clipping function based on a threshold value, the modified gradient to generate a clipped gradient, (v) updating the value of the one or more parameters based on the clipped gradient, and (vi) storing, as the clipped value for use in a next training iteration, a difference between the modified gradient and the clipped gradient.
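
Steps (i) to (vi) of this abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: it assumes a single scalar parameter, clipping to the interval [-threshold, threshold], and a carried value of zero for the first iteration. The names `clipped_carry_step` and `minimize` are hypothetical.

```python
from typing import Callable, List, Tuple


def clipped_carry_step(grad: float, carry: float, threshold: float) -> Tuple[float, float]:
    """One iteration of steps (iii), (iv) and (vi) for a scalar gradient."""
    modified = grad + carry                               # (iii) add the carried value
    clipped = max(-threshold, min(threshold, modified))   # (iv) clip (assumed: interval clipping)
    return clipped, modified - clipped                    # (vi) remainder carried to the next step


def minimize(grad_fn: Callable[[float], float], x0: float, lr: float,
             threshold: float, steps: int) -> Tuple[float, float, List[float], List[float]]:
    """Run the loop; also return raw and clipped gradients for inspection."""
    x, carry = x0, 0.0          # assumption: nothing is carried into the first iteration
    raw, applied = [], []
    for _ in range(steps):
        g = grad_fn(x)                                            # (i) gradient
        clipped, carry = clipped_carry_step(g, carry, threshold)  # (ii)-(iv), (vi)
        x -= lr * clipped                                         # (v) parameter update
        raw.append(g)
        applied.append(clipped)
    return x, carry, raw, applied


# Usage: a gradient of 3.0 with threshold 1.0 is applied over three steps.
carry = 0.0
updates = []
for g in [3.0, 0.0, 0.0, 0.0]:
    step, carry = clipped_carry_step(g, carry, threshold=1.0)
    updates.append(step)
print(updates, carry)
```

In this sketch, the clipped updates plus the final carried value sum to the raw gradients, so nothing removed by clipping is discarded; it is only delayed.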

3.

Invention publication
EVALUATING REPRESENTATIONS WITH READ-OUT MODEL SWITCHING (status: under examination, published)

Publication (announcement) number: US20240119302A1

Publication (announcement) date: 2024-04-11

Application number: US18475972

Filing date: 2023-09-27

Applicant: DeepMind Technologies Limited

Inventor: Yazhe Li, Jorg Bornschein, Marcus Hutter

IPC: G06N3/092

CPC classification number: G06N3/092

Abstract: A method of automatically selecting a neural network from a plurality of computer-implemented candidate neural networks, each candidate neural network comprising at least an encoder neural network trained to encode an input value as a latent representation. The method comprises: obtaining a sequence of data items, each of the data items comprising an input value and a target value; and determining a respective score for each of the candidate neural networks, comprising evaluating the encoder neural network of the candidate neural network using a plurality of read-out heads. Each read-out head comprises parameters for predicting a target value from a latent representation of an input value of a data item encoded using the encoder neural network of the candidate neural network. The method further comprises selecting the neural network from the plurality of candidate neural networks using the respective scores.
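
The selection procedure in this abstract can be sketched roughly as follows. The abstract does not say how the read-out heads are trained or how their results are combined into a score, so everything specific here is an assumption made for illustration: scalar latents, linear heads trained online by gradient descent with different learning rates, predict-then-update evaluation over the sequence, and a score equal to the lowest cumulative squared error among the heads (a simple stand-in, not the model switching named in the title). `LinearReadOut`, `score_encoder` and `select_encoder` are hypothetical names.

```python
from typing import Callable, Dict, Sequence, Tuple

Encoder = Callable[[float], float]


class LinearReadOut:
    """Hypothetical read-out head: predicts the target as w * latent."""

    def __init__(self, lr: float) -> None:
        self.lr = lr
        self.w = 0.0

    def predict(self, z: float) -> float:
        return self.w * z

    def update(self, z: float, y: float) -> None:
        # One gradient step on the squared error (w * z - y) ** 2.
        self.w -= self.lr * 2.0 * (self.w * z - y) * z


def score_encoder(encoder: Encoder, data: Sequence[Tuple[float, float]],
                  head_lrs: Sequence[float] = (0.01, 0.1)) -> float:
    """Lower is better: smallest cumulative online loss among the heads."""
    heads = [LinearReadOut(lr) for lr in head_lrs]
    losses = [0.0] * len(heads)
    for x, y in data:
        z = encoder(x)                                # latent representation
        for i, head in enumerate(heads):
            losses[i] += (head.predict(z) - y) ** 2   # predict before seeing y
            head.update(z, y)                         # then learn from the item
    return min(losses)


def select_encoder(candidates: Dict[str, Encoder],
                   data: Sequence[Tuple[float, float]]) -> Tuple[str, Dict[str, float]]:
    """Score every candidate and return the name with the lowest score."""
    scores = {name: score_encoder(enc, data) for name, enc in candidates.items()}
    return min(scores, key=scores.get), scores


# Usage: targets depend linearly on the input, so an encoder that keeps the
# input should beat one that throws it away.
data = [(x, 2.0 * x) for x in [1.0, -1.0] * 20]
candidates = {"identity": lambda x: x, "constant": lambda x: 1.0}
best, scores = select_encoder(candidates, data)
print(best)
```

Using several heads with different learning rates means a candidate is not penalized because one particular read-out configuration happens to suit its latents poorly.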
