Patent search ap:("DeepMind Technologies Limited") AND inv:"Joel William Veness" Page 1

1.

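Granted invention patent
Gated linear networks (status: active)

Publication (announcement) number: US11842264B2

Publication (announcement) date: 2023-12-12

Application number: US16759993

Application date: 2018-11-30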

Applicant: DeepMind Technologies Limited

Inventor: Agnieszka Grabska-Barwinska, Peter Toth, Christopher Mattern, Avishkar Bhoopchand, Tor Lattimore, Joel William Veness

IPC: G06N3/063, G06N3/047, G06N3/048, G06N7/01

CPC classification number: G06N3/063, G06N3/047, G06N3/048, G06N7/01

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for a neural network system comprising one or more gated linear networks. A system includes: one or more gated linear networks, wherein each gated linear network corresponds to a respective data value in an output data sample and is configured to generate a network probability output that defines a probability distribution over possible values for the corresponding data value, wherein each gated linear network comprises a plurality of layers, wherein the plurality of layers comprises a plurality of gated linear layers, wherein each gated linear layer has one or more nodes, and wherein each node is configured to: receive a plurality of inputs, receive side information for the node; combine the plurality of inputs according to a set of weights defined by the side information, and generate and output a node probability output for the corresponding data value.
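Illustrative sketch (not part of the patent record): the node behaviour described in the abstract above, where side information selects a set of weights that then combines the inputs into a probability, might look roughly like the following. The abstract does not say how the weights are selected or how the inputs are combined. The half-space gating and the weighted sum in logit space used here are assumptions borrowed from the public gated linear network literature, and the class and parameter names (`GatedLinearNode`, `num_hyperplanes`, `seed`) are hypothetical.

```python
import math
import random


def logit(p):
    """Inverse of the sigmoid; p must lie strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class GatedLinearNode:
    """One node: side information picks a weight vector, which mixes the inputs."""

    def __init__(self, num_inputs, side_dim, num_hyperplanes=2, seed=0):
        rng = random.Random(seed)
        # Assumed gating scheme: fixed random hyperplanes over the side information.
        self.hyperplanes = [
            [rng.gauss(0.0, 1.0) for _ in range(side_dim)]
            for _ in range(num_hyperplanes)
        ]
        self.offsets = [rng.gauss(0.0, 1.0) for _ in range(num_hyperplanes)]
        # One weight vector per gating context, initialised to a uniform average.
        self.weights = [
            [1.0 / num_inputs] * num_inputs for _ in range(2 ** num_hyperplanes)
        ]

    def context(self, side_info):
        """Map side information to the index of the weight vector to use."""
        index = 0
        for k, (plane, offset) in enumerate(zip(self.hyperplanes, self.offsets)):
            if sum(h * z for h, z in zip(plane, side_info)) > offset:
                index |= 1 << k
        return index

    def predict(self, input_probs, side_info):
        """Combine input probabilities using the weights chosen by side_info."""
        w = self.weights[self.context(side_info)]
        return sigmoid(sum(wi * logit(pi) for wi, pi in zip(w, input_probs)))


node = GatedLinearNode(num_inputs=3, side_dim=2)
p = node.predict([0.6, 0.6, 0.6], side_info=[0.1, -0.3])
```

With the uniform initial weights, identical inputs pass through unchanged, so `p` stays at 0.6 up to rounding. Training the selected weight vector is outside the scope of this sketch.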

2.

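Invention patent application
GATED LINEAR CONTEXTUAL BANDITS (status: active)

Publication (announcement) number: US20230079338A1

Publication (announcement) date: 2023-03-16

Application number: US17766854

Application date: 2020-10-08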

Applicant: DeepMind Technologies Limited

Inventor: Eren Sezener, Joel William Veness, Marcus Hutter, Jianan Wang, David Budden

IPC: G06N3/00, G06N3/063

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable storage media, for training a neural network to control a real-world agent interacting with a real-world environment to cause the real-world agent to perform a particular task. One of the methods includes training the neural network to determine first values of the parameters by optimizing a first task-specific objective that measures a performance of the policy neural network in controlling a simulated version of the real-world agent; obtaining real-world data generated from interactions of the real-world agent with the real-world environment; and training the neural network to determine trained values of the parameters from the first values of the parameters by jointly optimizing (i) a self-supervised objective that measures at least a performance of internal representations generated by the neural network on a self-supervised task performed on the real-world data and (ii) a second task-specific objective.

3.

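Invention patent application
GATED LINEAR NETWORKS (status: under examination, published)

Publication (announcement) number: US20200349418A1

Publication (announcement) date: 2020-11-05

Application number: US16759993

Application date: 2018-11-30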

Applicant: DeepMind Technologies Limited

Inventor: Agnieszka Grabska-Barwinska, Peter Toth, Christopher Mattern, Avishkar Bhoopchand, Tor Lattimore, Joel William Veness

IPC: G06N3/063, G06N3/04, G06N7/00

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for a neural network system comprising one or more gated linear networks. A system includes: one or more gated linear networks, wherein each gated linear network corresponds to a respective data value in an output data sample and is configured to generate a network probability output that defines a probability distribution over possible values for the corresponding data value, wherein each gated linear network comprises a plurality of layers, wherein the plurality of layers comprises a plurality of gated linear layers, wherein each gated linear layer has one or more nodes, and wherein each node is configured to: receive a plurality of inputs, receive side information for the node; combine the plurality of inputs according to a set of weights defined by the side information, and generate and output a node probability output for the corresponding data value.

4.

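Granted invention patent
Evaluating reinforcement learning policies (status: active)

Publication (announcement) number: US10445653B1

Publication (announcement) date: 2019-10-15

Application number: US14821549

Application date: 2015-08-07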

Applicant: DeepMind Technologies Limited

Inventor: Joel William Veness, Marc Gendron-Bellemare

IPC: G06N20/00, G06N5/02

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for evaluating reinforcement learning policies. One of the methods includes receiving a plurality of training histories for a reinforcement learning agent; determining a total reward for each training observation in the training histories; partitioning the training observations into a plurality of partitions; determining, for each partition and from the partitioned training observations, a probability that the reinforcement learning agent will receive the total reward for the partition if the reinforcement learning agent performs the action for the partition in response to receiving the current observation; determining, from the probabilities and for each total reward, a respective estimated value of performing each action in response to receiving the current observation; and selecting an action from the pre-determined set of actions from the estimated values in accordance with an action selection policy.
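Illustrative sketch (not part of the patent record): the steps listed in the abstract above (partition the training observations, estimate per-partition probabilities of each total reward, turn those into action values, then select an action) could be approximated as below. The abstract does not name a probability model or a selection policy. The partitioning by (action, total reward), the Laplace-smoothed frequency counts, and the greedy selection are all assumptions made for illustration, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_action_values(samples, current_obs, actions):
    """samples: list of (observation, action, total_reward) triples."""
    # Partition the training observations by (action, total reward).
    partitions = defaultdict(Counter)
    for obs, action, total_reward in samples:
        partitions[(action, total_reward)][obs] += 1

    # Number of distinct observations, used for Laplace smoothing.
    num_obs = len({obs for obs, _, _ in samples} | {current_obs})

    values = {}
    for action in actions:
        # Unnormalised probability of each total reward given (current_obs, action).
        scores = {}
        for (a, total_reward), counts in partitions.items():
            if a != action:
                continue
            size = sum(counts.values())
            likelihood = (counts[current_obs] + 1) / (size + num_obs)
            scores[total_reward] = likelihood * size  # likelihood times prior count
        norm = sum(scores.values())
        # Expected total reward under the normalised probabilities.
        values[action] = (
            sum(r * s for r, s in scores.items()) / norm if norm else 0.0
        )
    return values


def select_action(values):
    """Greedy selection, one possible action selection policy."""
    return max(values, key=values.get)


samples = [
    ("s0", "left", 0.0),
    ("s0", "left", 0.0),
    ("s0", "right", 1.0),
    ("s0", "right", 1.0),
    ("s1", "right", 0.0),
]
values = estimate_action_values(samples, "s0", ["left", "right"])
best = select_action(values)
```

In this toy data, "right" was followed by a total reward of 1.0 whenever it was taken from observation "s0", so it receives the higher estimate and is selected.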

5.

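Published invention application
GATED LINEAR NETWORKS (status: under examination, published)

Publication (announcement) number: US20240202511A1

Publication (announcement) date: 2024-06-20

Application number: US18536127

Application date: 2023-12-11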

Applicant: DeepMind Technologies Limited

Inventor: Agnieszka Grabska-Barwinska, Peter Toth, Christopher Mattern, Avishkar Bhoopchand, Tor Lattimore, Joel William Veness

IPC: G06N3/063, G06N3/047, G06N3/048, G06N7/01

CPC classification number: G06N3/063, G06N3/047, G06N3/048, G06N7/01

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for a neural network system comprising one or more gated linear networks. A system includes: one or more gated linear networks, wherein each gated linear network corresponds to a respective data value in an output data sample and is configured to generate a network probability output that defines a probability distribution over possible values for the corresponding data value, wherein each gated linear network comprises a plurality of layers, wherein the plurality of layers comprises a plurality of gated linear layers, wherein each gated linear layer has one or more nodes, and wherein each node is configured to: receive a plurality of inputs, receive side information for the node; combine the plurality of inputs according to a set of weights defined by the side information, and generate and output a node probability output for the corresponding data value.

6.

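Granted invention patent
Evaluating reinforcement learning policies (status: active)

Publication (announcement) number: US11429898B1

Publication (announcement) date: 2022-08-30

Application number: US16601547

Application date: 2019-10-14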

Applicant: DeepMind Technologies Limited

Inventor: Joel William Veness, Marc Gendron-Bellemare

IPC: G06N20/00, G06N5/02

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for evaluating reinforcement learning policies. One of the methods includes receiving a plurality of training histories for a reinforcement learning agent; determining a total reward for each training observation in the training histories; partitioning the training observations into a plurality of partitions; determining, for each partition and from the partitioned training observations, a probability that the reinforcement learning agent will receive the total reward for the partition if the reinforcement learning agent performs the action for the partition in response to receiving the current observation; determining, from the probabilities and for each total reward, a respective estimated value of performing each action in response to receiving the current observation; and selecting an action from the pre-determined set of actions from the estimated values in accordance with an action selection policy.
