Patent search ap:("DeepMind Technologies Limited") AND inv:"Daniel Joseph Strouse" Page 1

1.

Invention publication
REINFORCEMENT LEARNING USING AN ENSEMBLE OF DISCRIMINATOR MODELS Under examination (published)

Publication (announcement) number: US20240311639A1

Publication (announcement) date: 2024-09-19

Application number: US18281711

Application date: 2022-05-27

Applicant: DeepMind Technologies Limited

Inventor: Steven Stenberg Hansen, Daniel Joseph Strouse

IPC: G06N3/092, G06N3/045

CPC classification number: G06N3/092, G06N3/045

Abstract: This specification describes a method performed by one or more data processing apparatus that includes: sampling a latent from a set of possible latents, selecting actions to be performed by an agent to interact with an environment over a sequence of time steps using an action selection neural network that is conditioned on the sampled latent, determining a respective reward received for each time step in the sequence of time steps using an ensemble of discriminator models, and training the action selection neural network based on the rewards using a reinforcement learning technique. Each discriminator model can process an observation to generate a respective prediction output that predicts which latent the action selection neural network was conditioned on to cause the environment to enter the state characterized by the observation.
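Illustrative sketch (not taken from the patent text): the abstract says the per-step reward is determined "using an ensemble of discriminator models" but does not give a formula. The Python below assumes one plausible form: a discriminability term from the ensemble-averaged prediction under a uniform prior over latents, plus a disagreement bonus (entropy of the mean prediction minus the mean of the member entropies). The function names, the bonus_weight parameter, and the random linear "discriminators" are all hypothetical stand-ins.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def entropy(probs):
    """Shannon entropy (in nats) over the last axis."""
    return -(probs * np.log(probs + 1e-12)).sum(axis=-1)


def ensemble_reward(member_logits, latent, bonus_weight=1.0):
    """Reward for one observation (assumed form, see lead-in).

    member_logits: array of shape (num_members, num_latents); row m holds
    discriminator m's logits over which latent produced the observation.
    latent: index of the latent the policy was actually conditioned on.
    """
    probs = softmax(member_logits)          # each member's prediction
    mean_probs = probs.mean(axis=0)         # ensemble-averaged prediction
    num_latents = probs.shape[1]
    # Discriminability: log q(z | o) - log p(z), with p(z) uniform (assumption).
    skill_term = np.log(mean_probs[latent] + 1e-12) + np.log(num_latents)
    # Disagreement: entropy of the mean minus mean of entropies (non-negative).
    disagreement = entropy(mean_probs) - entropy(probs).mean()
    return float(skill_term + bonus_weight * disagreement)


# Usage with stand-in linear discriminators on a random observation.
rng = np.random.default_rng(0)
num_members, num_latents, obs_dim = 5, 4, 3
latent = int(rng.integers(num_latents))             # sample a latent
weights = rng.normal(size=(num_members, obs_dim, num_latents))
observation = rng.normal(size=obs_dim)
member_logits = np.einsum("d,mdn->mn", observation, weights)
reward = ensemble_reward(member_logits, latent)
```

In a full training loop, this reward would be fed to whatever reinforcement learning technique trains the latent-conditioned action selection network; that part is omitted here.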

2.

Invention grant
Neural network architecture for efficient resource allocation In force

Publication (announcement) number: US11250475B2

Publication (announcement) date: 2022-02-15

Application number: US16918805

Application date: 2020-07-01

Applicant: DeepMind Technologies Limited

Inventor: Andrea Tacchetti, Daniel Joseph Strouse, Marta Garnelo Abellanas, Thore Kurt Hartwig Graepel, Yoram Bachrach

IPC: G06Q30/02, G06N3/04, G06N3/02, G10L25/30, G06K9/28

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for efficiently allocating resources among participants. Methods can include receiving valuation data specifying, for each of a plurality of entities, a respective valuation for each of a plurality of resource subsets, each resource subset comprising a different combination of one or more resources of a plurality of resources. After receiving valuation data, assigning each resource in the plurality of resources to a respective entity of the plurality of entities based on the valuations and generating, for each particular entity, a respective input representation that is derived from valuations of every other entity in the plurality of entities other than the particular entity. The input representation for each particular entity is processed using a neural network to generate a rule for the particular entity and a payment based on the rule output for the entities.
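Illustrative sketch (not taken from the patent text): the two data-preparation steps named in the abstract can be pictured as follows. The Python below assumes a brute-force, welfare-maximizing assignment as the meaning of "based on the valuations", and builds each entity's input representation as the list of every other entity's valuation vectors. The neural network that produces the rule and the payment is not modeled. All function names and the toy valuations are hypothetical.

```python
from itertools import combinations, product


def all_subsets(resources):
    """Every non-empty subset of resources, smallest subsets first."""
    return [frozenset(combo)
            for size in range(1, len(resources) + 1)
            for combo in combinations(resources, size)]


def best_assignment(valuations, resources, entities):
    """Brute-force search for the assignment with the highest total valuation.

    valuations[e] maps a frozenset of resources to entity e's value for it;
    missing bundles (including the empty bundle) count as 0.
    """
    best_welfare, best = float("-inf"), None
    for owners in product(entities, repeat=len(resources)):
        welfare = 0.0
        for entity in entities:
            bundle = frozenset(r for r, o in zip(resources, owners) if o == entity)
            welfare += valuations[entity].get(bundle, 0.0)
        if welfare > best_welfare:
            best_welfare, best = welfare, dict(zip(resources, owners))
    return best, best_welfare


def leave_one_out_input(valuations, resources, entity):
    """Valuation vectors of every entity except `entity` (assumed encoding)."""
    subsets = all_subsets(resources)
    return [[valuations[other].get(s, 0.0) for s in subsets]
            for other in sorted(valuations) if other != entity]


# Toy example: two entities, two resources.
resources = ["a", "b"]
entities = ["x", "y"]
valuations = {
    "x": {frozenset("a"): 3.0, frozenset("b"): 1.0, frozenset("ab"): 5.0},
    "y": {frozenset("a"): 2.0, frozenset("b"): 4.0, frozenset("ab"): 5.0},
}
assignment, welfare = best_assignment(valuations, resources, entities)
input_for_x = leave_one_out_input(valuations, resources, "x")
```

Because the representation for an entity excludes that entity's own valuations, anything computed from it cannot be influenced by what the entity itself reports; the brute-force search is only practical for very small examples.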

3.

Invention application
NEURAL NETWORK ARCHITECTURE FOR EFFICIENT RESOURCE ALLOCATION In force

Publication (announcement) number: US20220005079A1

Publication (announcement) date: 2022-01-06

Application number: US16918805

Application date: 2020-07-01

Applicant: DeepMind Technologies Limited

Inventor: Andrea Tacchetti, Daniel Joseph Strouse, Marta Garnelo Abellanas, Thore Kurt Hartwig Graepel, Yoram Bachrach

IPC: G06Q30/02, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for efficiently allocating resources among participants. Methods can include receiving valuation data specifying, for each of a plurality of entities, a respective valuation for each of a plurality of resource subsets, each resource subset comprising a different combination of one or more resources of a plurality of resources. After receiving valuation data, assigning each resource in the plurality of resources to a respective entity of the plurality of entities based on the valuations and generating, for each particular entity, a respective input representation that is derived from valuations of every other entity in the plurality of entities other than the particular entity. The input representation for each particular entity is processed using a neural network to generate a rule for the particular entity and a payment based on the rule output for the entities.
