Patent search: ap:("DeepMind Technologies Limited") AND inv:"Gabriel Dulac-Arnold"

1.

Granted invention patent
Selecting actions from large discrete action sets using reinforcement learning (in force)

Publication (announcement) number: US11907837B1

Publication (announcement) date: 2024-02-20

Application number: US17131500

Filing date: 2020-12-22

Applicant: DeepMind Technologies Limited

Inventors: Gabriel Dulac-Arnold, Richard Andrew Evans, Benjamin Kenneth Coppin

IPC: G06N3/08

CPC classification number: G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for selecting actions from large discrete action sets. One of the methods includes receiving a particular observation representing a particular state of an environment; and selecting an action from a discrete set of actions to be performed by an agent interacting with the environment, comprising: processing the particular observation using an actor policy network to generate an ideal point; determining, from the points that represent actions in the set, the k nearest points to the ideal point; for each nearest point of the k nearest points: processing the nearest point and the particular observation using a Q network to generate a respective Q value for the action represented by the nearest point; and selecting the action to be performed by the agent from the k actions represented by the k nearest points based on the Q values.
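
The selection procedure described in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the function names (`k_nearest`, `select_action`), the toy actor and Q functions, the Euclidean distance metric, and the brute-force nearest-neighbour search are all assumptions made for the example. In practice the actor policy network and the Q network would be trained neural networks.

```python
import math
from typing import Callable, List, Sequence

Point = Sequence[float]


def k_nearest(ideal: Point, action_points: Sequence[Point], k: int) -> List[int]:
    """Return the indices of the k action points closest to `ideal`.

    Assumption: Euclidean distance and an exhaustive sort; the abstract
    does not say how the nearest points are found.
    """
    order = sorted(
        range(len(action_points)),
        key=lambda i: math.dist(ideal, action_points[i]),
    )
    return order[:k]


def select_action(
    observation: Point,
    action_points: Sequence[Point],
    actor: Callable[[Point], Point],
    q_network: Callable[[Point, Point], float],
    k: int,
) -> int:
    """Pick the index of the action to perform for `observation`."""
    # 1. The actor maps the observation to an "ideal point" in action space.
    ideal = actor(observation)
    # 2. Restrict attention to the k actions whose points are nearest to it.
    candidates = k_nearest(ideal, action_points, k)
    # 3. Score each candidate with the Q function and take the best one.
    return max(candidates, key=lambda i: q_network(observation, action_points[i]))


# Hypothetical stand-ins for the trained networks.
def toy_actor(obs: Point) -> Point:
    return (obs[0], obs[1])  # identity mapping, for illustration only


def toy_q(obs: Point, point: Point) -> float:
    return point[0] + point[1]  # prefers points with larger coordinates


action_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
observation = (0.9, 0.2)

chosen_k2 = select_action(observation, action_points, toy_actor, toy_q, k=2)
chosen_k4 = select_action(observation, action_points, toy_actor, toy_q, k=4)
```

With `k=2`, only the two points nearest the ideal point are scored, so the far-away point `(5.0, 5.0)` is never considered even though the toy Q function would rate it highest. With `k=4`, every action is scored and that point wins. This shows why `k` trades evaluation cost against how far the final choice can stray from the actor's proposal.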

2.

Invention patent application
OPTIMIZING DATA CENTER CONTROLS USING NEURAL NETWORKS (in force)

Publication (announcement) number: US20210287072A1

Publication (announcement) date: 2021-09-16

Application number: US17331614

Filing date: 2021-05-26

Applicant: DeepMind Technologies Limited

Inventors: Richard Andrew Evans, Jim Gao, Michael C. Ryan, Gabriel Dulac-Arnold, Jonathan Karl Scholz, Todd Andrew Hester

IPC: G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for improving operational efficiency within a data center by modeling data center performance and predicting power usage efficiency. An example method receives a state input characterizing a current state of a data center. For each data center setting slate, the state input and the data center setting slate are processed through an ensemble of machine learning models. Each machine learning model is configured to receive and process the state input and the data center setting slate to generate an efficiency score that characterizes a predicted resource efficiency of the data center if the data center settings defined by the data center setting slate are adopted. The method selects, based on the efficiency scores for the data center setting slates, new values for the data center settings.
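
The slate-scoring loop described in this abstract might look roughly like the sketch below. It is an assumption-laden illustration, not the method claimed in the patent: the abstract does not say how the ensemble's scores are combined, so averaging them is an assumption here, and the field names (`outside_temp_c`, `chiller_setpoint_c`) and the two toy models are hypothetical placeholders for trained machine learning models.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence

State = Dict[str, float]   # current data center state
Slate = Dict[str, float]   # one candidate set of data center settings
Model = Callable[[State, Slate], float]  # returns an efficiency score


def score_slate(state: State, slate: Slate, ensemble: Sequence[Model]) -> List[float]:
    """Run every model in the ensemble on the same (state, slate) pair."""
    return [model(state, slate) for model in ensemble]


def select_slate(state: State, slates: Sequence[Slate], ensemble: Sequence[Model]) -> Slate:
    """Return the slate with the best aggregate efficiency score.

    Assumption: the aggregate is the mean of the ensemble's scores and a
    higher score means better predicted efficiency.
    """
    return max(slates, key=lambda s: mean(score_slate(state, s, ensemble)))


# Hypothetical models: each prefers a slightly different chiller setpoint.
def model_a(state: State, slate: Slate) -> float:
    return -abs(slate["chiller_setpoint_c"] - 8.0)


def model_b(state: State, slate: Slate) -> float:
    return -abs(slate["chiller_setpoint_c"] - 9.0)


state = {"outside_temp_c": 20.0}
slates = [
    {"chiller_setpoint_c": 6.0},
    {"chiller_setpoint_c": 8.0},
    {"chiller_setpoint_c": 10.0},
]

best = select_slate(state, slates, [model_a, model_b])
```

Here the mean scores are -2.5, -0.5, and -1.5 for the three slates, so the 8.0 setpoint is selected. An ensemble also makes it possible to use the spread of the individual scores as a rough uncertainty signal, although this sketch only uses their mean.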

3.

Invention patent application
OPTIMIZING DATA CENTER CONTROLS USING NEURAL NETWORKS (under examination, published)

Publication (announcement) number: US20200272889A1

Publication (announcement) date: 2020-08-27

Application number: US16863357

Filing date: 2020-04-30

Applicant: DeepMind Technologies Limited

Inventors: Richard Andrew Evans, Jim Gao, Michael C. Ryan, Gabriel Dulac-Arnold, Jonathan Karl Scholz, Todd Andrew Hester

IPC: G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for improving operational efficiency within a data center by modeling data center performance and predicting power usage efficiency. An example method receives a state input characterizing a current state of a data center. For each data center setting slate, the state input and the data center setting slate are processed through an ensemble of machine learning models. Each machine learning model is configured to receive and process the state input and the data center setting slate to generate an efficiency score that characterizes a predicted resource efficiency of the data center if the data center settings defined by the data center setting slate are adopted. The method selects, based on the efficiency scores for the data center setting slates, new values for the data center settings.

4.

Granted invention patent
Optimizing data center controls using neural networks (in force)

Publication (announcement) number: US11836599B2

Publication (announcement) date: 2023-12-05

Application number: US17331614

Filing date: 2021-05-26

Applicant: DeepMind Technologies Limited

Inventors: Richard Andrew Evans, Jim Gao, Michael C. Ryan, Gabriel Dulac-Arnold, Jonathan Karl Scholz, Todd Andrew Hester

IPC: G06N3/04, G06N3/045

CPC classification number: G06N3/045

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for improving operational efficiency within a data center by modeling data center performance and predicting power usage efficiency. An example method receives a state input characterizing a current state of a data center. For each data center setting slate, the state input and the data center setting slate are processed through an ensemble of machine learning models. Each machine learning model is configured to receive and process the state input and the data center setting slate to generate an efficiency score that characterizes a predicted resource efficiency of the data center if the data center settings defined by the data center setting slate are adopted. The method selects, based on the efficiency scores for the data center setting slates, new values for the data center settings.

5.

Granted invention patent
Selecting actions from large discrete action sets using reinforcement learning (in force)

Publication (announcement) number: US10885432B1

Publication (announcement) date: 2021-01-05

Application number: US15382383

Filing date: 2016-12-16

Applicant: DeepMind Technologies Limited

Inventors: Gabriel Dulac-Arnold, Richard Andrew Evans, Benjamin Kenneth Coppin

IPC: G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for selecting actions from large discrete action sets. One of the methods includes receiving a particular observation representing a particular state of an environment; and selecting an action from a discrete set of actions to be performed by an agent interacting with the environment, comprising: processing the particular observation using an actor policy network to generate an ideal point; determining, from the points that represent actions in the set, the k nearest points to the ideal point; for each nearest point of the k nearest points: processing the nearest point and the particular observation using a Q network to generate a respective Q value for the action represented by the nearest point; and selecting the action to be performed by the agent from the k actions represented by the k nearest points based on the Q values.
