Selecting actions from large discrete action sets using reinforcement learning

    公开(公告)号:US11907837B1

    公开(公告)日:2024-02-20

    申请号:US17131500

    申请日:2020-12-22

    CPC classification number: G06N3/08

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for selecting actions from large discrete action sets. One of the methods includes receiving a particular observation representing a particular state of an environment; and selecting an action from a discrete set of actions to be performed by an agent interacting with the environment, comprising: processing the particular observation using an actor policy network to generate an ideal point; determining, from the points that represent actions in the set, the k nearest points to the ideal point; for each nearest point of the k nearest points: processing the nearest point and the particular observation using a Q network to generate a respective Q value for the action represented by the nearest point; and selecting the action to be performed by the agent from the k actions represented by the k nearest points based on the Q values.

    OPTIMIZING DATA CENTER CONTROLS USING NEURAL NETWORKS

    公开(公告)号:US20210287072A1

    公开(公告)日:2021-09-16

    申请号:US17331614

    申请日:2021-05-26

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for improving operational efficiency within a data center by modeling data center performance and predicting power usage efficiency. An example method receives a state input characterizing a current state of a data center. For each data center setting slate, the state input and the data center setting slate are processed through an ensemble of machine learning models. Each machine learning model is configured to receive and process the state input and the data center setting slate to generate an efficiency score that characterizes a predicted resource efficiency of the data center if the data center settings defined by the data center setting slate are adopted t. The method selects, based on the efficiency scores for the data center setting slates, new values for the data center settings.

    OPTIMIZING DATA CENTER CONTROLS USING NEURAL NETWORKS

    公开(公告)号:US20200272889A1

    公开(公告)日:2020-08-27

    申请号:US16863357

    申请日:2020-04-30

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for improving operational efficiency within a data center by modeling data center performance and predicting power usage efficiency. An example method receives a state input characterizing a current state of a data center. For each data center setting slate, the state input and the data center setting slate are processed through an ensemble of machine learning models. Each machine learning model is configured to receive and process the state input and the data center setting slate to generate an efficiency score that characterizes a predicted resource efficiency of the data center if the data center settings defined by the data center setting slate are adopted t. The method selects, based on the efficiency scores for the data center setting slates, new values for the data center settings.

    Optimizing data center controls using neural networks

    公开(公告)号:US11836599B2

    公开(公告)日:2023-12-05

    申请号:US17331614

    申请日:2021-05-26

    CPC classification number: G06N3/045

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for improving operational efficiency within a data center by modeling data center performance and predicting power usage efficiency. An example method receives a state input characterizing a current state of a data center. For each data center setting slate, the state input and the data center setting slate are processed through an ensemble of machine learning models. Each machine learning model is configured to receive and process the state input and the data center setting slate to generate an efficiency score that characterizes a predicted resource efficiency of the data center if the data center settings defined by the data center setting slate are adopted t. The method selects, based on the efficiency scores for the data center setting slates, new values for the data center settings.

    Selecting actions from large discrete action sets using reinforcement learning

    公开(公告)号:US10885432B1

    公开(公告)日:2021-01-05

    申请号:US15382383

    申请日:2016-12-16

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for selecting actions from large discrete action sets. One of the methods includes receiving a particular observation representing a particular state of an environment; and selecting an action from a discrete set of actions to be performed by an agent interacting with the environment, comprising: processing the particular observation using an actor policy network to generate an ideal point; determining, from the points that represent actions in the set, the k nearest points to the ideal point; for each nearest point of the k nearest points: processing the nearest point and the particular observation using a Q network to generate a respective Q value for the action represented by the nearest point; and selecting the action to be performed by the agent from the k actions represented by the k nearest points based on the Q values.

Patent Agency Ranking