Systems and Methods for Simulating a Complex Reinforcement Learning Environment

    公开(公告)号:US20200250575A1

    公开(公告)日:2020-08-06

    申请号:US16288279

    申请日:2019-02-28

    Applicant: Google LLC

    Abstract: A computing system for simulating allocation of resources to a plurality of entities is disclosed. The computing system can be configured to input an entity profile that describes a preference and/or demand of a simulated entity into a reinforcement learning agent model and receive, as an output of the reinforcement learning agent model, an allocation output that describes a resource allocation for the simulated entity. The computing system can select one or more resources based on the resource allocation described by the allocation output and provide the resource(s) to an entity model that is configured to simulate a simulated response output that describes a response of the simulated entity. The computing system can receive, as an output of the entity model, the simulated response output and update a resource profile that describes the at least one resource and/or the entity profile based on the simulated response output.

    Systems and Methods for Simulating a Complex Reinforcement Learning Environment

    公开(公告)号:US20230117499A1

    公开(公告)日:2023-04-20

    申请号:US17967595

    申请日:2022-10-17

    Applicant: Google LLC

    Abstract: A computing system for simulating allocation of resources to a plurality of entities is disclosed. The computing system can be configured to input an entity profile that describes a preference and/or demand of a simulated entity into a reinforcement learning agent model and receive, as an output of the reinforcement learning agent model, an allocation output that describes a resource allocation for the simulated entity. The computing system can select one or more resources based on the resource allocation described by the allocation output and provide the resource(s) to an entity model that is configured to simulate a simulated response output that describes a response of the simulated entity. The computing system can receive, as an output of the entity model, the simulated response output and update a resource profile that describes the at least one resource and/or the entity profile based on the simulated response output.

    ANALYZING EMBEDDING SPACES USING LARGE LANGUAGE MODELS

    公开(公告)号:US20250111157A1

    公开(公告)日:2025-04-03

    申请号:US18900500

    申请日:2024-09-27

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for analyzing embedding spaces using large language models. In one aspect, a method performed by one or more computers for analyzing a target embedding space using a neural network configured to perform a set of machine learning tasks is described. The method includes: obtaining, for each of one or more entities, a respective domain embedding representing the entity in the target embedding space; receiving a text prompt including a sequence of input tokens describing a particular machine learning task in the set to be performed on the one or more entities; preparing, for the neural network, an input sequence including each input token in the text prompt and each domain embedding; and processing the input sequence, using the neural network, to generate a sequence of output tokens describing a result of the particular machine learning task.

    Systems and methods for simulating a complex reinforcement learning environment

    公开(公告)号:US11475355B2

    公开(公告)日:2022-10-18

    申请号:US16288279

    申请日:2019-02-28

    Applicant: Google LLC

    Abstract: A computing system for simulating allocation of resources to a plurality of entities is disclosed. The computing system can be configured to input an entity profile that describes a preference and/or demand of a simulated entity into a reinforcement learning agent model and receive, as an output of the reinforcement learning agent model, an allocation output that describes a resource allocation for the simulated entity. The computing system can select one or more resources based on the resource allocation described by the allocation output and provide the resource(s) to an entity model that is configured to simulate a simulated response output that describes a response of the simulated entity. The computing system can receive, as an output of the entity model, the simulated response output and update a resource profile that describes the at least one resource and/or the entity profile based on the simulated response output.

    DETERMINING CONTROL POLICIES BY MINIMIZING THE IMPACT OF DELUSION

    公开(公告)号:US20210383218A1

    公开(公告)日:2021-12-09

    申请号:US17289514

    申请日:2019-10-29

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining a control policy for an agent interacting with an environment. One of the methods includes updating the control policy using policy-consistent backups using Q learning. To determine a policy-consistent backup, the system determining a policy-consistent backup for the control policy at the current observation—current action pair, comprising: for each of a plurality of actions in a set of possible actions that can be performed by the agent, identifying Q values assigned by the control policy to next observation—action pairs by the control policy and justified by at least one of the information sets; pruning, from the identified Q values, any Q values that are justified only by information sets that are not policy-class consistent; and determining, from the reward and only the identified Q values that were not pruned, the policy-consistent backup.

Patent Agency Ranking