-
Publication No.: US20200250575A1
Publication date: 2020-08-06
Application No.: US16288279
Filing date: 2019-02-28
Applicant: Google LLC
Inventor: Tze Way Eugene Ie, Sanmit Santosh Narvekar, Craig Edgar Boutilier
Abstract: A computing system for simulating allocation of resources to a plurality of entities is disclosed. The computing system can be configured to input an entity profile that describes a preference and/or demand of a simulated entity into a reinforcement learning agent model and receive, as an output of the reinforcement learning agent model, an allocation output that describes a resource allocation for the simulated entity. The computing system can select one or more resources based on the resource allocation described by the allocation output and provide the resource(s) to an entity model that is configured to simulate a simulated response output that describes a response of the simulated entity. The computing system can receive, as an output of the entity model, the simulated response output and update a resource profile that describes the at least one resource and/or the entity profile based on the simulated response output.
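The loop described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `EntityProfile`, `Resource`, `agent_model`, `select_resources`, `entity_model`, and `simulate_step` are hypothetical, the "agent" is a fixed normalization rule rather than a trained reinforcement learning model, and the acceptance rule and update step size are assumptions.

```python
import random
from dataclasses import dataclass, field


@dataclass
class EntityProfile:
    # Hypothetical representation: topic -> preference weight in [0, 1].
    preference: dict


@dataclass
class Resource:
    name: str
    topic: str
    # Hypothetical resource profile: simple exposure/acceptance counters.
    stats: dict = field(default_factory=lambda: {"shown": 0, "accepted": 0})


def agent_model(profile):
    """Stand-in for the RL agent: maps an entity profile to an allocation
    output (topic -> share). A real agent would be a learned policy."""
    total = sum(profile.preference.values())
    return {topic: w / total for topic, w in profile.preference.items()}


def select_resources(allocation, resources, k=1):
    """Pick up to k resources from the topic with the largest share."""
    top_topic = max(allocation, key=allocation.get)
    return [r for r in resources if r.topic == top_topic][:k]


def entity_model(profile, resource, rng):
    """Simulated response: accept with probability equal to the entity's
    preference weight for the resource's topic (an assumed rule)."""
    return rng.random() < profile.preference.get(resource.topic, 0.0)


def simulate_step(profile, resources, rng, lr=0.1):
    """One pass: profile -> allocation -> resources -> response -> updates."""
    allocation = agent_model(profile)
    chosen = select_resources(allocation, resources)
    for r in chosen:
        accepted = entity_model(profile, r, rng)
        # Update the resource profile from the simulated response.
        r.stats["shown"] += 1
        r.stats["accepted"] += int(accepted)
        # Update the entity profile: nudge the weight toward the response.
        w = profile.preference[r.topic]
        target = 1.0 if accepted else 0.0
        profile.preference[r.topic] = w + lr * (target - w)
    return allocation, chosen
```

Calling `simulate_step` repeatedly with a seeded `random.Random` gives a reproducible trajectory of profile updates, which is the kind of rollout a learning agent could be trained on.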
-
Publication No.: US20230117499A1
Publication date: 2023-04-20
Application No.: US17967595
Filing date: 2022-10-17
Applicant: Google LLC
Inventor: Tze Way Eugene Ie, Sanmit Santosh Narvekar, Craig Edgar Boutilier
Abstract: A computing system for simulating allocation of resources to a plurality of entities is disclosed. The computing system can be configured to input an entity profile that describes a preference and/or demand of a simulated entity into a reinforcement learning agent model and receive, as an output of the reinforcement learning agent model, an allocation output that describes a resource allocation for the simulated entity. The computing system can select one or more resources based on the resource allocation described by the allocation output and provide the resource(s) to an entity model that is configured to simulate a simulated response output that describes a response of the simulated entity. The computing system can receive, as an output of the entity model, the simulated response output and update a resource profile that describes the at least one resource and/or the entity profile based on the simulated response output.
-
Publication No.: US20250111157A1
Publication date: 2025-04-03
Application No.: US18900500
Filing date: 2024-09-27
Applicant: Google LLC
Inventor: Guy Tennenholtz, Yinlam Chow, Chih-wei Hsu, Jihwan Jeong, Lior Shani, Deepak Ramachandran, Martin Mirolyubov Mladenov, Craig Edgar Boutilier
IPC: G06F40/284, G06F40/40, G06N3/0455
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for analyzing embedding spaces using large language models. In one aspect, a method performed by one or more computers for analyzing a target embedding space using a neural network configured to perform a set of machine learning tasks is described. The method includes: obtaining, for each of one or more entities, a respective domain embedding representing the entity in the target embedding space; receiving a text prompt including a sequence of input tokens describing a particular machine learning task in the set to be performed on the one or more entities; preparing, for the neural network, an input sequence including each input token in the text prompt and each domain embedding; and processing the input sequence, using the neural network, to generate a sequence of output tokens describing a result of the particular machine learning task.
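The "preparing an input sequence" step in this abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration: the abstract does not say how tokens and domain embeddings are combined, so this version looks up a vector for each prompt token in a hypothetical `token_table`, maps each domain embedding into the same dimension with a hypothetical linear `adapter`, and appends the projected embeddings after the tokens.

```python
def project(vec, weights):
    """Apply a linear map; `weights` has one row per output dimension."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def prepare_input_sequence(prompt_tokens, token_table, domain_embeddings, adapter):
    """Build one sequence of model-dimension vectors that contains every
    prompt token followed by every (projected) domain embedding."""
    sequence = [token_table[token] for token in prompt_tokens]
    sequence += [project(emb, adapter) for emb in domain_embeddings]
    return sequence


# Toy usage: a 2-dimensional model space and one 3-dimensional domain embedding.
token_table = {"summarize": [1.0, 0.0], "item": [0.0, 1.0]}
adapter = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 1.0]]
sequence = prepare_input_sequence(
    ["summarize", "item"], token_table, [[1.0, 2.0, 3.0]], adapter
)
```

The resulting `sequence` would then be passed to the neural network, which generates output tokens describing the task result; that generation step is outside this sketch.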
-
Publication No.: US11475355B2
Publication date: 2022-10-18
Application No.: US16288279
Filing date: 2019-02-28
Applicant: Google LLC
Inventor: Tze Way Eugene Ie, Sanmit Santosh Narvekar, Craig Edgar Boutilier
Abstract: A computing system for simulating allocation of resources to a plurality of entities is disclosed. The computing system can be configured to input an entity profile that describes a preference and/or demand of a simulated entity into a reinforcement learning agent model and receive, as an output of the reinforcement learning agent model, an allocation output that describes a resource allocation for the simulated entity. The computing system can select one or more resources based on the resource allocation described by the allocation output and provide the resource(s) to an entity model that is configured to simulate a simulated response output that describes a response of the simulated entity. The computing system can receive, as an output of the entity model, the simulated response output and update a resource profile that describes the at least one resource and/or the entity profile based on the simulated response output.
-
Publication No.: US20210383218A1
Publication date: 2021-12-09
Application No.: US17289514
Filing date: 2019-10-29
Applicant: Google LLC
Inventor: Tian Lu, Dale Eric Schuurmans, Craig Edgar Boutilier
IPC: G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining a control policy for an agent interacting with an environment. One of the methods includes updating the control policy through Q-learning with policy-consistent backups. The system determines a policy-consistent backup for the control policy at the current observation-current action pair by: for each of a plurality of actions in a set of possible actions that can be performed by the agent, identifying Q values that the control policy assigns to next observation-action pairs and that are justified by at least one of the information sets; pruning, from the identified Q values, any Q values that are justified only by information sets that are not policy-class consistent; and determining, from the reward and only the identified Q values that were not pruned, the policy-consistent backup.
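The prune-then-backup step can be sketched as below. This is a toy reading of the abstract, not the claimed method: the abstract does not specify how the surviving Q values are combined with the reward, so the discounted maximum, the `gamma` parameter, and the fallback to the bare reward when nothing survives are all assumptions. The `is_consistent` check is also a hypothetical simplification in which an information set is a set of (observation, action) commitments and counts as consistent when no observation is committed to two different actions.

```python
def is_consistent(info_set):
    """Toy policy-class consistency check: an information set is rejected
    if it commits the same observation to two different actions."""
    chosen = {}
    for obs, action in info_set:
        if chosen.setdefault(obs, action) != action:
            return False
    return True


def policy_consistent_backup(reward, candidates, consistent=is_consistent, gamma=0.99):
    """candidates: list of (q_value, information_sets) pairs, one per next
    observation-action pair. A Q value is pruned when every information set
    that justifies it is inconsistent."""
    kept = [q for q, info_sets in candidates
            if any(consistent(s) for s in info_sets)]
    if not kept:
        # Assumption: with no surviving Q values, back up the reward alone.
        return reward
    # Assumption: standard Q-learning target over the surviving values.
    return reward + gamma * max(kept)


# Toy usage: the larger Q value is justified only by an inconsistent set.
candidates = [
    (5.0, [frozenset({("s1", "a"), ("s1", "b")})]),  # pruned
    (3.0, [frozenset({("s1", "a")})]),               # kept
]
backup = policy_consistent_backup(1.0, candidates, gamma=0.5)
```

In this example the unconstrained maximum would use the Q value 5.0, but the pruning step removes it, so the backup is computed from 3.0 instead.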