Publication No.: US20230367996A1
Publication Date: 2023-11-16
Application No.: US18044852
Filing Date: 2021-09-23
Applicant: Google LLC
Inventor: Anurag Ajay, Ofir Nachum, Aviral Kumar, Sergey Levine
IPC: G06N3/0455, G06N3/092
CPC classification number: G06N3/0455, G06N3/092
Abstract: A method includes determining a first state associated with a particular task, and determining, by a task policy model, a latent space representation of the first state. The task policy model may have been trained to define, for each respective state of a plurality of possible states associated with the particular task, a corresponding latent space representation of the respective state. The method also includes determining, by a primitive policy model and based on the first state and the latent space representation of the first state, an action to take as part of the particular task. The primitive policy model may have been trained to define a space of primitive policies for the plurality of possible states associated with the particular task and a plurality of possible latent space representations. The method further includes executing the action to reach a second state associated with the particular task.
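The control flow described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the class names `TaskPolicy` and `PrimitivePolicy`, the vector dimensions, the random weights standing in for trained models, and the toy `step_environment` dynamics are all hypothetical, since the abstract does not specify architectures, training procedures, or an environment.

```python
import math
import random

# Hypothetical sizes; the abstract does not specify any dimensions.
STATE_DIM = 3
LATENT_DIM = 2
ACTION_DIM = 3


def make_linear(in_dim, out_dim, rng):
    """Random weight matrix (list of rows) standing in for a trained model."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_linear(weights, x):
    """Apply the weight matrix to x, then squash each output with tanh."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


class TaskPolicy:
    """Maps a state to a latent space representation of that state."""

    def __init__(self, rng):
        self.weights = make_linear(STATE_DIM, LATENT_DIM, rng)

    def latent(self, state):
        return apply_linear(self.weights, state)


class PrimitivePolicy:
    """Maps a (state, latent) pair to an action."""

    def __init__(self, rng):
        self.weights = make_linear(STATE_DIM + LATENT_DIM, ACTION_DIM, rng)

    def action(self, state, latent):
        # Concatenate the state and latent vectors as the model input.
        return apply_linear(self.weights, state + latent)


def step_environment(state, action):
    """Toy dynamics (an assumption): nudge the state by a scaled action."""
    return [s + 0.1 * a for s, a in zip(state, action)]


rng = random.Random(0)
task_policy = TaskPolicy(rng)
primitive_policy = PrimitivePolicy(rng)

# 1. Determine a first state associated with the task.
first_state = [0.5, -0.2, 0.1]
# 2. The task policy model determines a latent representation of that state.
latent = task_policy.latent(first_state)
# 3. The primitive policy model picks an action from the state and the latent.
action = primitive_policy.action(first_state, latent)
# 4. Executing the action reaches a second state.
second_state = step_environment(first_state, action)
```

In this sketch the task policy operates only on the state, while the primitive policy is conditioned on both the state and the latent, which mirrors the two-model split the abstract describes; a real system would replace the random weights with trained models and the toy dynamics with the actual task environment.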