HIERARCHICAL POLICIES FOR MULTITASK TRANSFER

    Publication No.: US20220237488A1

    Publication date: 2022-07-28

    Application No.: US17613687

    Application date: 2020-05-22

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for controlling an agent. One of the methods includes obtaining an observation characterizing a current state of the environment and data identifying a task currently being performed by the agent; processing the observation and the data identifying the task using a high-level controller to generate a high-level probability distribution that assigns a respective probability to each of a plurality of low-level controllers; processing the observation using each of the plurality of low-level controllers to generate, for each of the plurality of low-level controllers, a respective low-level probability distribution; generating a combined probability distribution; and selecting, using the combined probability distribution, an action from the space of possible actions to be performed by the agent in response to the observation.
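
The abstract does not say how the combined probability distribution is formed. The sketch below assumes one plausible reading: a mixture in which each low-level distribution is weighted by the probability the high-level controller assigns to that low-level controller. It also assumes a discrete action space and uses fixed toy distributions in place of the controllers' actual outputs. The names `combined_distribution` and `select_action` are hypothetical and do not come from the application.

```python
import math
import random


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combined_distribution(high_level_probs, low_level_dists):
    """Assumed mixture rule: p(a) = sum_k w_k * p_k(a).

    high_level_probs: one weight per low-level controller.
    low_level_dists: one distribution over actions per low-level controller.
    """
    num_actions = len(low_level_dists[0])
    return [
        sum(w * dist[a] for w, dist in zip(high_level_probs, low_level_dists))
        for a in range(num_actions)
    ]


def select_action(combined, rng):
    """Sample an action index from the combined distribution."""
    return rng.choices(range(len(combined)), weights=combined, k=1)[0]


# Toy setup (hypothetical values): 2 low-level controllers, 3 discrete actions.
# In the described method the high-level weights would depend on the
# observation and the task, and the low-level distributions on the observation.
high = softmax([0.0, 0.0])
low = [
    [1.0, 0.0, 0.0],
    [0.0, 0.5, 0.5],
]
combined = combined_distribution(high, low)
action = select_action(combined, random.Random(0))
```

Under this assumed mixture rule, only the high-level weights depend on the task, so the low-level controllers can be reused unchanged across tasks.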

    TRAINING ACTION SELECTION NEURAL NETWORKS
    46.
    Invention application

    Publication No.: US20190258918A1

    Publication date: 2019-08-22

    Application No.: US16402687

    Application date: 2019-05-03

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network. One of the methods includes maintaining a replay memory that stores trajectories generated as a result of interaction of an agent with an environment; and training an action selection neural network having policy parameters on the trajectories in the replay memory, wherein training the action selection neural network comprises: sampling a trajectory from the replay memory; and adjusting current values of the policy parameters by training the action selection neural network on the trajectory using an off-policy actor critic reinforcement learning technique.
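
The abstract names only "an off-policy actor critic reinforcement learning technique" without specifying it. The sketch below illustrates the overall loop (store trajectories, sample one, update the policy parameters) using a deliberately simple stand-in: a tabular softmax policy, a one-step TD critic, and truncated importance weights. These choices, the stored behavior probability, and the names `ReplayMemory` and `train_on_trajectory` are all assumptions made for illustration, not the method claimed in the application.

```python
import math
import random
from collections import deque


class ReplayMemory:
    """Fixed-capacity store of whole trajectories (oldest dropped first)."""

    def __init__(self, capacity):
        self._trajectories = deque(maxlen=capacity)

    def add(self, trajectory):
        self._trajectories.append(trajectory)

    def sample(self, rng):
        return rng.choice(list(self._trajectories))

    def __len__(self):
        return len(self._trajectories)


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_on_trajectory(logits, values, trajectory,
                        lr=0.1, gamma=0.99, max_ratio=1.0):
    """Assumed update: importance-weighted one-step actor-critic.

    trajectory: list of (state, action, reward, next_state, behavior_prob);
                next_state is None at the end of the episode.
    logits: dict mapping state -> list of action logits (policy parameters).
    values: dict mapping state -> estimated value (critic).
    """
    for state, action, reward, next_state, behavior_prob in trajectory:
        probs = softmax(logits[state])
        # Truncated importance weight corrects for the behavior policy.
        rho = min(max_ratio, probs[action] / behavior_prob)
        bootstrap = 0.0 if next_state is None else values[next_state]
        td_error = reward + gamma * bootstrap - values[state]
        values[state] += lr * rho * td_error
        for a in range(len(probs)):
            # Gradient of log softmax probability w.r.t. each logit.
            grad_log = (1.0 if a == action else 0.0) - probs[a]
            logits[state][a] += lr * rho * td_error * grad_log


# Toy problem (hypothetical): one state, two actions, action 1 pays reward 1.
# Both trajectories were generated by a uniform behavior policy (prob 0.5).
rng = random.Random(0)
memory = ReplayMemory(capacity=100)
memory.add([("s", 0, 0.0, None, 0.5)])
memory.add([("s", 1, 1.0, None, 0.5)])

logits = {"s": [0.0, 0.0]}
values = {"s": 0.0}
for _ in range(200):
    train_on_trajectory(logits, values, memory.sample(rng))
```

Because the stored trajectories come from an older behavior policy, the importance weight is what makes this update off-policy; truncating it at `max_ratio` is a common way to limit variance and is an assumption here.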
