Controlling agents over long time scales using temporal value transport

    公开(公告)号:US10789511B2

    公开(公告)日:2020-09-29

    申请号:US16601324

    申请日:2019-10-14

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network system used to control an agent interacting with an environment to perform a specified task. One of the methods includes causing the agent to perform a task episode in which the agent attempts to perform the specified task; for each of one or more particular time steps in the sequence: generating a modified reward for the particular time step from (i) the actual reward at the time step and (ii) value predictions at one or more time steps that are more than a threshold number of time steps after the particular time step in the sequence; and training, through reinforcement learning, the neural network system using at least the modified rewards for the particular time steps.

    CONTROLLING AGENTS OVER LONG TIME SCALES USING TEMPORAL VALUE TRANSPORT

    公开(公告)号:US20210081723A1

    公开(公告)日:2021-03-18

    申请号:US17035546

    申请日:2020-09-28

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network system used to control an agent interacting with an environment to perform a specified task. One of the methods includes causing the agent to perform a task episode in which the agent attempts to perform the specified task; for each of one or more particular time steps in the sequence: generating a modified reward for the particular time step from (i) the actual reward at the time step and (ii) value predictions at one or more time steps that are more than a threshold number of time steps after the particular time step in the sequence; and training, through reinforcement learning, the neural network system using at least the modified rewards for the particular time steps.

    CONTROLLING AGENTS OVER LONG TIME SCALES USING TEMPORAL VALUE TRANSPORT

    公开(公告)号:US20200117956A1

    公开(公告)日:2020-04-16

    申请号:US16601324

    申请日:2019-10-14

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network system used to control an agent interacting with an environment to perform a specified task. One of the methods includes causing the agent to perform a task episode in which the agent attempts to perform the specified task; for each of one or more particular time steps in the sequence: generating a modified reward for the particular time step from (i) the actual reward at the time step and (ii) value predictions at one or more time steps that are more than a threshold number of time steps after the particular time step in the sequence; and training, through reinforcement learning, the neural network system using at least the modified rewards for the particular time steps.

Patent Agency Ranking