MEMORY AUGMENTED GENERATIVE TEMPORAL MODELS

    Publication No.: US20210089968A1

    Publication date: 2021-03-25

    Application No.: US17113669

    Filing date: 2020-12-07

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating sequences of predicted observations, for example images. In one aspect, a system comprises a controller recurrent neural network, and a decoder neural network to process a set of latent variables to generate an observation. An external memory and a memory interface subsystem is configured to, for each of a plurality of time steps, receive an updated hidden state from the controller, generate a memory context vector by reading data from the external memory using the updated hidden state, determine a set of latent variables from the memory context vector, generate a predicted observation by providing the set of latent variables to the decoder neural network, write data to the external memory using the latent variables, the updated hidden state, or both, and generate a controller input for a subsequent time step from the latent variables.
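The per-time-step loop described in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the patent: the function names (`controller_step`, `read_memory`, `sample_latents`, `decode`, `generate`), the dot-product softmax read, the append-only write, and the fixed arithmetic that stands in for the trained controller and decoder networks. Here the latent variables are drawn using both the memory context vector and the hidden state, and the latents themselves serve as the next controller input.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def read_memory(memory, key):
    """Content-based read (assumed): softmax over dot products, then a weighted sum of rows."""
    if not memory:
        return [0.0] * len(key)
    scores = [sum(k * m for k, m in zip(key, row)) for row in memory]
    weights = softmax(scores)
    return [sum(w * row[i] for w, row in zip(weights, memory)) for i in range(len(key))]


def controller_step(hidden, controller_input):
    """Toy recurrent update standing in for the controller recurrent neural network."""
    return [math.tanh(0.5 * h + 0.5 * x) for h, x in zip(hidden, controller_input)]


def sample_latents(context, hidden, rng):
    """Toy latent sampling (assumed): Gaussian noise around the mean of context and hidden state."""
    return [0.5 * (c + h) + 0.1 * rng.gauss(0.0, 1.0) for c, h in zip(context, hidden)]


def decode(latents):
    """Toy decoder standing in for the decoder neural network."""
    return [2.0 * z for z in latents]


def generate(num_steps, dim=4, seed=0):
    """Run the read / sample / decode / write loop for num_steps time steps."""
    rng = random.Random(seed)
    memory = []                      # external memory: one row per written vector
    hidden = [0.0] * dim
    controller_input = [0.0] * dim
    observations = []
    for _ in range(num_steps):
        hidden = controller_step(hidden, controller_input)   # updated hidden state
        context = read_memory(memory, hidden)                # memory context vector
        latents = sample_latents(context, hidden, rng)       # set of latent variables
        observations.append(decode(latents))                 # predicted observation
        memory.append(list(latents))                         # write latents to memory
        controller_input = latents                           # input for the next step
    return observations, memory
```

Calling `generate(5)` returns five toy observations and a memory with five rows. A real system would replace each helper with a learned network and a learned read/write mechanism.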

    CONTROLLING AGENTS OVER LONG TIME SCALES USING TEMPORAL VALUE TRANSPORT

    Publication No.: US20210081723A1

    Publication date: 2021-03-18

    Application No.: US17035546

    Filing date: 2020-09-28

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network system used to control an agent interacting with an environment to perform a specified task. One of the methods includes causing the agent to perform a task episode in which the agent attempts to perform the specified task; for each of one or more particular time steps in the sequence: generating a modified reward for the particular time step from (i) the actual reward at the time step and (ii) value predictions at one or more time steps that are more than a threshold number of time steps after the particular time step in the sequence; and training, through reinforcement learning, the neural network system using at least the modified rewards for the particular time steps.
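The modified-reward step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed method: the function name `modified_rewards`, the `transport` mapping (which says which later time steps contribute to which earlier one), and the scaling factor `alpha` are all hypothetical. The abstract does not say how the later time steps are selected, so the sketch takes them as an input.

```python
def modified_rewards(rewards, values, transport, threshold, alpha=1.0):
    """Add value predictions from sufficiently later time steps to earlier rewards.

    rewards:   actual reward at each time step of the episode
    values:    value prediction at each time step of the episode
    transport: dict mapping a time step t to a list of later time steps
               (hypothetical input; how these are chosen is not specified here)
    threshold: a later step counts only if it is more than this many steps after t
    alpha:     assumed scaling factor applied to the transported value
    """
    modified = list(rewards)  # copy, so the actual rewards are left unchanged
    for t, later_steps in transport.items():
        for t_later in later_steps:
            if t_later - t > threshold:
                modified[t] += alpha * values[t_later]
    return modified


# Example episode of five time steps; step 0 receives value from step 4 only,
# because step 1 is not more than `threshold` steps after step 0.
example = modified_rewards(
    rewards=[0.0, 1.0, 0.0, 0.0, 2.0],
    values=[0.1, 0.2, 0.3, 0.4, 0.5],
    transport={0: [1, 4]},
    threshold=2,
    alpha=0.5,
)
```

The resulting list would then be used in place of the actual rewards when training the neural network system through reinforcement learning.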

    CONTROLLING AGENTS OVER LONG TIME SCALES USING TEMPORAL VALUE TRANSPORT

    Publication No.: US10789511B2

    Publication date: 2020-09-29

    Application No.: US16601324

    Filing date: 2019-10-14

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network system used to control an agent interacting with an environment to perform a specified task. One of the methods includes causing the agent to perform a task episode in which the agent attempts to perform the specified task; for each of one or more particular time steps in the sequence: generating a modified reward for the particular time step from (i) the actual reward at the time step and (ii) value predictions at one or more time steps that are more than a threshold number of time steps after the particular time step in the sequence; and training, through reinforcement learning, the neural network system using at least the modified rewards for the particular time steps.

    CONTROLLING AGENTS OVER LONG TIME SCALES USING TEMPORAL VALUE TRANSPORT

    Publication No.: US20200117956A1

    Publication date: 2020-04-16

    Application No.: US16601324

    Filing date: 2019-10-14

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network system used to control an agent interacting with an environment to perform a specified task. One of the methods includes causing the agent to perform a task episode in which the agent attempts to perform the specified task; for each of one or more particular time steps in the sequence: generating a modified reward for the particular time step from (i) the actual reward at the time step and (ii) value predictions at one or more time steps that are more than a threshold number of time steps after the particular time step in the sequence; and training, through reinforcement learning, the neural network system using at least the modified rewards for the particular time steps.
