GENERATIVE NEURAL NETWORK SYSTEMS FOR GENERATING INSTRUCTION SEQUENCES TO CONTROL AN AGENT PERFORMING A TASK

    Publication No.: US20210271968A1

    Publication date: 2021-09-02

    Application No.: US16967597

    Filing date: 2019-02-11

    Abstract: A generative adversarial neural network system to provide a sequence of actions for performing a task. The system comprises a reinforcement learning neural network subsystem coupled to a simulator and a discriminator neural network. The reinforcement learning neural network subsystem includes a policy recurrent neural network to, at each of a sequence of time steps, select one or more actions to be performed according to an action selection policy, each action comprising one or more control commands for a simulator. The simulator is configured to implement the control commands for the time steps to generate a simulator output. The discriminator neural network is configured to discriminate between the simulator output and training data, to provide a reward signal for the reinforcement learning. The simulator may be a non-differentiable simulator, for example a computer program to produce an image or audio waveform, or a program to control a robot or vehicle.
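
The data flow described in this abstract (policy selects commands, simulator executes them, discriminator scores the result as a reward) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the policy is a plain callable rather than a recurrent neural network, the simulator is a toy 1-D canvas painter, and `discriminator_score` is a fixed stand-in for a trained discriminator. All names (`simulate`, `discriminator_score`, `rollout`) are hypothetical.

```python
from typing import Callable, List, Tuple

Command = int  # hypothetical command: index of a canvas cell to paint


def simulate(commands: List[Command], size: int) -> List[int]:
    """Toy non-differentiable simulator: paints the cells named by the commands."""
    canvas = [0] * size
    for command in commands:
        canvas[command % size] = 1
    return canvas


def discriminator_score(output: List[int], reference: List[int]) -> float:
    """Stand-in for a trained discriminator (assumption): returns the fraction
    of cells that match one reference example from the training data."""
    matches = sum(int(o == r) for o, r in zip(output, reference))
    return matches / len(reference)


def rollout(
    policy: Callable[[int], Command],
    num_steps: int,
    reference: List[int],
) -> Tuple[List[Command], float]:
    """Run one episode: select one command per time step, execute all of them
    in the simulator, then use the discriminator score as the reward."""
    commands = [policy(t) for t in range(num_steps)]
    output = simulate(commands, size=len(reference))
    reward = discriminator_score(output, reference)
    return commands, reward


reference = [1, 0, 1, 0]
good_commands, good_reward = rollout(lambda t: [0, 2][t], 2, reference)
poor_commands, poor_reward = rollout(lambda t: 1, 2, reference)
```

Because the reward depends only on the simulator output, the simulator itself never needs to be differentiated, which is consistent with the abstract's remark that it may be non-differentiable. In a real system the reward would feed a reinforcement learning update of the policy, which this sketch omits.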

DISCRETE TOKEN PROCESSING USING DIFFUSION MODELS

    Publication No.: US20240119261A1

    Publication date: 2024-04-11

    Application No.: US18374447

    Filing date: 2023-09-28

    CPC classification number: G06N3/045

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an output sequence of discrete tokens using a diffusion model. In one aspect, a method includes generating, by using the diffusion model, a final latent representation of the sequence of discrete tokens that includes a determined value for each of a plurality of latent variables; applying a de-embedding matrix to the final latent representation of the output sequence of discrete tokens to generate a de-embedded final latent representation that includes, for each of the plurality of latent variables, a respective numeric score for each discrete token in a vocabulary of multiple discrete tokens; selecting, for each of the plurality of latent variables, a discrete token from among the multiple discrete tokens in the vocabulary that has a highest numeric score; and generating the output sequence of discrete tokens that includes the selected discrete tokens.
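
The de-embedding and selection steps in this abstract can be illustrated with a short sketch. It assumes, hypothetically, that each latent variable is a vector of length `dim` and that the de-embedding matrix has shape `dim x vocab_size`; the abstract does not state these shapes. The diffusion model that produces the final latent representation is not modeled here, and the function names (`de_embed`, `decode_tokens`) are illustrative only.

```python
from typing import List, Sequence


def de_embed(
    latents: Sequence[Sequence[float]],
    de_embedding: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Multiply each latent vector (length dim) by the de-embedding matrix
    (dim x vocab_size), giving one numeric score per vocabulary token."""
    vocab_size = len(de_embedding[0])
    return [
        [
            sum(z[d] * de_embedding[d][v] for d in range(len(z)))
            for v in range(vocab_size)
        ]
        for z in latents
    ]


def decode_tokens(
    latents: Sequence[Sequence[float]],
    de_embedding: Sequence[Sequence[float]],
    vocab: Sequence[str],
) -> List[str]:
    """For each latent variable, select the token with the highest score."""
    scores = de_embed(latents, de_embedding)
    return [vocab[max(range(len(row)), key=row.__getitem__)] for row in scores]


# Toy example: 3 latent variables, dim = 2, vocabulary of 3 tokens.
vocab = ["a", "b", "c"]
de_embedding = [
    [1.0, 0.0, -1.0],
    [0.0, 1.0, 0.0],
]
final_latents = [[0.9, 0.1], [0.2, 0.8], [-0.7, 0.1]]
tokens = decode_tokens(final_latents, de_embedding, vocab)  # ["a", "b", "c"]
```

The selection is a plain argmax over each row of scores, so ties resolve to the lowest token index in this sketch; the abstract does not say how ties are handled.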
