LARGE SCALE RETRIEVAL FOR SEQUENCE GENERATION

    Publication Number: US20230177334A1

    Publication Date: 2023-06-08

    Application Number: US18076984

    Filing Date: 2022-12-07

    CPC classification number: G06N3/08

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a final output sequence. In one aspect, a method comprises: receiving a current output sequence comprising one or more current output segments; receiving a set of reference segments and a respective reference segment embedding of each reference segment that has been generated using an embedding neural network; for each current output segment: processing the current output segment using the embedding neural network to generate a current output segment embedding of the current output segment; and selecting k most similar reference segments to the current output segment using the reference segment embeddings and the current output segment embedding; and processing the current output sequence and the k most similar reference segments for each current output segment to generate an additional output segment that follows the current output sequence in the final output sequence.
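    As a rough illustration of the retrieval step described in this abstract, the sketch below embeds a current output segment and selects the k most similar reference segments by cosine similarity over precomputed reference embeddings. The placeholder embed_segment function, the example segments, and the choice of cosine similarity are assumptions made for the sketch, not details taken from the patent.

```python
# Hypothetical sketch: select the k most similar reference segments to a
# current output segment using precomputed reference segment embeddings.
import numpy as np

def embed_segment(segment: str) -> np.ndarray:
    """Placeholder for the embedding neural network; returns a fixed-size vector."""
    rng = np.random.default_rng(abs(hash(segment)) % (2**32))
    return rng.standard_normal(128)

def top_k_reference_segments(current_segment: str,
                             reference_segments: list[str],
                             reference_embeddings: np.ndarray,
                             k: int = 2) -> list[str]:
    query = embed_segment(current_segment)
    # Cosine similarity between the query embedding and every reference embedding.
    sims = reference_embeddings @ query / (
        np.linalg.norm(reference_embeddings, axis=1) * np.linalg.norm(query) + 1e-9)
    top_idx = np.argsort(-sims)[:k]
    return [reference_segments[i] for i in top_idx]

references = ["reference segment A", "reference segment B", "reference segment C"]
reference_embeddings = np.stack([embed_segment(r) for r in references])
neighbours = top_k_reference_segments("current output segment", references, reference_embeddings)
# `neighbours` would then be processed together with the current output sequence
# to generate the additional output segment that follows it.
```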

    DISCRETE TOKEN PROCESSING USING DIFFUSION MODELS

    Publication Number: US20240119261A1

    Publication Date: 2024-04-11

    Application Number: US18374447

    Filing Date: 2023-09-28

    CPC classification number: G06N3/045

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an output sequence of discrete tokens using a diffusion model. In one aspect, a method includes generating, by using the diffusion model, a final latent representation of the sequence of discrete tokens that includes a determined value for each of a plurality of latent variables; applying a de-embedding matrix to the final latent representation of the output sequence of discrete tokens to generate a de-embedded final latent representation that includes, for each of the plurality of latent variables, a respective numeric score for each discrete token in a vocabulary of multiple discrete tokens; selecting, for each of the plurality of latent variables, a discrete token from among the multiple discrete tokens in the vocabulary that has a highest numeric score; and generating the output sequence of discrete tokens that includes the selected discrete tokens.
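    The de-embedding and token-selection steps in this abstract reduce to a matrix product followed by an argmax over the vocabulary dimension. The sketch below shows that computation using randomly initialized placeholders for the final latent representation, the de-embedding matrix, and the vocabulary; none of these values come from the patent.

```python
# Minimal sketch of the de-embedding and token-selection step.
import numpy as np

vocabulary = ["<pad>", "the", "cat", "sat"]   # hypothetical vocabulary of discrete tokens
num_latents, embed_dim = 3, 8                 # one latent variable per output position

rng = np.random.default_rng(0)
final_latents = rng.standard_normal((num_latents, embed_dim))          # stands in for the diffusion model output
de_embedding_matrix = rng.standard_normal((embed_dim, len(vocabulary)))

# Project each latent variable onto the vocabulary to get per-token scores.
scores = final_latents @ de_embedding_matrix   # shape: (num_latents, vocab_size)

# For each latent variable, keep the token with the highest score.
token_ids = scores.argmax(axis=-1)
output_sequence = [vocabulary[i] for i in token_ids]
print(output_sequence)
```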

    TRAINING CONDITIONAL COMPUTATION NEURAL NETWORKS USING REINFORCEMENT LEARNING

    Publication Number: US20230177309A1

    Publication Date: 2023-06-08

    Application Number: US18076978

    Filing Date: 2022-12-07

    CPC classification number: G06N3/0427

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network having one or more conditional computation layers, where each conditional computation layer includes a gating sub-layer having multiple gating parameters and an expert sub-layer having multiple expert neural networks. In one aspect, a method comprises: sampling a batch of target output sequences that comprises a respective ground truth output token at each of multiple output positions; for each target output sequence, processing the target output sequence using the neural network to generate a network output that includes respective score distributions over the vocabulary of output tokens for the output positions in the target output sequence; and training each gating sub-layer using respective rewards for the gating sub-layer for the output positions through reinforcement learning to optimize a reinforcement learning objective function that measures an expected reward received by the gating sub-layer.
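    This abstract describes training each gating sub-layer with reinforcement learning to optimize an expected reward. The sketch below shows a single REINFORCE-style update for a softmax gate over experts, assuming the reward is supplied externally (for example, a score derived from the probability assigned to the ground-truth token at that output position); the single-layer gate and the reward definition are simplifications for illustration, not the patent's method.

```python
# Hypothetical REINFORCE-style update for a softmax gating sub-layer over experts.
import numpy as np

rng = np.random.default_rng(0)
num_experts, hidden_dim = 4, 16
gating_params = rng.standard_normal((hidden_dim, num_experts)) * 0.01

def softmax(x: np.ndarray) -> np.ndarray:
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def reinforce_update(hidden: np.ndarray, reward: float, lr: float = 1e-2) -> int:
    """Sample an expert from the gate, then scale the log-probability gradient by the reward."""
    global gating_params
    probs = softmax(hidden @ gating_params)
    expert = rng.choice(num_experts, p=probs)
    # Gradient of log pi(expert | hidden) w.r.t. the gating parameters for a softmax gate.
    one_hot = np.eye(num_experts)[expert]
    grad_log_prob = np.outer(hidden, one_hot - probs)
    gating_params += lr * reward * grad_log_prob
    return expert

# Example: one output position with an externally supplied (hypothetical) reward.
hidden_state = rng.standard_normal(hidden_dim)
chosen_expert = reinforce_update(hidden_state, reward=-1.2)
```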
