ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS

    公开(公告)号:US20240220796A1

    公开(公告)日:2024-07-04

    申请号:US18403992

    申请日:2024-01-04

    Applicant: Google LLC

    CPC classification number: G06N3/08 G06N3/045

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.

    SEQUENCE PROCESSING USING ONLINE ATTENTION
    2.
    发明申请

    公开(公告)号:US20190332919A1

    公开(公告)日:2019-10-31

    申请号:US16504924

    申请日:2019-07-08

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a target sequence including a respective output at each of multiple output time steps from respective encoded representations of inputs in an input sequence. The method includes, for each output time step, starting from the position, in the input order, of the encoded representation that was selected as a preceding context vector at a preceding output time step, traversing the encoded representations until an encoded representation is selected as a current context vector at the output time step. A decoder neural network processes the current context vector and a preceding output at the preceding output time step to generate a respective output score for each possible output and to update the hidden state of the decoder recurrent neural network. An output is selected for the output time step using the output scores.

    Attention-based decoder-only sequence transduction neural networks

    公开(公告)号:US12271817B2

    公开(公告)日:2025-04-08

    申请号:US18403966

    申请日:2024-01-04

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.

    Attention-based decoder-only sequence transduction neural networks

    公开(公告)号:US11556786B2

    公开(公告)日:2023-01-17

    申请号:US16759690

    申请日:2018-10-29

    Applicant: GOOGLE LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.

    TRAINING TEXT SUMMARIZATION NEURAL NETWORKS WITH AN EXTRACTED SEGMENTS PREDICTION OBJECTIVE

    公开(公告)号:US20210350229A1

    公开(公告)日:2021-11-11

    申请号:US17140863

    申请日:2021-01-04

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a text summarization neural network. One of the methods includes pre-training the text summarization neural network including learning values of a plurality of network parameters through self-supervised learning using unlabeled data comprising unlabeled first texts, the pre-training including: obtaining an unlabeled first text comprising a plurality of segments; selecting one or more of the plurality of segments; processing a masked first text that excludes the one or more selected segments to generate a prediction of the one or more selected segments; and determining, based on a difference between the prediction and the one or more selected segments, an update to the current values of the plurality of network parameters; adapting the pre-trained text summarization neural network for a specific text summarization task using labeled data comprising second texts and respective summaries of the second texts.

    Attention-based decoder-only sequence transduction neural networks

    公开(公告)号:US12299572B2

    公开(公告)日:2025-05-13

    申请号:US18403939

    申请日:2024-01-04

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.

    Training text summarization neural networks with an extracted segments prediction objective

    公开(公告)号:US11803751B2

    公开(公告)日:2023-10-31

    申请号:US17140863

    申请日:2021-01-04

    Applicant: Google LLC

    CPC classification number: G06N3/08 G06F40/30 G06N3/045

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a text summarization neural network. One of the methods includes pre-training the text summarization neural network including learning values of a plurality of network parameters through self-supervised learning using unlabeled data comprising unlabeled first texts, the pre-training including: obtaining an unlabeled first text comprising a plurality of segments; selecting one or more of the plurality of segments; processing a masked first text that excludes the one or more selected segments to generate a prediction of the one or more selected segments; and determining, based on a difference between the prediction and the one or more selected segments, an update to the current values of the plurality of network parameters; adapting the pre-trained text summarization neural network for a specific text summarization task using labeled data comprising second texts and respective summaries of the second texts.

    ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS

    公开(公告)号:US20200342316A1

    公开(公告)日:2020-10-29

    申请号:US16759690

    申请日:2018-10-29

    Applicant: GOOGLE LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.

    Attention-based decoder-only sequence transduction neural networks

    公开(公告)号:US12299573B2

    公开(公告)日:2025-05-13

    申请号:US18404014

    申请日:2024-01-04

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.

    Training text summarization neural networks with an extracted segments prediction objective

    公开(公告)号:US12217180B2

    公开(公告)日:2025-02-04

    申请号:US18485950

    申请日:2023-10-12

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a text summarization neural network. One of the methods includes pre-training the text summarization neural network including learning values of a plurality of network parameters through self-supervised learning using unlabeled data comprising unlabeled first texts, the pre-training including: obtaining an unlabeled first text comprising a plurality of segments; selecting one or more of the plurality of segments; processing a masked first text that excludes the one or more selected segments to generate a prediction of the one or more selected segments; and determining, based on a difference between the prediction and the one or more selected segments, an update to the current values of the plurality of network parameters; adapting the pre-trained text summarization neural network for a specific text summarization task using labeled data comprising second texts and respective summaries of the second texts.

Patent Agency Ranking