-
1.
Publication No.: US20240220796A1
Publication Date: 2024-07-04
Application No.: US18403992
Filing Date: 2024-01-04
Applicant: Google LLC
Inventor: Noam M. Shazeer, Lukasz Mieczyslaw Kaiser, Etienne Pot, Mohammad Saleh, Ben David Goodrich, Peter J. Liu, Ryan Sepassi
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
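The generation loop in this abstract (combine input with outputs so far, score all possible tokens, pick the next one) can be sketched as follows. This is a minimal illustration, not the patented implementation: `toy_decoder` and `SCRIPT` are hypothetical stand-ins for the self-attention decoder neural network, and greedy selection is only one way to use the score distribution.

```python
def greedy_decode(decoder, input_tokens, vocab, eos, max_steps=10):
    """At each generation time step: build the combined sequence (input
    followed by outputs generated so far), get a score per possible output
    token, and select the highest-scoring token as the next output."""
    output = []
    for _ in range(max_steps):
        combined = input_tokens + output          # input sequence + outputs so far
        scores = decoder(combined)                # score distribution over vocab
        next_token = max(vocab, key=lambda t: scores[t])
        output.append(next_token)
        if next_token == eos:
            break
    return output

# Hypothetical stand-in for the decoder network: deterministically scores
# a fixed script of tokens so the loop's behavior is easy to follow.
SCRIPT = ["b", "a", "<eos>"]

def toy_decoder(combined, input_len=2):
    step = len(combined) - input_len
    target = SCRIPT[min(step, len(SCRIPT) - 1)]
    return {t: (1.0 if t == target else 0.0) for t in ["a", "b", "<eos>"]}
```

With a two-token input, `greedy_decode(toy_decoder, ["x", "y"], ["a", "b", "<eos>"], "<eos>")` walks the script and stops at the end-of-sequence token.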
-
2.
Publication No.: US20190332919A1
Publication Date: 2019-10-31
Application No.: US16504924
Filing Date: 2019-07-08
Applicant: Google LLC
Inventor: Ron J. Weiss, Thang Minh Luong, Peter J. Liu, Colin Abraham Raffel, Douglas Eck
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a target sequence including a respective output at each of multiple output time steps from respective encoded representations of inputs in an input sequence. The method includes, for each output time step, starting from the position, in the input order, of the encoded representation that was selected as a preceding context vector at a preceding output time step, traversing the encoded representations until an encoded representation is selected as a current context vector at the output time step. A decoder neural network processes the current context vector and a preceding output at the preceding output time step to generate a respective output score for each possible output and to update the hidden state of the decoder recurrent neural network. An output is selected for the output time step using the output scores.
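The distinctive step here, resuming the scan from the previous context position and traversing the encoded representations in input order until one is selected, can be sketched as below. This is a schematic, assuming a hypothetical `select` predicate in place of whatever learned selection mechanism the claims cover, and a fallback to the final representation when nothing is selected.

```python
def monotonic_context(encoded, prev_pos, select):
    """Starting from the position selected at the preceding output time
    step, traverse the encoded representations in input order until one
    is selected as the current context vector. Returns (position, vector)."""
    for pos in range(prev_pos, len(encoded)):
        if select(encoded[pos]):
            return pos, encoded[pos]
    # Nothing selected: fall back to the last encoded representation.
    return len(encoded) - 1, encoded[-1]
```

Because each scan starts at the previous position, attention moves monotonically forward through the input, which is the property this traversal enforces.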
-
3.
Publication No.: US12271817B2
Publication Date: 2025-04-08
Application No.: US18403966
Filing Date: 2024-01-04
Applicant: Google LLC
Inventor: Noam M. Shazeer, Lukasz Mieczyslaw Kaiser, Etienne Pot, Mohammad Saleh, Ben David Goodrich, Peter J. Liu, Ryan Sepassi
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
-
4.
Publication No.: US11556786B2
Publication Date: 2023-01-17
Application No.: US16759690
Filing Date: 2018-10-29
Applicant: Google LLC
Inventor: Noam M. Shazeer, Lukasz Mieczyslaw Kaiser, Etienne Pot, Mohammad Saleh, Ben David Goodrich, Peter J. Liu, Ryan Sepassi
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
-
5.
Publication No.: US20210350229A1
Publication Date: 2021-11-11
Application No.: US17140863
Filing Date: 2021-01-04
Applicant: Google LLC
Inventor: Mohammad Saleh, Jingqing Zhang, Yao Zhao, Peter J. Liu
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a text summarization neural network. One of the methods includes pre-training the text summarization neural network including learning values of a plurality of network parameters through self-supervised learning using unlabeled data comprising unlabeled first texts, the pre-training including: obtaining an unlabeled first text comprising a plurality of segments; selecting one or more of the plurality of segments; processing a masked first text that excludes the one or more selected segments to generate a prediction of the one or more selected segments; and determining, based on a difference between the prediction and the one or more selected segments, an update to the current values of the plurality of network parameters; adapting the pre-trained text summarization neural network for a specific text summarization task using labeled data comprising second texts and respective summaries of the second texts.
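The self-supervised data construction described here, selecting segments of an unlabeled text, masking them out, and treating them as prediction targets, can be sketched as below. This is a simplified illustration: the `<mask>` sentinel and the caller-supplied segment selection are hypothetical, standing in for whatever masking scheme and selection strategy the patent actually claims.

```python
MASK = "<mask>"  # hypothetical sentinel for an excluded segment

def make_pretraining_example(segments, selected_idx):
    """Build one self-supervised example: mask out the selected segments
    of an unlabeled text; the network is trained to predict them, and the
    difference between prediction and targets drives the parameter update."""
    masked = [MASK if i in selected_idx else s for i, s in enumerate(segments)]
    targets = [segments[i] for i in sorted(selected_idx)]
    return masked, targets
```

After pre-training on examples like these, the abstract's second phase adapts the network to a specific summarization task using labeled text/summary pairs.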
-
6.
Publication No.: US12299572B2
Publication Date: 2025-05-13
Application No.: US18403939
Filing Date: 2024-01-04
Applicant: Google LLC
Inventor: Noam M. Shazeer, Lukasz Mieczyslaw Kaiser, Etienne Pot, Mohammad Saleh, Ben David Goodrich, Peter J. Liu, Ryan Sepassi
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
-
7.
Publication No.: US11803751B2
Publication Date: 2023-10-31
Application No.: US17140863
Filing Date: 2021-01-04
Applicant: Google LLC
Inventor: Mohammad Saleh, Jingqing Zhang, Yao Zhao, Peter J. Liu
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a text summarization neural network. One of the methods includes pre-training the text summarization neural network including learning values of a plurality of network parameters through self-supervised learning using unlabeled data comprising unlabeled first texts, the pre-training including: obtaining an unlabeled first text comprising a plurality of segments; selecting one or more of the plurality of segments; processing a masked first text that excludes the one or more selected segments to generate a prediction of the one or more selected segments; and determining, based on a difference between the prediction and the one or more selected segments, an update to the current values of the plurality of network parameters; adapting the pre-trained text summarization neural network for a specific text summarization task using labeled data comprising second texts and respective summaries of the second texts.
-
8.
Publication No.: US20200342316A1
Publication Date: 2020-10-29
Application No.: US16759690
Filing Date: 2018-10-29
Applicant: Google LLC
Inventor: Noam M. Shazeer, Lukasz Mieczyslaw Kaiser, Etienne Pot, Mohammad Saleh, Ben David Goodrich, Peter J. Liu, Ryan Sepassi
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
-
9.
Publication No.: US12299573B2
Publication Date: 2025-05-13
Application No.: US18404014
Filing Date: 2024-01-04
Applicant: Google LLC
Inventor: Noam M. Shazeer, Lukasz Mieczyslaw Kaiser, Etienne Pot, Mohammad Saleh, Ben David Goodrich, Peter J. Liu, Ryan Sepassi
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
-
10.
Publication No.: US12217180B2
Publication Date: 2025-02-04
Application No.: US18485950
Filing Date: 2023-10-12
Applicant: Google LLC
Inventor: Mohammad Saleh, Jingqing Zhang, Yao Zhao, Peter J. Liu
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a text summarization neural network. One of the methods includes pre-training the text summarization neural network including learning values of a plurality of network parameters through self-supervised learning using unlabeled data comprising unlabeled first texts, the pre-training including: obtaining an unlabeled first text comprising a plurality of segments; selecting one or more of the plurality of segments; processing a masked first text that excludes the one or more selected segments to generate a prediction of the one or more selected segments; and determining, based on a difference between the prediction and the one or more selected segments, an update to the current values of the plurality of network parameters; adapting the pre-trained text summarization neural network for a specific text summarization task using labeled data comprising second texts and respective summaries of the second texts.
-