-
1.
Publication No.: US20240220796A1
Publication Date: 2024-07-04
Application No.: US18403992
Filing Date: 2024-01-04
Applicant: Google LLC
Inventor: Noam M. Shazeer, Lukasz Mieczyslaw Kaiser, Etienne Pot, Mohammad Saleh, Ben David Goodrich, Peter J. Liu, Ryan Sepassi
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
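Since this abstract (shared verbatim across the related publications below) describes a concrete per-step procedure, a minimal Python sketch may help orient the reader. The `decoder` callable, the greedy argmax selection rule, and the stopping condition are illustrative assumptions, not details taken from the patent claims.

from typing import Callable, List

def generate(
    input_sequence: List[int],
    decoder: Callable[[List[int]], List[float]],  # hypothetical self-attention decoder network
    eos_token: int,
    max_steps: int = 128,
) -> List[int]:
    """One greedy instance of the generation loop described in the abstract."""
    output_tokens: List[int] = []
    for _ in range(max_steps):
        # Combined sequence: the input followed by the tokens generated so far.
        combined = input_sequence + output_tokens
        # Time-step output: a score distribution over the set of possible output tokens.
        scores = decoder(combined)
        # Select the next output token using the time-step output
        # (greedy argmax here; other selection rules fit the same loop).
        next_token = max(range(len(scores)), key=scores.__getitem__)
        output_tokens.append(next_token)
        if next_token == eos_token:
            break
    return output_tokens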
-
2.
Publication No.: US12299573B2
Publication Date: 2025-05-13
Application No.: US18404014
Filing Date: 2024-01-04
Applicant: Google LLC
Inventor: Noam M. Shazeer, Lukasz Mieczyslaw Kaiser, Etienne Pot, Mohammad Saleh, Ben David Goodrich, Peter J. Liu, Ryan Sepassi
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
-
3.
Publication No.: US12217180B2
Publication Date: 2025-02-04
Application No.: US18485950
Filing Date: 2023-10-12
Applicant: Google LLC
Inventor: Mohammad Saleh, Jingqing Zhang, Yao Zhao, Peter J. Liu
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a text summarization neural network. One of the methods includes pre-training the text summarization neural network including learning values of a plurality of network parameters through self-supervised learning using unlabeled data comprising unlabeled first texts, the pre-training including: obtaining an unlabeled first text comprising a plurality of segments; selecting one or more of the plurality of segments; processing a masked first text that excludes the one or more selected segments to generate a prediction of the one or more selected segments; and determining, based on a difference between the prediction and the one or more selected segments, an update to the current values of the plurality of network parameters; and adapting the pre-trained text summarization neural network for a specific text summarization task using labeled data comprising second texts and respective summaries of the second texts.
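The self-supervised pre-training step in this abstract (repeated in the related publication below) is likewise concrete enough for a short sketch. The `model` object and its `predict`, `loss`, and `apply_update` methods are hypothetical placeholders, and the rate of roughly one selected segment in five is an illustrative choice, not a figure from the patent.

import random
from typing import List

MASK = "<mask>"

def pretrain_step(model, segments: List[str]) -> float:
    """One sketched pre-training update on an unlabeled text split into segments."""
    # Select one or more of the text's segments (here, roughly one in five).
    k = max(1, len(segments) // 5)
    selected = sorted(random.sample(range(len(segments)), k))
    # Masked first text: the original text with the selected segments excluded.
    masked_text = " ".join(
        MASK if i in set(selected) else seg for i, seg in enumerate(segments)
    )
    target_text = " ".join(segments[i] for i in selected)
    # Generate a prediction of the selected segments from the masked text.
    prediction = model.predict(masked_text)
    # The difference between the prediction and the selected segments
    # drives the update to the network parameters.
    loss = model.loss(prediction, target_text)
    model.apply_update(loss)
    return loss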
-
4.
Publication No.: US20240256859A1
Publication Date: 2024-08-01
Application No.: US18403966
Filing Date: 2024-01-04
Applicant: Google LLC
Inventor: Noam M. Shazeer, Lukasz Mieczyslaw Kaiser, Etienne Pot, Mohammad Saleh, Ben David Goodrich, Peter J. Liu, Ryan Sepassi
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
-
5.
Publication No.: US11886998B2
Publication Date: 2024-01-30
Application No.: US18096946
Filing Date: 2023-01-13
Applicant: Google LLC
Inventor: Noam M. Shazeer, Lukasz Mieczyslaw Kaiser, Etienne Pot, Mohammad Saleh, Ben Goodrich, Peter J. Liu, Ryan Sepassi
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
-
6.
Publication No.: US12271817B2
Publication Date: 2025-04-08
Application No.: US18403966
Filing Date: 2024-01-04
Applicant: Google LLC
Inventor: Noam M. Shazeer, Lukasz Mieczyslaw Kaiser, Etienne Pot, Mohammad Saleh, Ben David Goodrich, Peter J. Liu, Ryan Sepassi
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
-
7.
Publication No.: US11556786B2
Publication Date: 2023-01-17
Application No.: US16759690
Filing Date: 2018-10-29
Applicant: Google LLC
Inventor: Noam M. Shazeer, Lukasz Mieczyslaw Kaiser, Etienne Pot, Mohammad Saleh, Ben David Goodrich, Peter J. Liu, Ryan Sepassi
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
-
8.
Publication No.: US20210350229A1
Publication Date: 2021-11-11
Application No.: US17140863
Filing Date: 2021-01-04
Applicant: Google LLC
Inventor: Mohammad Saleh, Jingqing Zhang, Yao Zhao, Peter J. Liu
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a text summarization neural network. One of the methods includes pre-training the text summarization neural network including learning values of a plurality of network parameters through self-supervised learning using unlabeled data comprising unlabeled first texts, the pre-training including: obtaining an unlabeled first text comprising a plurality of segments; selecting one or more of the plurality of segments; processing a masked first text that excludes the one or more selected segments to generate a prediction of the one or more selected segments; and determining, based on a difference between the prediction and the one or more selected segments, an update to the current values of the plurality of network parameters; and adapting the pre-trained text summarization neural network for a specific text summarization task using labeled data comprising second texts and respective summaries of the second texts.
-
9.
Publication No.: US20240211752A1
Publication Date: 2024-06-27
Application No.: US18404014
Filing Date: 2024-01-04
Applicant: Google LLC
Inventor: Noam M. Shazeer, Lukasz Mieczyslaw Kaiser, Etienne Pot, Mohammad Saleh, Ben David Goodrich, Peter J. Liu, Ryan Sepassi
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
-
10.
Publication No.: US20240211751A1
Publication Date: 2024-06-27
Application No.: US18403939
Filing Date: 2024-01-04
Applicant: Google LLC
Inventor: Noam M. Shazeer, Lukasz Mieczyslaw Kaiser, Etienne Pot, Mohammad Saleh, Ben David Goodrich, Peter J. Liu, Ryan Sepassi
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
-