-
Publication No.: US20180341860A1
Publication date: 2018-11-29
Application No.: US16021971
Filing date: 2018-06-28
Applicant: Google LLC
Inventors: Noam M. Shazeer, Aidan Nicholas Gomez, Lukasz Mieczyslaw Kaiser, Jakob D. Uszkoreit, Llion Owen Jones, Niki J. Parmar, Illia Polosukhin, Ashish Teku Vaswani
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. In one aspect, one of the systems includes an encoder neural network configured to receive the input sequence and generate encoded representations of the network inputs, the encoder neural network comprising a sequence of one or more encoder subnetworks, each encoder subnetwork configured to receive a respective encoder subnetwork input for each of the input positions and to generate a respective subnetwork output for each of the input positions, and each encoder subnetwork comprising: an encoder self-attention sub-layer that is configured to receive the subnetwork input for each of the input positions and, for each particular input position in the input order: apply an attention mechanism over the encoder subnetwork inputs using one or more queries derived from the encoder subnetwork input at the particular input position.
-
Publication No.: US11893483B2
Publication date: 2024-02-06
Application No.: US16988547
Filing date: 2020-08-07
Applicant: Google LLC
Inventors: Noam M. Shazeer, Aidan Nicholas Gomez, Lukasz Mieczyslaw Kaiser, Jakob D. Uszkoreit, Llion Owen Jones, Niki J. Parmar, Illia Polosukhin, Ashish Teku Vaswani
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. In one aspect, one of the systems includes an encoder neural network configured to receive the input sequence and generate encoded representations of the network inputs, the encoder neural network comprising a sequence of one or more encoder subnetworks, each encoder subnetwork configured to receive a respective encoder subnetwork input for each of the input positions and to generate a respective subnetwork output for each of the input positions, and each encoder subnetwork comprising: an encoder self-attention sub-layer that is configured to receive the subnetwork input for each of the input positions and, for each particular input position in the input order: apply an attention mechanism over the encoder subnetwork inputs using one or more queries derived from the encoder subnetwork input at the particular input position.
-
Publication No.: US11113602B2
Publication date: 2021-09-07
Application No.: US16932422
Filing date: 2020-07-17
Applicant: Google LLC
Inventors: Noam M. Shazeer, Aidan Nicholas Gomez, Lukasz Mieczyslaw Kaiser, Jakob D. Uszkoreit, Llion Owen Jones, Niki J. Parmar, Illia Polosukhin, Ashish Teku Vaswani
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. In one aspect, one of the systems includes an encoder neural network configured to receive the input sequence and generate encoded representations of the network inputs, the encoder neural network comprising a sequence of one or more encoder subnetworks, each encoder subnetwork configured to receive a respective encoder subnetwork input for each of the input positions and to generate a respective subnetwork output for each of the input positions, and each encoder subnetwork comprising: an encoder self-attention sub-layer that is configured to receive the subnetwork input for each of the input positions and, for each particular input position in the input order: apply an attention mechanism over the encoder subnetwork inputs using one or more queries derived from the encoder subnetwork input at the particular input position.
-
Publication No.: US10853590B2
Publication date: 2020-12-01
Application No.: US16688958
Filing date: 2019-11-19
Applicant: Google LLC
IPC classification: G10L25/30, G06F40/58, G06F40/263, G06N3/04, G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for performing machine translation tasks. One method includes receiving an input text segment in an input language; processing the input text segment using an encoder neural network to generate an encoder neural network output, the encoder neural network comprising multiple depthwise separable convolutional neural network layers; processing the encoder neural network output using an autoregressive decoder neural network to generate a decoder neural network output; and processing the decoder neural network output to generate a predicted output text segment in a target natural language.
-
Publication No.: US11803711B2
Publication date: 2023-10-31
Application No.: US17100169
Filing date: 2020-11-20
Applicant: Google LLC
IPC classification: G10L25/30, G06F40/58, G06F40/263, G06N3/08, G06N3/045
CPC classification: G06F40/58, G06F40/263, G06N3/045, G06N3/08, G10L25/30
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for performing machine translation tasks. One method includes receiving an input text segment in an input language; processing the input text segment using an encoder neural network to generate an encoder neural network output, the encoder neural network comprising multiple depthwise separable convolutional neural network layers; processing the encoder neural network output using an autoregressive decoder neural network to generate a decoder neural network output; and processing the decoder neural network output to generate a predicted output text segment in a target natural language.
-
Publication No.: US11494561B2
Publication date: 2022-11-08
Application No.: US16984337
Filing date: 2020-08-04
Applicant: Google LLC
Inventors: Noam M. Shazeer, Aidan Nicholas Gomez, Lukasz Mieczyslaw Kaiser, Jakob D. Uszkoreit, Llion Owen Jones, Niki J. Parmar, Ashish Teku Vaswani
IPC classification: G06F40/284, G06K9/62, G06N3/04, G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for training a machine learning model to perform multiple machine learning tasks from multiple machine learning domains. One system includes a machine learning model that includes multiple input modality neural networks corresponding to respective different modalities and being configured to map received data inputs of the corresponding modality to mapped data inputs from a unified representation space; an encoder neural network configured to process mapped data inputs from the unified representation space to generate respective encoder data outputs; a decoder neural network configured to process encoder data outputs to generate respective decoder data outputs from the unified representation space; and multiple output modality neural networks corresponding to respective different modalities and being configured to map decoder data outputs to data outputs of the corresponding modality.
-
Publication No.: US20210073481A1
Publication date: 2021-03-11
Application No.: US17100169
Filing date: 2020-11-20
Applicant: Google LLC
IPC classification: G06F40/58, G06F40/263, G06N3/04, G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for performing machine translation tasks. One method includes receiving an input text segment in an input language; processing the input text segment using an encoder neural network to generate an encoder neural network output, the encoder neural network comprising multiple depthwise separable convolutional neural network layers; processing the encoder neural network output using an autoregressive decoder neural network to generate a decoder neural network output; and processing the decoder neural network output to generate a predicted output text segment in a target natural language.
-
Publication No.: US20190392319A1
Publication date: 2019-12-26
Application No.: US16559392
Filing date: 2019-09-03
Applicant: Google LLC
Inventors: Noam M. Shazeer, Aidan Nicholas Gomez, Lukasz Mieczyslaw Kaiser, Jakob D. Uszkoreit, Llion Owen Jones, Niki J. Parmar, Illia Polosukhin, Ashish Teku Vaswani
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. In one aspect, one of the systems includes an encoder neural network configured to receive the input sequence and generate encoded representations of the network inputs, the encoder neural network comprising a sequence of one or more encoder subnetworks, each encoder subnetwork configured to receive a respective encoder subnetwork input for each of the input positions and to generate a respective subnetwork output for each of the input positions, and each encoder subnetwork comprising: an encoder self-attention sub-layer that is configured to receive the subnetwork input for each of the input positions and, for each particular input position in the input order: apply an attention mechanism over the encoder subnetwork inputs using one or more queries derived from the encoder subnetwork input at the particular input position.
-
Publication No.: US20240144006A1
Publication date: 2024-05-02
Application No.: US18407299
Filing date: 2024-01-08
Applicant: Google LLC
Inventors: Noam M. Shazeer, Aidan Nicholas Gomez, Lukasz Mieczyslaw Kaiser, Jakob D. Uszkoreit, Llion Owen Jones, Niki J. Parmar, Illia Polosukhin, Ashish Teku Vaswani
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. In one aspect, one of the systems includes an encoder neural network configured to receive the input sequence and generate encoded representations of the network inputs, the encoder neural network comprising a sequence of one or more encoder subnetworks, each encoder subnetwork configured to receive a respective encoder subnetwork input for each of the input positions and to generate a respective subnetwork output for each of the input positions, and each encoder subnetwork comprising: an encoder self-attention sub-layer that is configured to receive the subnetwork input for each of the input positions and, for each particular input position in the input order: apply an attention mechanism over the encoder subnetwork inputs using one or more queries derived from the encoder subnetwork input at the particular input position.
-
Publication No.: US10956819B2
Publication date: 2021-03-23
Application No.: US16988518
Filing date: 2020-08-07
Applicant: Google LLC
Inventors: Noam M. Shazeer, Aidan Nicholas Gomez, Lukasz Mieczyslaw Kaiser, Jakob D. Uszkoreit, Llion Owen Jones, Niki J. Parmar, Illia Polosukhin, Ashish Teku Vaswani
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. In one aspect, one of the systems includes an encoder neural network configured to receive the input sequence and generate encoded representations of the network inputs, the encoder neural network comprising a sequence of one or more encoder subnetworks, each encoder subnetwork configured to receive a respective encoder subnetwork input for each of the input positions and to generate a respective subnetwork output for each of the input positions, and each encoder subnetwork comprising: an encoder self-attention sub-layer that is configured to receive the subnetwork input for each of the input positions and, for each particular input position in the input order: apply an attention mechanism over the encoder subnetwork inputs using one or more queries derived from the encoder subnetwork input at the particular input position.
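
Note: the encoder self-attention step described in the abstracts above (for each input position, attend over all encoder subnetwork inputs using a query derived from the input at that position) can be illustrated with a minimal sketch. The abstracts do not specify the scoring function or the projections, so the following are assumptions made only for illustration: scaled dot-product scoring, a softmax over positions, and identity projections for queries, keys, and values. The names `softmax` and `self_attention` are hypothetical and are not taken from the patents.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(inputs):
    """Return one output vector per input position.

    `inputs` is a list of equal-length vectors, one per input position.
    Assumption: queries, keys, and values are the inputs themselves
    (identity projections) and scores are scaled dot products.
    """
    d = len(inputs[0])
    outputs = []
    for query in inputs:  # query derived from the input at this position
        # Score the query against the input at every position.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in inputs
        ]
        weights = softmax(scores)
        # Output is the attention-weighted sum of the inputs.
        out = [
            sum(w * value[j] for w, value in zip(weights, inputs))
            for j in range(d)
        ]
        outputs.append(out)
    return outputs


if __name__ == "__main__":
    for row in self_attention([[1.0, 0.0], [0.0, 1.0]]):
        print(row)
```

Each output is a weighted average of all input vectors, with more weight on positions whose inputs are similar to the query position's input. A practical system would typically add learned projections and multiple heads, which this sketch omits.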