-
公开(公告)号:US12032920B2
公开(公告)日:2024-07-09
申请号:US17056554
申请日:2020-03-07
Applicant: Google LLC
Inventor: Ye Jia , Zhifeng Chen , Yonghui Wu , Melvin Johnson , Fadi Biadsy , Ron Weiss , Wolfgang Macherey
Abstract: The present disclosure provides systems and methods that train and use machine-learned models such as, for example, sequence-to-sequence models, to perform direct and text-free speech-to-speech translation. In particular, aspects of the present disclosure provide an attention-based sequence-to-sequence neural network which can directly translate speech from one language into speech in another language, without relying on an intermediate text representation.
-
公开(公告)号:US11138392B2
公开(公告)日:2021-10-05
申请号:US16521780
申请日:2019-07-25
Applicant: Google LLC
Inventor: Zhifeng Chen , Macduff Richard Hughes , Yonghui Wu , Michael Schuster , Xu Chen , Llion Owen Jones , Niki J. Parmar , George Foster , Orhan Firat , Ankur Bapna , Wolfgang Macherey , Melvin Jose Johnson Premkumar
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for machine translation using neural networks. In some implementations, a text in one language is translated into a second language using a neural network model. The model can include an encoder neural network comprising a plurality of bidirectional recurrent neural network layers. The encoding vectors are processed using a multi-headed attention module configured to generate multiple attention context vectors for each encoding vector. A decoder neural network generates a sequence of decoder output vectors using the attention context vectors. The decoder output vectors can represent distributions over various language elements of the second language, allowing a translation of the text into the second language to be determined based on the sequence of decoder output vectors.
-
公开(公告)号:US20200034436A1
公开(公告)日:2020-01-30
申请号:US16521780
申请日:2019-07-25
Applicant: Google LLC
Inventor: Zhifeng Chen , Macduff Richard Hughes , Yonghui Wu , Michael Schuster , Xu Chen , Llion Owen Jones , Niki J. Parmar , George Foster , Orhan Firat , Ankur Bapna , Wolfgang Macherey , Melvin Jose Johnson Premkumar
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for machine translation using neural networks. In some implementations, a text in one language is translated into a second language using a neural network model. The model can include an encoder neural network comprising a plurality of bidirectional recurrent neural network layers. The encoding vectors are processed using a multi-headed attention module configured to generate multiple attention context vectors for each encoding vector. A decoder neural network generates a sequence of decoder output vectors using the attention context vectors. The decoder output vectors can represent distributions over various language elements of the second language, allowing a translation of the text into the second language to be determined based on the sequence of decoder output vectors.
-
公开(公告)号:US11562152B2
公开(公告)日:2023-01-24
申请号:US17030093
申请日:2020-09-23
Applicant: Google LLC
Inventor: Naveen Arivazhagan , Colin Andrew Cherry , Wolfgang Macherey , Te I , George Foster , Pallavi N Baljekar
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for re-translation for simultaneous, spoken-language machine translation. In some implementations, a stream of audio data comprising speech in a first language is received. A transcription for the speech in the stream of audio data is generated using an automated speech recognizer through a series of updates. A translation of the transcription into a second language is generated using a machine translation module. The translation is generated with translation iterations that translate increasing amounts of the transcription, including re-translating previously portions of the transcription. A series of translation updates are provided to a client device based on the translation iterations.
-
公开(公告)号:US20220092274A1
公开(公告)日:2022-03-24
申请号:US17030093
申请日:2020-09-23
Applicant: Google LLC
Inventor: Naveen Arivazhagan , Colin Andrew Cherry , Wolfgang Macherey , Te I , George Foster , Pallavi N. Baljekar
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for re-translation for simultaneous, spoken-language machine translation. In some implementations, a stream of audio data comprising speech in a first language is received. A transcription for the speech in the stream of audio data is generated using an automated speech recognizer through a series of updates. A translation of the transcription into a second language is generated using a machine translation module. The translation is generated with translation iterations that translate increasing amounts of the transcription, including re-translating previously portions of the transcription. A series of translation updates are provided to a client device based on the translation iterations.
-
公开(公告)号:US20220083746A1
公开(公告)日:2022-03-17
申请号:US17459041
申请日:2021-08-27
Applicant: Google LLC
Inventor: Zhifeng Chen , Macduff Richard Hughes , Yonghui Wu , Michael Schuster , Xu Chen , Llion Owen Jones , Niki J. Parmar , George Foster , Orhan Firat , Ankur Bapna , Wolfgang Macherey , Melvin Jose Johnson Premkumar
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for machine translation using neural networks. In some implementations, a text in one language is translated into a second language using a neural network model. The model can include an encoder neural network comprising a plurality of bidirectional recurrent neural network layers. The encoding vectors are processed using a multi-headed attention module configured to generate multiple attention context vectors for each encoding vector. A decoder neural network generates a sequence of decoder output vectors using the attention context vectors. The decoder output vectors can represent distributions over various language elements of the second language, allowing a translation of the text into the second language to be determined based on the sequence of decoder output vectors.
-
公开(公告)号:US20210209315A1
公开(公告)日:2021-07-08
申请号:US17056554
申请日:2020-03-07
Applicant: Google LLC
Inventor: Ye Jia , Zhifeng Chen , Yonghui Wu , Melvin Johnson , Fadi Biadsy , Ron Weiss , Wolfgang Macherey
Abstract: The present disclosure provides systems and methods that train and use machine-learned models such as, for example, sequence-to-sequence models, to perform direct and text-free speech-to-speech translation. In particular, aspects of the present disclosure provide an attention-based sequence-to-sequence neural network which can directly translate speech from one language into speech in another language, without relying on an intermediate text representation.
-
公开(公告)号:US20190325308A1
公开(公告)日:2019-10-24
申请号:US16458506
申请日:2019-07-01
Applicant: Google LLC
Inventor: Junyoung Chung , Melvin Jose Johnson Premkumar , Michael Schuster , Wolfgang Macherey
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for performing multi-task learning. In one method a system obtains a respective set of training data for each of multiple machine learning tasks. For each of the machine learning tasks, the system configures a respective teacher machine learning model to perform the machine learning task by training the teacher machine learning model on the training data. The system trains a single student machine learning model to perform the multiple machine learning tasks using (i) the configured teacher machine learning models, and (ii) the obtained training data.
-
公开(公告)号:US20240420680A1
公开(公告)日:2024-12-19
申请号:US18337168
申请日:2023-06-19
Applicant: GOOGLE LLC
Inventor: Te I , Chris Kau , Jeffrey Robert Pitman , Robert Eric Genter , Qi Ge , Wolfgang Macherey , Dirk Ryan Padfield , Naveen Arivazhagan , Colin Cherry
Abstract: Implementations relate to a multimodal translation application that can provide an abridged version of a translation through an audio interface of a computing device, while simultaneously providing a verbatim textual translation at a display interface of the computing device. The application can provide these different versions of the translation in certain circumstances when, for example, the rate of speech of a person speaking to a user is relatively high compared to a preferred rate of speech of the user. For example, a comparison between phonemes of an original language speech and a translated language speech can be performed to determine whether the ratio satisfies a threshold for providing an audible abridged translation. A determination to provide the abridged translation can additionally or alternatively be based on a determined language of the speaker.
-
公开(公告)号:US20240020491A1
公开(公告)日:2024-01-18
申请号:US18374071
申请日:2023-09-28
Applicant: Google LLC
Inventor: Zhifeng Chen , Macduff Richard Hughes , Yonghui Wu , Michael Schuster , Xu Chen , Llion Owen Jones , Niki J. Parmar , George Foster , Orhan Firat , Ankur Bapna , Wolfgang Macherey , Melvin Jose Johnson Premkumar
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for machine translation using neural networks. In some implementations, a text in one language is translated into a second language using a neural network model. The model can include an encoder neural network comprising a plurality of bidirectional recurrent neural network layers. The encoding vectors are processed using a multi-headed attention module configured to generate multiple attention context vectors for each encoding vector. A decoder neural network generates a sequence of decoder output vectors using the attention context vectors. The decoder output vectors can represent distributions over various language elements of the second language, allowing a translation of the text into the second language to be determined based on the sequence of decoder output vectors.
-
-
-
-
-
-
-
-
-