-
Publication No.: US20190188268A1
Publication date: 2019-06-20
Application No.: US16193387
Application date: 2018-11-16
Applicant: Google LLC
Inventor: Quoc V. Le, Minh-Thang Luong, Ilya Sutskever, Oriol Vinyals, Wojciech Zaremba
CPC classification number: G06F17/2881, G06F7/023, G06F7/10, G06F17/2735, G06F17/2818, G06F17/2827, G06N3/0445, G06N3/0454, G10L15/02, G10L15/16
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for neural translation systems with rare word processing. One of the methods trains a neural network translation system to track, for each unknown word in a target language sentence, the position of its source in the corresponding source language sentence. The method includes deriving alignment data from a parallel corpus, the alignment data identifying, in each pair of source and target language sentences in the parallel corpus, aligned source and target words; annotating the sentences in the parallel corpus according to the alignment data and a rare word model to generate a training dataset of paired source and target language sentences; and training a neural network translation model on the training dataset.
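The annotation step named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the rare word model, so everything concrete here is an assumption made for illustration: the token formats `<unk>`, `<unk_d>`, and `<unk_null>`, the frequency-based vocabulary cutoff, the function names `build_vocab` and `annotate_pair`, and the representation of alignment data as a mapping from target index to source index. In this sketch a rare target word is replaced by a token that records its offset from the aligned source word, so that a trained model could later point back to the source position.

```python
from collections import Counter

UNK = "<unk>"  # hypothetical token for a rare source word


def build_vocab(sentences, max_size):
    """Return the max_size most frequent tokens; all other tokens count as rare."""
    counts = Counter(tok for sent in sentences for tok in sent)
    return {tok for tok, _ in counts.most_common(max_size)}


def annotate_pair(src, tgt, alignment, src_vocab, tgt_vocab):
    """Annotate one sentence pair under an assumed positional rare word model.

    alignment maps a target index j to its aligned source index i.
    Rare source words become <unk>. A rare target word becomes <unk_d>
    with d = j - i, or <unk_null> when it has no aligned source word.
    """
    src_out = [tok if tok in src_vocab else UNK for tok in src]
    tgt_out = []
    for j, tok in enumerate(tgt):
        if tok in tgt_vocab:
            tgt_out.append(tok)
        elif j in alignment:
            tgt_out.append(f"<unk_{j - alignment[j]}>")
        else:
            tgt_out.append("<unk_null>")
    return src_out, tgt_out


if __name__ == "__main__":
    # Toy sentence pair with a hand-written alignment (target index -> source index).
    src = ["the", "ecotax", "portico"]
    tgt = ["le", "portique", "écotaxe"]
    alignment = {0: 0, 1: 2, 2: 1}
    print(annotate_pair(src, tgt, alignment, {"the"}, {"le"}))
```

Applying `annotate_pair` to every aligned pair in a parallel corpus would yield the kind of annotated training dataset the abstract describes; deriving the alignment itself (for example with an unsupervised word aligner) and training the translation model are outside this sketch.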
-
Publication No.: US10936828B2
Publication date: 2021-03-02
Application No.: US16193387
Application date: 2018-11-16
Applicant: Google LLC
Inventor: Quoc V. Le, Minh-Thang Luong, Ilya Sutskever, Oriol Vinyals, Wojciech Zaremba
IPC: G06F17/28, G06F40/56, G06N3/04, G06F40/44, G06F40/45, G06F40/242, G06F7/02, G06F7/10, G10L15/02, G10L15/16
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for neural translation systems with rare word processing. One of the methods trains a neural network translation system to track, for each unknown word in a target language sentence, the position of its source in the corresponding source language sentence. The method includes deriving alignment data from a parallel corpus, the alignment data identifying, in each pair of source and target language sentences in the parallel corpus, aligned source and target words; annotating the sentences in the parallel corpus according to the alignment data and a rare word model to generate a training dataset of paired source and target language sentences; and training a neural network translation model on the training dataset.
-
Publication No.: US10133739B2
Publication date: 2018-11-20
Application No.: US14921925
Application date: 2015-10-23
Applicant: GOOGLE LLC
Inventor: Quoc V. Le, Minh-Thang Luong, Ilya Sutskever, Oriol Vinyals, Wojciech Zaremba
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for neural translation systems with rare word processing. One of the methods trains a neural network translation system to track, for each unknown word in a target language sentence, the position of its source in the corresponding source language sentence. The method includes deriving alignment data from a parallel corpus, the alignment data identifying, in each pair of source and target language sentences in the parallel corpus, aligned source and target words; annotating the sentences in the parallel corpus according to the alignment data and a rare word model to generate a training dataset of paired source and target language sentences; and training a neural network translation model on the training dataset.
-