-
Publication number: US20220405490A1
Publication date: 2022-12-22
Application number: US17304202
Filing date: 2021-06-16
Applicant: Google LLC
Inventor: Sebastian Krause , Sascha Rothe , Jonathan Mallinson , Eric Malmi , Aliaksei Severyn
IPC: G06F40/58 , G06F40/253
Abstract: A method of training a text-generating model for grammatical error correction (GEC) includes obtaining a multilingual set of text samples where each text sample includes a monolingual textual representation of a respective sentence. The operations also include, for each text sample of the multilingual set of text samples, generating a corrupted synthetic version of the respective text sample where the corrupted synthetic version of the respective text sample includes a grammatical change to the monolingual textual representation of the respective sentence associated with the respective text sample. The operations further include training the text-generating model using a training set of sample pairs. Each sample pair in the training set of sample pairs includes one of the respective text samples of the multilingual set of text samples and the corresponding corrupted synthetic version of the one of the respective text samples of the multilingual set of text samples.
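The pairing step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the method claimed in the patent: the whitespace tokenization, the three corruption operations (drop, swap, duplicate), and the names `corrupt`, `build_training_pairs`, and `SAMPLES` are all assumptions introduced here. The abstract only states that each clean sample is paired with a corrupted synthetic version containing a grammatical change.

```python
import random
from typing import List, Tuple

# Hypothetical multilingual samples; each is a monolingual sentence.
SAMPLES: List[str] = [
    "She walks to school every day",
    "Er geht jeden Tag zur Schule",
    "Ella camina a la escuela",
]


def corrupt(sentence: str, rng: random.Random) -> str:
    """Apply one random token-level edit (an assumed, language-agnostic scheme)."""
    tokens = sentence.split()  # assumption: whitespace tokenization
    if len(tokens) < 2:
        return sentence  # too short to corrupt; returned unchanged
    op = rng.choice(["drop", "swap", "duplicate"])
    i = rng.randrange(len(tokens) - 1)  # position of the edit
    if op == "drop":
        del tokens[i]
    elif op == "swap":
        tokens[i], tokens[i + 1] = tokens[i + 1], tokens[i]
    else:  # duplicate
        tokens.insert(i, tokens[i])
    return " ".join(tokens)


def build_training_pairs(samples: List[str], seed: int = 0) -> List[Tuple[str, str]]:
    """Return (corrupted_source, clean_target) pairs, one per sample."""
    rng = random.Random(seed)  # seeded so the pairs are reproducible
    return [(corrupt(s, rng), s) for s in samples]


if __name__ == "__main__":
    for source, target in build_training_pairs(SAMPLES):
        print(f"{source!r} -> {target!r}")
```

A text-generating model would then be trained to map each corrupted source back to its clean target. Because the corruption here is purely token-level, the same routine applies to any language that can be split on whitespace.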
-
Publication number: US20230376676A1
Publication date: 2023-11-23
Application number: US17751330
Filing date: 2022-05-23
Applicant: Google LLC
Inventor: Jonathan Stephen Mallinson , Aliaksei Severyn , Eric Emil Malmi , Jakub Adamek
IPC: G06F40/166 , G06F40/117 , G06F40/284 , G06F40/40
CPC classification number: G06F40/166 , G06F40/117 , G06F40/284 , G06F40/40
Abstract: Provided are improved machine learning-based text editing models. Specifically, example implementations include a flexible semi-auto-regressive text-editing approach for generation, designed to derive the maximum benefit from non-auto-regressive text-editing and autoregressive decoding. In contrast to conventional sequence-to-sequence (seq2seq) models, the proposed approach is fast at inference time, while being capable of modeling flexible input-output transformations.
-
Publication number: US11881210B2
Publication date: 2024-01-23
Application number: US16867427
Filing date: 2020-05-05
Applicant: Google LLC
CPC classification number: G10L15/16 , G06N3/084 , G10L15/02 , G10L15/063 , G10L2015/025 , G10L2015/027
Abstract: A method for generating a prosodic representation includes receiving a text utterance having one or more words. Each word has at least one syllable having at least one phoneme. The method also includes generating, using a Bidirectional Encoder Representations from Transformers (BERT) model, a sequence of wordpiece embeddings and selecting an utterance embedding for the text utterance, the utterance embedding representing an intended prosody. Each wordpiece embedding is associated with one of the one or more words of the text utterance. For each syllable, using the selected utterance embedding and a prosody model that incorporates the BERT model, the method also includes generating a corresponding prosodic syllable embedding for the syllable based on the wordpiece embedding associated with the word that includes the syllable and predicting a duration of the syllable by encoding linguistic features of each phoneme of the syllable with the corresponding prosodic syllable embedding for the syllable.
-
Publication number: US20210350795A1
Publication date: 2021-11-11
Application number: US16867427
Filing date: 2020-05-05
Applicant: Google LLC
Abstract: A method for generating a prosodic representation includes receiving a text utterance having one or more words. Each word has at least one syllable having at least one phoneme. The method also includes generating, using a Bidirectional Encoder Representations from Transformers (BERT) model, a sequence of wordpiece embeddings and selecting an utterance embedding for the text utterance, the utterance embedding representing an intended prosody. Each wordpiece embedding is associated with one of the one or more words of the text utterance. For each syllable, using the selected utterance embedding and a prosody model that incorporates the BERT model, the method also includes generating a corresponding prosodic syllable embedding for the syllable based on the wordpiece embedding associated with the word that includes the syllable and predicting a duration of the syllable by encoding linguistic features of each phoneme of the syllable with the corresponding prosodic syllable embedding for the syllable.
-
Publication number: US20230342411A1
Publication date: 2023-10-26
Application number: US18000152
Filing date: 2022-03-09
Applicant: Google LLC
Inventor: Preyas Dalsukhbhai Popat , Gaurav Bhaskar Gite , John Blitzer , Jayant Madhavan , Aliaksei Severyn
IPC: G06F16/957 , G06F16/951
CPC classification number: G06F16/957 , G06F16/951
Abstract: Techniques of generating short answers for queries by a search engine include performing a training operation on a corpus of training data to train the score prediction engine, the corpus of training data including candidate passages providing short answers for display in callouts and remaining respective passages, from which a top scoring short answer is generated. In such implementations, the corpus of training data further includes the remaining respective passages and the respective titles of the candidate passage and remaining respective passages.