Patent search ap:("Google LLC") AND inv:"Arindrima Datta" Page 1

1.

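Granted invention patent
Large-scale multilingual speech recognition with a streaming end-to-end model (status: in force)

Publication (announcement) number: US11468244B2

Publication (announcement) date: 2022-10-11

Application number: US16834342

Application date: 2020-03-30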

Applicant: Google LLC

Inventor: Anjuli Patricia Kannan , Tara N. Sainath , Yonghui Wu , Ankur Bapna , Arindrima Datta

IPC: G10L15/00 , G06F40/40

Abstract: A method of transcribing speech using a multilingual end-to-end (E2E) speech recognition model includes receiving audio data for an utterance spoken in a particular native language, obtaining a language vector identifying the particular language, and processing, using the multilingual E2E speech recognition model, the language vector and acoustic features derived from the audio data to generate a transcription for the utterance. The multilingual E2E speech recognition model includes a plurality of language-specific adaptor modules that include one or more adaptor modules specific to the particular native language and one or more other adaptor modules specific to at least one other native language different than the particular native language. The method also includes providing the transcription for output.
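The data flow in this abstract (a language vector combined with acoustic features, then language-specific adaptor modules) can be pictured with a minimal Python sketch. Everything concrete here is an assumption for illustration: the abstract does not say how the language vector is encoded or what an adaptor computes, so the one-hot encoding, the three-language inventory, and the toy residual adaptor below are hypothetical stand-ins rather than the patented design.

```python
# Hypothetical language inventory; the abstract does not list languages.
LANGUAGES = ["hi", "bn", "ta"]


def language_vector(lang):
    """Return a one-hot vector identifying the language (assumed encoding)."""
    if lang not in LANGUAGES:
        raise ValueError(f"unknown language: {lang}")
    return [1.0 if code == lang else 0.0 for code in LANGUAGES]


def with_language_vector(frames, lang_vec):
    """Append the language vector to every acoustic feature frame."""
    return [list(frame) + list(lang_vec) for frame in frames]


def make_adapter(scale, bias):
    """Build a toy residual adaptor: hidden + (scale * hidden + bias)."""
    def adapter(hidden):
        return [h + (scale * h + bias) for h in hidden]
    return adapter


# One toy adaptor per language; the parameter values are arbitrary placeholders.
ADAPTERS = {
    "hi": make_adapter(0.1, 0.0),
    "bn": make_adapter(0.2, 0.0),
    "ta": make_adapter(0.0, 0.5),
}


def apply_adapter(hidden, lang_vec):
    """Route a hidden state through the adaptor the language vector selects."""
    lang = LANGUAGES[lang_vec.index(max(lang_vec))]
    return ADAPTERS[lang](hidden)
```

In this sketch the model holds adaptors for every language at once, and the language vector picks which one is applied to a given utterance, which mirrors the abstract's "adaptor modules specific to the particular native language" alongside those for other languages.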

2.

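Granted invention patent
Language-agnostic multilingual modeling using effective script normalization (status: in force)

Publication (announcement) number: US11615779B2

Publication (announcement) date: 2023-03-28

Application number: US17152760

Application date: 2021-01-19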

Applicant: Google LLC

Inventor: Arindrima Datta , Bhuvana Ramabhadran , Jesse Emond , Brian Roark

IPC: G10L15/00 , G06F40/58 , G06N3/04 , G10L15/06 , G10L15/16 , G10L15/26 , G06N3/049

Abstract: A method includes obtaining a plurality of training data sets each associated with a respective native language and includes a plurality of respective training data samples. For each respective training data sample of each training data set in the respective native language, the method includes transliterating the corresponding transcription in the respective native script into corresponding transliterated text representing the respective native language of the corresponding audio in a target script and associating the corresponding transliterated text in the target script with the corresponding audio in the respective native language to generate a respective normalized training data sample. The method also includes training, using the normalized training data samples, a multilingual end-to-end speech recognition model to predict speech recognition results in the target script for corresponding speech utterances spoken in any of the different native languages associated with the plurality of training data sets.
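The normalization step in this abstract (transliterate each transcription into a target script, then pair the transliterated text with the original audio) can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Sample` type, the `TO_LATIN` tables, and the choice of Latin as target script are hypothetical, and the tiny character maps are deliberately incomplete toys, not a real transliteration system. The abstract does not specify how transliteration is performed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    audio_id: str  # stand-in for the audio of the utterance
    language: str  # native language of the audio
    text: str      # transcription (native script before normalization)


# Toy per-language character tables into a Latin target script.
# Hypothetical and incomplete; unknown characters pass through unchanged.
TO_LATIN = {
    "ru": {"д": "d", "а": "a", "н": "n", "е": "e", "т": "t"},
    "el": {"ν": "n", "α": "a", "ι": "i"},
}


def transliterate(text, language):
    """Map each character of text into the target script."""
    table = TO_LATIN[language]
    return "".join(table.get(ch, ch) for ch in text)


def normalize(datasets):
    """Turn per-language training sets into one pool of normalized samples.

    Each normalized sample keeps the original audio reference and language
    but carries the transcription in the shared target script.
    """
    normalized = []
    for language, samples in datasets.items():
        for sample in samples:
            normalized.append(
                Sample(sample.audio_id, language,
                       transliterate(sample.text, language))
            )
    return normalized
```

The pooled output is what a single multilingual model would be trained on in this picture: audio in any of the source languages, with every transcription expressed in one script.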

3.

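Invention application
Language-agnostic Multilingual Modeling Using Effective Script Normalization (status: in force)

Publication (announcement) number: US20210233510A1

Publication (announcement) date: 2021-07-29

Application number: US17152760

Application date: 2021-01-19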

Applicant: Google LLC

Inventor: Arindrima Datta , Bhuvana Ramabhadran , Jesse Emond , Brian Roark

IPC: G10L15/00 , G06N3/04 , G10L15/16 , G10L15/26 , G06F40/58 , G10L15/06

Abstract: A method includes obtaining a plurality of training data sets each associated with a respective native language and includes a plurality of respective training data samples. For each respective training data sample of each training data set in the respective native language, the method includes transliterating the corresponding transcription in the respective native script into corresponding transliterated text representing the respective native language of the corresponding audio in a target script and associating the corresponding transliterated text in the target script with the corresponding audio in the respective native language to generate a respective normalized training data sample. The method also includes training, using the normalized training data samples, a multilingual end-to-end speech recognition model to predict speech recognition results in the target script for corresponding speech utterances spoken in any of the different native languages associated with the plurality of training data sets.

4.

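Invention publication
Language-agnostic Multilingual Modeling Using Effective Script Normalization (status: under examination, published)

Publication (announcement) number: US20230223009A1

Publication (announcement) date: 2023-07-13

Application number: US18187330

Application date: 2023-03-21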

Applicant: Google LLC

Inventor: Arindrima Datta , Bhuvana Ramabhadran , Jesse Emond , Brian Roark

IPC: G10L15/00 , G10L15/16 , G10L15/26 , G06F40/58 , G10L15/06

CPC classification number: G10L15/005 , G10L15/16 , G10L15/26 , G06F40/58 , G10L15/063 , G06N3/049

Abstract: A method includes obtaining a plurality of training data sets each associated with a respective native language and includes a plurality of respective training data samples. For each respective training data sample of each training data set in the respective native language, the method includes transliterating the corresponding transcription in the respective native script into corresponding transliterated text representing the respective native language of the corresponding audio in a target script and associating the corresponding transliterated text in the target script with the corresponding audio in the respective native language to generate a respective normalized training data sample. The method also includes training, using the normalized training data samples, a multilingual end-to-end speech recognition model to predict speech recognition results in the target script for corresponding speech utterances spoken in any of the different native languages associated with the plurality of training data sets.

5.

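Invention application
Large-Scale Multilingual Speech Recognition With A Streaming End-To-End Model (status: under examination, published)

Publication (announcement) number: US20200380215A1

Publication (announcement) date: 2020-12-03

Application number: US16834342

Application date: 2020-03-30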

Applicant: Google LLC

Inventor: Anjuli Patricia Kannan , Tara N. Sainath , Yonghui Wu , Ankur Bapna , Arindrima Datta

IPC: G06F40/40 , G10L15/00

Abstract: A method of transcribing speech using a multilingual end-to-end (E2E) speech recognition model includes receiving audio data for an utterance spoken in a particular native language, obtaining a language vector identifying the particular language, and processing, using the multilingual E2E speech recognition model, the language vector and acoustic features derived from the audio data to generate a transcription for the utterance. The multilingual E2E speech recognition model includes a plurality of language-specific adaptor modules that include one or more adaptor modules specific to the particular native language and one or more other adaptor modules specific to at least one other native language different than the particular native language. The method also includes providing the transcription for output.
