Invention Application
- Patent Title: Injecting Text in Self-Supervised Speech Pre-training
- Application No.: US17808091
- Application Date: 2022-06-21
- Publication No.: US20230017892A1
- Publication Date: 2023-01-19
- Inventor: Zhehuai Chen, Bhuvana Ramabhadran, Andrew M. Rosenberg, Yu Zhang, Pedro J. Moreno Mengibar
- Applicant: Google LLC
- Applicant Address: US CA Mountain View
- Assignee: Google LLC
- Current Assignee: Google LLC
- Current Assignee Address: US CA Mountain View
- Main IPC: G10L13/047
- IPC: G10L13/047; G10L13/08

Abstract:
A method includes receiving training data that includes unspoken text utterances and un-transcribed non-synthetic speech utterances. Each unspoken text utterance is not paired with any corresponding spoken utterance of non-synthetic speech. Each un-transcribed non-synthetic speech utterance is not paired with a corresponding transcription. The method also includes generating a corresponding synthetic speech representation for each unspoken textual utterance of the received training data using a text-to-speech model. The method also includes pre-training an audio encoder on the synthetic speech representations generated for the unspoken textual utterances and the un-transcribed non-synthetic speech utterances to teach the audio encoder to jointly learn shared speech and text representations.
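The data flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: `toy_tts`, `PretrainExample`, and `build_pretraining_pool` are hypothetical names introduced here, the "text-to-speech model" is a placeholder that maps characters to numbers, and the encoder pre-training step itself is omitted. The sketch only shows how synthetic representations generated from unspoken text could be pooled with un-transcribed real speech so that a single audio encoder sees both.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Placeholder for a sequence of acoustic feature values (assumption).
Features = List[float]


@dataclass
class PretrainExample:
    features: Features
    source: str  # "synthetic" (from unspoken text) or "real" (un-transcribed speech)


def toy_tts(text: str) -> Features:
    """Hypothetical stand-in for a text-to-speech model: one value per character."""
    return [(ord(ch) % 32) / 32.0 for ch in text]


def build_pretraining_pool(
    unspoken_text: Sequence[str],
    untranscribed_speech: Sequence[Features],
    tts: Callable[[str], Features] = toy_tts,
) -> List[PretrainExample]:
    """Pool both data sources into one list of encoder pre-training examples."""
    # Each unspoken text utterance gets a synthetic speech representation.
    pool = [PretrainExample(tts(text), "synthetic") for text in unspoken_text]
    # Un-transcribed real speech is used as-is; it has no transcription.
    pool += [PretrainExample(list(feats), "real") for feats in untranscribed_speech]
    return pool


pool = build_pretraining_pool(["hello", "hi"], [[0.1, 0.2, 0.3]])
```

In this sketch neither source carries a label: the text is consumed only through its synthetic representation, and the real speech has no transcript, which is what makes the pre-training self-supervised.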
Public/Granted literature
- US12159617B2: Injecting text in self-supervised speech pre-training (Public/Granted day: 2024-12-03)