Invention Grant
- Patent Title: Synthesizing speech recognition training data
- Application No.: US16704216
- Application Date: 2019-12-05
- Publication No.: US11308938B2
- Publication Date: 2022-04-19
- Inventor: Maisy Wieman, Jonah Probell, Sudharsan Krishnaswamy
- Applicant: SoundHound, Inc.
- Applicant Address: US CA Santa Clara
- Assignee: SoundHound, Inc.
- Current Assignee: SoundHound, Inc.
- Current Assignee Address: US CA Santa Clara
- Main IPC: G10L15/22
- IPC: G10L15/22 ; G10L15/06 ; G10L15/16 ; G10L15/18 ; G10L13/02 ; G10L15/197 ; G10L15/187

Abstract:
To train a speech recognizer, such as one for recognizing variables in a neural speech-to-meaning system, compute, within an embedding space, a range of vectors of features of natural speech. Generate parameter sets for speech synthesis and synthesize speech according to the parameters. Analyze the synthesized speech to compute vectors in the embedding space. Using a cost function that favors an even spread (minimal clustering), generate a multiplicity of speech synthesis parameter sets. Using the multiplicity of parameter sets, generate a multiplicity of speech of known words that can be used as training data for speech recognition.
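
The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. Everything named here is an assumption made for illustration: `embed` is a hypothetical stand-in for "synthesize speech, then analyze it into the embedding space" (a toy function of two made-up parameters, pitch and rate), the natural-speech range is a per-dimension bounding box, the cost is a sum of inverse squared pairwise distances, and the greedy selection is just one simple way to favor an even spread.

```python
import math
import random
from itertools import combinations


def embed(params):
    """Hypothetical stand-in: a real system would synthesize speech from
    the parameters and run a feature extractor on the audio."""
    pitch, rate = params
    return (math.log(pitch), rate * rate)


def embedding_range(natural_vectors):
    """Per-dimension (min, max) range spanned by natural-speech vectors."""
    return [(min(d), max(d)) for d in zip(*natural_vectors)]


def in_range(vec, bounds):
    """True if every coordinate lies inside the natural-speech range."""
    return all(lo <= x <= hi for x, (lo, hi) in zip(vec, bounds))


def repulsion(a, b, eps=1e-9):
    """Inverse squared distance: large when two vectors are close."""
    return 1.0 / (math.dist(a, b) ** 2 + eps)


def clustering_cost(vectors):
    """Assumed cost function: high when vectors cluster, low when spread."""
    return sum(repulsion(a, b) for a, b in combinations(vectors, 2))


def pick_spread_parameter_sets(candidates, bounds, k):
    """Greedily keep up to k in-range parameter sets, each time adding the
    candidate that raises the clustering cost the least."""
    pool = [(p, embed(p)) for p in candidates]
    pool = [(p, v) for p, v in pool if in_range(v, bounds)]
    if not pool:
        return []
    chosen = [pool.pop(0)]
    while pool and len(chosen) < k:
        best = min(
            range(len(pool)),
            key=lambda i: sum(repulsion(pool[i][1], v) for _, v in chosen),
        )
        chosen.append(pool.pop(best))
    return [p for p, _ in chosen]


# Illustrative values only: two "natural speech" vectors define the range.
natural = [(math.log(80.0), 0.25), (math.log(300.0), 4.0)]
bounds = embedding_range(natural)

# Randomly generated candidate synthesis parameter sets (pitch, rate).
rng = random.Random(0)
candidates = [(rng.uniform(60.0, 400.0), rng.uniform(0.3, 2.5)) for _ in range(200)]
selected = pick_spread_parameter_sets(candidates, bounds, 10)
```

Each parameter set in `selected` would then drive the synthesizer to produce speech of known words, yielding labeled training data whose embedding vectors stay inside the natural-speech range without bunching together.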
Public/Granted literature
- US20210174783A1 Synthesizing Speech Recognition Training Data, Public/Granted day: 2021-06-10