Synthesizing speech recognition training data

Invention Grant

US11308938B2 Synthesizing speech recognition training data (Active)

Patent Title: Synthesizing speech recognition training data
Application No.: US16704216

Application Date: 2019-12-05
Publication No.: US11308938B2

Publication Date: 2022-04-19
Inventor: Maisy Wieman, Jonah Probell, Sudharsan Krishnaswamy
Applicant: SoundHound, Inc.
Applicant Address: US CA Santa Clara
Assignee: SoundHound, Inc.
Current Assignee: SoundHound, Inc.
Current Assignee Address: US CA Santa Clara
Main IPC: G10L15/22
IPC: G10L15/22; G10L15/06; G10L15/16; G10L15/18; G10L13/02; G10L15/197; G10L15/187

Abstract:

To train a speech recognizer, such as one for recognizing variables in a neural speech-to-meaning system, compute, within an embedding space, a range of vectors of features of natural speech. Generate parameter sets for speech synthesis and synthesize speech according to the parameters. Analyze the synthesized speech to compute vectors in the embedding space. Using a cost function that favors an even spread (minimal clustering), generate a multiplicity of speech synthesis parameter sets. Using the multiplicity of parameter sets, generate a multiplicity of speech samples of known words that can be used as training data for speech recognition.
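
The selection step in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not specify the cost function or the search procedure, so the inverse-squared-distance cost and the greedy search below are assumptions chosen for illustration. The function `embed` is a hypothetical stand-in for "synthesize speech, then analyze it into an embedding vector", and the parameter names (pitch, rate) and the natural-speech range `lo`/`hi` are invented toy values.

```python
import math
import random

def embed(params):
    """Hypothetical stand-in for synthesizing speech from a parameter set
    and analyzing it into a 2-D embedding vector (toy linear mapping)."""
    pitch, rate = params
    return (pitch / 100.0, rate * 2.0)

def clustering_cost(vectors):
    """Assumed cost: sum of inverse squared pairwise distances.
    It is large when vectors cluster and small when they are evenly spread."""
    cost = 0.0
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            cost += 1.0 / (math.dist(vectors[i], vectors[j]) ** 2 + 1e-9)
    return cost

def in_range(vec, lo, hi):
    """True if the vector lies inside the box assumed to cover natural speech."""
    return all(l <= v <= h for v, l, h in zip(vec, lo, hi))

def select_spread(candidates, lo, hi, k):
    """Greedily pick up to k parameter sets whose embeddings stay in range
    and add the least clustering cost at each step."""
    pool = [p for p in candidates if in_range(embed(p), lo, hi)]
    chosen = [pool.pop(0)]
    while pool and len(chosen) < k:
        chosen_vecs = [embed(q) for q in chosen]
        best = min(pool, key=lambda p: clustering_cost(chosen_vecs + [embed(p)]))
        pool.remove(best)
        chosen.append(best)
    return chosen

# Toy data: random candidate parameter sets (pitch in Hz, speaking-rate factor).
rng = random.Random(0)
candidates = [(rng.uniform(80, 300), rng.uniform(0.5, 2.0)) for _ in range(200)]

# Assumed embedding-space range measured from natural speech.
lo, hi = (1.0, 1.2), (2.5, 3.6)

selected = select_spread(candidates, lo, hi, 10)
baseline = [p for p in candidates if in_range(embed(p), lo, hi)][:10]

print("cost of greedy selection:", clustering_cost([embed(p) for p in selected]))
print("cost of first ten in range:", clustering_cost([embed(p) for p in baseline]))
```

In a real system each selected parameter set would drive a speech synthesizer to produce audio of known words, and the embedding would come from analyzing that audio rather than from a closed-form mapping.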

Public/Granted literature

US20210174783A1 Synthesizing Speech Recognition Training Data Public/Granted day: 2021-06-10

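IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/22	. Procedures used during a speech recognition process, e.g. man-machine dialogue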