Building a text-to-speech system from a small amount of speech data

Invention Grant

US11335321B2 Building a text-to-speech system from a small amount of speech data (Active)

Patent Title: Building a text-to-speech system from a small amount of speech data
Application No.: US17005974

Application Date: 2020-08-28
Publication No.: US11335321B2

Publication Date: 2022-05-17
Inventor: Ye Jia, Byungha Chun, Yusuke Oda, Norman Casagrande, Tejas Iyer, Fan Luo, Russell John Wyatt Skerry-Ryan, Jonathan Shen, Yonghui Wu, Yu Zhang
Applicant: Google LLC
Applicant Address: US CA Mountain View
Assignee: Google LLC
Current Assignee: Google LLC
Current Assignee Address: US CA Mountain View
Agency: Honigman LLP
Agent: Brett A. Krueger
Main IPC: G10L13/08
IPC: G10L13/08; G10L13/04; G10L13/033; G10L15/06

Abstract:

A method of building a text-to-speech (TTS) system from a small amount of speech data includes receiving a first plurality of recorded speech samples from an assortment of speakers and a second plurality of recorded speech samples from a target speaker, where the assortment of speakers does not include the target speaker. The method further includes training a TTS model using the first plurality of recorded speech samples from the assortment of speakers. Here, the trained TTS model is configured to output synthetic speech as an audible representation of a text input. The method also includes re-training the trained TTS model using the second plurality of recorded speech samples from the target speaker combined with the first plurality of recorded speech samples from the assortment of speakers. Here, the re-trained TTS model is configured to output synthetic speech resembling speaking characteristics of the target speaker.
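
The two-stage data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Sample` type, the function names, and the caller-supplied `train_step` routine are all hypothetical, and no real TTS model or library is involved. It only shows which recordings feed each stage and enforces the condition that the assortment of speakers excludes the target speaker.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Sample:
    """One recorded utterance (hypothetical type, for illustration only)."""
    speaker_id: str
    text: str


def build_training_stages(
    assorted: Sequence[Sample], target: Sequence[Sample], target_speaker: str
) -> List[List[Sample]]:
    """Return the data for stage 1 (training) and stage 2 (re-training)."""
    # The abstract requires that the assortment excludes the target speaker.
    if any(s.speaker_id == target_speaker for s in assorted):
        raise ValueError("assorted samples must not include the target speaker")
    if any(s.speaker_id != target_speaker for s in target):
        raise ValueError("target samples must all come from the target speaker")
    stage1 = list(assorted)                 # train on the assortment only
    stage2 = list(target) + list(assorted)  # re-train on target plus assortment
    return [stage1, stage2]


def run_stages(model, stages: Sequence[Sequence[Sample]], train_step: Callable):
    """Apply a caller-supplied training routine to each stage in order."""
    for data in stages:
        model = train_step(model, data)
    return model
```

Here `train_step(model, data)` stands in for whatever training procedure is actually used; the sketch assumes only that it takes the current model and a list of samples and returns the updated model.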

Public/Granted literature

US20220068256A1 Building a Text-to-Speech System from a Small Amount of Speech Data (Public/Granted day: 2022-03-03)

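IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L13/00	Speech synthesis; text-to-speech systems
G10L13/08	.Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination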