Reducing recording time when constructing a concatenative TTS voice using a reduced script and pre-recorded speech assets

Invention Grant

US08019605B2 Reducing recording time when constructing a concatenative TTS voice using a reduced script and pre-recorded speech assets (Status: in force)

Patent Title: Reducing recording time when constructing a concatenative TTS voice using a reduced script and pre-recorded speech assets
Application No.: US11748256

Application Date: 2007-05-14
Publication No.: US08019605B2

Publication Date: 2011-09-13
Inventor: Ciprian Agapi, Oscar J. Blass, Paritosh D. Patel, Roberto Vila
Applicant: Ciprian Agapi, Oscar J. Blass, Paritosh D. Patel, Roberto Vila
Applicant Address: US MA Burlington
Assignee: Nuance Communications, Inc.
Current Assignee: Nuance Communications, Inc.
Current Assignee Address: US MA Burlington
Agency: Wolf, Greenfield & Sacks, P.C.
Main IPC: G10L13/08
IPC: G10L13/08; G10L13/06

Abstract:

The present invention discloses a system and a method for creating a reduced script, which is read by a voice talent to create a concatenative text-to-speech (TTS) voice. The method can automatically process pre-recorded audio to derive speech assets for a concatenative TTS voice. The pre-recorded audio can include sets of recorded phrases used by a speech user interface (SUI). A set of unfulfilled speech assets needed for full phonetic coverage of the concatenative TTS voice can be determined. A reduced script can be constructed that includes a set of phrases which, when read by a voice talent, result in a reduced corpus. When the reduced corpus is automatically processed, a reduced set of speech assets results. The reduced set includes each of the unfulfilled speech assets. When this reduced set is combined with the existing speech assets, the result is a voice with a complete set of speech assets.
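
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how phrases are chosen, so the greedy set-cover heuristic below is a swapped-in technique used only for illustration. The function names `unfulfilled_assets` and `build_reduced_script`, the diphone-style unit labels, and the candidate phrases are all hypothetical.

```python
from typing import Dict, List, Set


def unfulfilled_assets(required: Set[str], existing: Set[str]) -> Set[str]:
    """Units needed for full coverage that the pre-recorded audio lacks."""
    return required - existing


def build_reduced_script(candidates: Dict[str, Set[str]],
                         missing: Set[str]) -> List[str]:
    """Greedily pick phrases until every missing unit is covered.

    `candidates` maps a phrase to the set of units that reading it would
    yield. Greedy set cover is an assumption made for this sketch; it
    keeps the script short but is not guaranteed to be minimal.
    """
    remaining = set(missing)
    script: List[str] = []
    while remaining:
        # Phrase that covers the most still-missing units (None if no candidates).
        best = max(candidates,
                   key=lambda p: len(candidates[p] & remaining),
                   default=None)
        if best is None:
            break
        gained = candidates[best] & remaining
        if not gained:
            break  # no candidate covers anything that is still missing
        script.append(best)
        remaining -= gained
    return script


# Hypothetical data: unit labels and phrases are invented for illustration.
required = {"a-b", "b-c", "c-d", "d-e", "e-f"}
existing = {"a-b", "b-c"}  # derived from the pre-recorded SUI prompts
candidates = {
    "phrase one": {"c-d", "d-e"},
    "phrase two": {"e-f"},
    "phrase three": {"c-d"},
}
missing = unfulfilled_assets(required, existing)
print(build_reduced_script(candidates, missing))
```

In this sketch the voice talent would read only the phrases returned by `build_reduced_script`, and the units extracted from those recordings would be merged with the existing ones to reach full coverage.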

Public/Granted literature

US20080288256A1 REDUCING RECORDING TIME WHEN CONSTRUCTING A CONCATENATIVE TTS VOICE USING A REDUCED SCRIPT AND PRE-RECORDED SPEECH ASSETS Public/Granted day: 2008-11-20

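IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L13/00	Speech synthesis; text-to-speech systems
G10L13/08	.Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme-to-phoneme translation, prosody generation, or stress or intonation determination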