Multilingual speech synthesis and cross-language voice cloning

Invention Grant

US11580952B2 Multilingual speech synthesis and cross-language voice cloning (Active)

Patent Title: Multilingual speech synthesis and cross-language voice cloning
Application No.: US16855042

Application Date: 2020-04-22
Publication No.: US11580952B2

Publication Date: 2023-02-14
Inventor: Yu Zhang, Ron J. Weiss, Byungha Chun, Yonghui Wu, Zhifeng Chen, Russell John Wyatt Skerry-Ryan, Ye Jia, Andrew M. Rosenberg, Bhuvana Ramabhadran
Applicant: Google LLC
Applicant Address: US CA Mountain View
Assignee: Google LLC
Current Assignee: Google LLC
Current Assignee Address: US CA Mountain View
Agency: Honigman LLP
Agent: Brett A. Krueger; Grant Griffith
Main IPC: G10L13/00
IPC: G10L13/00; G10L13/047

Abstract:

A method includes receiving an input text sequence to be synthesized into speech in a first language and obtaining a speaker embedding, the speaker embedding specifying specific voice characteristics of a target speaker for synthesizing the input text sequence into speech that clones a voice of the target speaker. The target speaker includes a native speaker of a second language different than the first language. The method also includes generating, using a text-to-speech (TTS) model, an output audio feature representation of the input text by processing the input text sequence and the speaker embedding. The output audio feature representation includes the voice characteristics of the target speaker specified by the speaker embedding.
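The data flow in the abstract (input text plus a speaker embedding in, audio feature frames out) can be illustrated with a toy sketch. This is not the patented model: the per-character encoder, the random linear "decoder", the dimensions, and every function name below are hypothetical stand-ins chosen only to show how conditioning on a speaker embedding changes the output for the same text.

```python
import random

N_MELS = 4        # hypothetical number of audio feature channels per frame
TEXT_DIM = 3      # hypothetical size of each text-token encoding
SPEAKER_DIM = 2   # hypothetical size of the speaker embedding


def encode_text(text):
    """Toy text encoder: one deterministic vector per character."""
    encoded = []
    for ch in text:
        rng = random.Random(ord(ch))  # seeded per character, so repeatable
        encoded.append([rng.uniform(-1.0, 1.0) for _ in range(TEXT_DIM)])
    return encoded


def make_decoder_weights(seed=0):
    """Random stand-in for trained decoder weights (not a real TTS model)."""
    rng = random.Random(seed)
    in_dim = TEXT_DIM + SPEAKER_DIM
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
            for _ in range(N_MELS)]


def synthesize_features(text, speaker_embedding, weights):
    """Condition every text step on the speaker embedding, then project
    to one audio feature frame per input character."""
    if len(speaker_embedding) != SPEAKER_DIM:
        raise ValueError("speaker embedding has the wrong size")
    frames = []
    for token_vec in encode_text(text):
        conditioned = token_vec + list(speaker_embedding)  # concatenation
        frame = [sum(w * x for w, x in zip(row, conditioned))
                 for row in weights]
        frames.append(frame)
    return frames


weights = make_decoder_weights()
speaker_a = [0.9, -0.2]   # hypothetical embedding of one target speaker
speaker_b = [-0.5, 0.7]   # hypothetical embedding of another speaker
frames_a = synthesize_features("hola", speaker_a, weights)
frames_b = synthesize_features("hola", speaker_b, weights)
```

Here the same text yields different feature frames for `speaker_a` and `speaker_b`, which mirrors the abstract's statement that the output representation carries the voice characteristics specified by the speaker embedding. A real system would replace the random weights with a trained TTS model and pass the frames to a vocoder.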

Public/Granted literature

US20200380952A1 MULTILINGUAL SPEECH SYNTHESIS AND CROSS-LANGUAGE VOICE CLONING Public/Granted day: 2020-12-03

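IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L13/00	Speech synthesis; text-to-speech systems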