SYSTEMS AND METHODS FOR ADAPTING HUMAN SPEAKER EMBEDDINGS IN SPEECH SYNTHESIS

Invention Application

US20220335925A1 SYSTEMS AND METHODS FOR ADAPTING HUMAN SPEAKER EMBEDDINGS IN SPEECH SYNTHESIS (Granted)

Patent Title: SYSTEMS AND METHODS FOR ADAPTING HUMAN SPEAKER EMBEDDINGS IN SPEECH SYNTHESIS
Application No.: US17636851

Application Date: 2020-08-18
Publication No.: US20220335925A1

Publication Date: 2022-10-20
Inventor: Cong ZHOU, Xiaoyu LIU, Michael Getty HORGAN, Vivek Kumar
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Applicant Address: US San Francisco
Assignee: DOLBY LABORATORIES LICENSING CORPORATION
Current Assignee: DOLBY LABORATORIES LICENSING CORPORATION
Current Assignee Address: US San Francisco
International Application: PCT/US2020/046723 WO 2020-08-18
Main IPC: G10L13/033
IPC: G10L13/033; G10L13/047

Abstract:

Novel methods and systems for adapting a voice cloning synthesizer for a new speaker using real speech data are disclosed. Utterances from one or more target speakers are parameterized and are used to initialize an embedding vector for use with a voice synthesizer, by means of clustering the utterance data and determining the centroid of the data, using a speaker identification neural network, and/or by finding the closest stored embedded vector to the utterance data.
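As a rough illustration of two of the initialization strategies named in the abstract, the sketch below computes the centroid of a set of parameterized utterances and then looks up the closest stored embedding vector. It is a minimal sketch and not the patented method: the function names (centroid, nearest_stored_embedding, init_embedding), the use of plain lists of floats as utterance features, and the choice of Euclidean distance are all assumptions made for illustration. All utterances are treated as a single cluster, so no actual clustering algorithm is run, and the speaker identification neural network is not modeled.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def centroid(vectors):
    """Return the element-wise mean of a list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one utterance vector")
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def nearest_stored_embedding(query, stored):
    """Return (speaker_id, embedding) for the stored vector closest to query.

    `stored` is a hypothetical dict mapping speaker ids to embedding vectors.
    Euclidean distance is an assumption; the source does not name a metric.
    """
    return min(stored.items(), key=lambda item: dist(query, item[1]))


def init_embedding(utterance_vectors, stored=None):
    """Initialize an embedding from utterance vectors.

    With no stored embeddings, the centroid itself is returned. Otherwise
    the stored embedding nearest to the centroid is returned.
    """
    center = centroid(utterance_vectors)
    if stored:
        return nearest_stored_embedding(center, stored)[1]
    return center


# Illustrative data only: two 2-dimensional "utterance" vectors.
utterances = [[1.0, 2.0], [3.0, 4.0]]
stored_embeddings = {"speaker_a": [0.0, 0.0], "speaker_b": [2.0, 2.5]}

initial = init_embedding(utterances, stored_embeddings)
```

In a real system the initialized vector would presumably serve as a starting point that is then adapted further for the target speaker; that step is outside the scope of this sketch.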

Public/Granted literature

US11929058B2 Systems and methods for adapting human speaker embeddings in speech synthesis Public/Granted day: 2024-03-12

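IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L13/00	Speech synthesis; text-to-speech systems
G10L13/02	. Methods for producing synthetic speech; speech synthesisers
G10L13/033	.. Voice editing, e.g. manipulating the voice of the synthesiser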