Invention Application
- Patent Title: SYSTEMS AND METHODS FOR ADAPTING HUMAN SPEAKER EMBEDDINGS IN SPEECH SYNTHESIS
- Application No.: US17636851
- Application Date: 2020-08-18
- Publication No.: US20220335925A1
- Publication Date: 2022-10-20
- Inventor: Cong ZHOU, Xiaoyu LIU, Michael Getty HORGAN, Vivek Kumar
- Applicant: DOLBY LABORATORIES LICENSING CORPORATION
- Applicant Address: US San Francisco
- Assignee: DOLBY LABORATORIES LICENSING CORPORATION
- Current Assignee: DOLBY LABORATORIES LICENSING CORPORATION
- Current Assignee Address: US San Francisco
- International Application: PCT/US2020/046723 WO 20200818
- Main IPC: G10L13/033
- IPC: G10L13/033; G10L13/047

Abstract:
Novel methods and systems for adapting a voice cloning synthesizer for a new speaker using real speech data are disclosed. Utterances from one or more target speakers are parameterized and are used to initialize an embedding vector for use with a voice synthesizer, by means of clustering the utterance data and determining the centroid of the data, using a speaker identification neural network, and/or by finding the closest stored embedded vector to the utterance data.
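
Two of the initialization options named in the abstract (taking the centroid of the parameterized utterance data, and finding the closest stored embedding vector) can be illustrated with a minimal sketch. It is not the patented implementation: it treats all utterances as a single cluster rather than running a clustering algorithm, omits the speaker identification neural network, and assumes Euclidean distance as the closeness measure. The function names, speaker IDs, and three-dimensional toy vectors are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def closest_stored_embedding(query, stored):
    """Return (speaker_id, embedding) of the stored embedding nearest to query.

    Euclidean distance is an assumption; the abstract does not name a metric.
    """
    speaker_id = min(stored, key=lambda k: math.dist(query, stored[k]))
    return speaker_id, stored[speaker_id]


# Hypothetical parameterized utterances from one target speaker.
utterances = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [2.0, 2.0, 2.0]]

# Hypothetical embeddings already stored for known speakers.
stored_embeddings = {"spk_a": [0.0, 0.0, 0.0], "spk_b": [2.0, 2.0, 2.5]}

# Option 1: initialize the new speaker's embedding from the data centroid.
init_from_centroid = centroid(utterances)

# Option 2: initialize from the stored embedding closest to the utterance data.
nearest_id, init_from_nearest = closest_stored_embedding(
    init_from_centroid, stored_embeddings
)
```

In this sketch either result could serve as the starting embedding vector before the synthesizer is adapted further on the target speaker's speech.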
Public/Granted literature
- US11929058B2 Systems and methods for adapting human speaker embeddings in speech synthesis. Public/Granted day: 2024-03-12