Voice conversion training method and server and computer readable storage medium

Invention Grant

US11282503B2 Voice conversion training method and server and computer readable storage medium (Active)

Patent Title: Voice conversion training method and server and computer readable storage medium
Application No.: US17095751

Application Date: 2020-11-12
Publication No.: US11282503B2

Publication Date: 2022-03-22
Inventor: Ruotong Wang, Dongyan Huang, Xian Li, Jiebin Xie, Zhichao Tang, Wan Ding, Yang Liu, Bai Li, Youjun Xiong
Applicant: UBTECH ROBOTICS CORP LTD
Applicant Address: CN Shenzhen
Assignee: UBTECH ROBOTICS CORP LTD
Current Assignee: UBTECH ROBOTICS CORP LTD
Current Assignee Address: CN Shenzhen
Main IPC: G10L15/06
IPC: G10L15/06; G06N3/08; G10L15/16; G10L15/30; G10L21/01; G10L25/18; G10L25/24; G10L21/003

Abstract:

The present disclosure provides a voice conversion training method. The method includes: forming a first training data set including a plurality of training voice data groups; selecting two of the training voice data groups from the first training data set and inputting them into a voice conversion neural network for training; forming a second training data set including the first training data set and a first source speaker voice data group; inputting one of the training voice data groups selected from the first training data set, together with the first source speaker voice data group, into the network for training; forming a third training data set including a second source speaker voice data group and a personalized voice data group that form a parallel corpus with respect to each other; and inputting the second source speaker voice data group and the personalized voice data group into the network for training.
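
The three training stages named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names build_schedule, run_schedule, VoiceGroup and train_step are hypothetical, a voice data group is assumed to be a plain list of utterance identifiers, the order of the two groups in each pair is an arbitrary choice, and the equal-length check on the parallel corpus is an assumption made only for illustration. The network update itself is left to a caller-supplied callback.

```python
import random
from typing import Callable, Dict, List, Tuple

# Assumption: a voice data group is modelled as a list of utterance IDs.
VoiceGroup = List[str]
Stage = Tuple[str, VoiceGroup, VoiceGroup]


def build_schedule(
    training_groups: Dict[str, VoiceGroup],
    first_source_group: VoiceGroup,
    second_source_group: VoiceGroup,
    personalized_group: VoiceGroup,
    rng: random.Random,
) -> List[Stage]:
    """Return one (stage name, group A, group B) pair per training stage."""
    if len(training_groups) < 2:
        raise ValueError("stage 1 needs at least two training voice data groups")
    # Assumption: a parallel corpus has one utterance per sentence per speaker.
    if len(second_source_group) != len(personalized_group):
        raise ValueError("stage 3 expects groups of equal length")

    names = sorted(training_groups)

    # Stage 1: two distinct groups drawn from the first training data set.
    name_a, name_b = rng.sample(names, 2)
    stage1 = ("stage1", training_groups[name_a], training_groups[name_b])

    # Stage 2: the first source speaker group paired with one group
    # selected from the first training data set.
    name_c = rng.choice(names)
    stage2 = ("stage2", first_source_group, training_groups[name_c])

    # Stage 3: the second source speaker group paired with the
    # personalized group (the parallel corpus).
    stage3 = ("stage3", second_source_group, personalized_group)

    return [stage1, stage2, stage3]


def run_schedule(
    schedule: List[Stage],
    train_step: Callable[[str, VoiceGroup, VoiceGroup], None],
) -> None:
    """Feed each stage's pair of groups to the training callback, in order."""
    for stage_name, group_a, group_b in schedule:
        train_step(stage_name, group_a, group_b)
```

A caller would pass its own train_step that updates the same voice conversion network in every stage, so that each stage continues from the weights left by the previous one.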

Public/Granted literature

US20210201890A1 VOICE CONVERSION TRAINING METHOD AND SERVER AND COMPUTER READABLE STORAGE MEDIUM Public/Granted day: 2021-07-01

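IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/06	.Creation of reference templates; training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice (G10L15/14 takes precedence)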