-
Publication No.: US11282503B2
Publication Date: 2022-03-22
Application No.: US17095751
Filing Date: 2020-11-12
Applicant: UBTECH ROBOTICS CORP LTD
Inventor: Ruotong Wang, Dongyan Huang, Xian Li, Jiebin Xie, Zhichao Tang, Wan Ding, Yang Liu, Bai Li, Youjun Xiong
Abstract: The present disclosure discloses a voice conversion training method. The method includes: forming a first training data set including a plurality of training voice data groups; selecting two of the training voice data groups from the first training data set to input into a voice conversion neural network for training; forming a second training data set including the first training data set and a first source speaker voice data group; inputting one of the training voice data groups selected from the first training data set, together with the first source speaker voice data group, into the network for training; forming a third training data set including a second source speaker voice data group and a personalized voice data group that are parallel corpora with respect to each other; and inputting the second source speaker voice data group and the personalized voice data group into the network for training.
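
The three training stages named in the abstract can be sketched as a data schedule. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `build_schedule`, the representation of a voice data group as a list of utterance identifiers, the `steps_per_stage` parameter, and the equal-length check on the parallel corpus are all hypothetical, and the network training step itself is omitted.

```python
import random
from typing import Dict, List, Tuple

# Assumption: a "voice data group" is modeled as a list of utterance IDs.
VoiceGroup = List[str]
# One schedule entry: (stage number, first input group, second input group).
Step = Tuple[int, VoiceGroup, VoiceGroup]


def build_schedule(
    training_groups: Dict[str, VoiceGroup],
    first_source: VoiceGroup,
    second_source: VoiceGroup,
    personalized: VoiceGroup,
    steps_per_stage: int,
    rng: random.Random,
) -> List[Step]:
    """Return the pairs of groups fed to the network, stage by stage."""
    if len(training_groups) < 2:
        raise ValueError("stage 1 needs at least two training voice data groups")
    # Assumption: a parallel corpus has one personalized utterance per source utterance.
    if len(second_source) != len(personalized):
        raise ValueError("stage 3 groups must be parallel (equal length)")

    names = sorted(training_groups)
    schedule: List[Step] = []

    # Stage 1: two distinct groups drawn from the first training data set.
    for _ in range(steps_per_stage):
        a, b = rng.sample(names, 2)
        schedule.append((1, training_groups[a], training_groups[b]))

    # Stage 2: one group from the first training data set plus the
    # first source speaker voice data group.
    for _ in range(steps_per_stage):
        chosen = rng.choice(names)
        schedule.append((2, training_groups[chosen], first_source))

    # Stage 3: the second source speaker group and the personalized group.
    schedule.append((3, second_source, personalized))
    return schedule


# Hypothetical data, for illustration only.
groups = {"spk_a": ["a1", "a2"], "spk_b": ["b1", "b2"], "spk_c": ["c1"]}
schedule = build_schedule(
    groups, ["s1", "s2"], ["p_src1", "p_src2"], ["p_tgt1", "p_tgt2"],
    steps_per_stage=2, rng=random.Random(0),
)
stages = [stage for stage, _, _ in schedule]
```

In a real system each schedule entry would drive one or more optimizer steps on the voice conversion network; the sketch only shows which groups are paired at each stage.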
-
Publication No.: US20210201890A1
Publication Date: 2021-07-01
Application No.: US17095751
Filing Date: 2020-11-12
Applicant: UBTECH ROBOTICS CORP LTD
Inventor: Ruotong Wang, Dongyan Huang, Xian Li, Jiebin Xie, Zhichao Tang, Wan Ding, Yang Liu, Bai Li, Youjun Xiong
Abstract: The present disclosure discloses a voice conversion training method. The method includes: forming a first training data set including a plurality of training voice data groups; selecting two of the training voice data groups from the first training data set to input into a voice conversion neural network for training; forming a second training data set including the first training data set and a first source speaker voice data group; inputting one of the training voice data groups selected from the first training data set, together with the first source speaker voice data group, into the network for training; forming a third training data set including a second source speaker voice data group and a personalized voice data group that are parallel corpora with respect to each other; and inputting the second source speaker voice data group and the personalized voice data group into the network for training.
-