-
1.
Publication No.: US20210201890A1
Publication date: 2021-07-01
Application No.: US17095751
Application date: 2020-11-12
Applicant: UBTECH ROBOTICS CORP LTD
Inventor: Ruotong Wang, Dongyan Huang, Xian Li, Jiebin Xie, Zhichao Tang, Wan Ding, Yang Liu, Bai Li, Youjun Xiong
Abstract: The present disclosure discloses a voice conversion training method. The method includes: forming a first training data set including a plurality of training voice data groups; selecting two of the training voice data groups from the first training data set and inputting them into a voice conversion neural network for training; forming a second training data set including the first training data set and a first source speaker voice data group; inputting one of the training voice data groups selected from the first training data set, together with the first source speaker voice data group, into the network for training; forming a third training data set including a second source speaker voice data group and a personalized voice data group that are parallel corpora with respect to each other; and inputting the second source speaker voice data group and the personalized voice data group into the network for training.
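The three training stages in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: `staged_training`, `train_step`, `steps_per_stage`, and the representation of a voice data group as a list of utterance IDs are all hypothetical, and the argument order passed to `train_step` in the second stage is an assumption, since the abstract does not state it.

```python
import random
from typing import Callable, List, Sequence

# Assumption: a "voice data group" is modelled as a list of utterance IDs.
VoiceGroup = List[str]


def staged_training(
    train_step: Callable[[VoiceGroup, VoiceGroup], None],
    first_set: Sequence[VoiceGroup],
    first_source: VoiceGroup,
    second_source: VoiceGroup,
    personalized: VoiceGroup,
    steps_per_stage: int = 3,
    seed: int = 0,
) -> List[str]:
    """Run the three stages in order and return a log of stage names."""
    rng = random.Random(seed)
    groups = list(first_set)
    log: List[str] = []

    # Stage 1: two groups drawn from the first training data set.
    for _ in range(steps_per_stage):
        group_a, group_b = rng.sample(groups, 2)
        train_step(group_a, group_b)
        log.append("stage1")

    # Stage 2: one group from the first set plus the first source speaker
    # group (passing the source speaker first is an assumption).
    for _ in range(steps_per_stage):
        train_step(first_source, rng.choice(groups))
        log.append("stage2")

    # Stage 3: the parallel corpus. Equal length is used here as a simple
    # stand-in check for "parallel"; that check is an assumption.
    if len(second_source) != len(personalized):
        raise ValueError("parallel groups must have the same number of items")
    train_step(second_source, personalized)
    log.append("stage3")
    return log
```

In practice `train_step` would wrap one optimisation pass of the network; here it is left as a callback so that only the ordering of the data is shown.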
-
2.
Publication No.: US11282503B2
Publication date: 2022-03-22
Application No.: US17095751
Application date: 2020-11-12
Applicant: UBTECH ROBOTICS CORP LTD
Inventor: Ruotong Wang, Dongyan Huang, Xian Li, Jiebin Xie, Zhichao Tang, Wan Ding, Yang Liu, Bai Li, Youjun Xiong
Abstract: The present disclosure discloses a voice conversion training method. The method includes: forming a first training data set including a plurality of training voice data groups; selecting two of the training voice data groups from the first training data set and inputting them into a voice conversion neural network for training; forming a second training data set including the first training data set and a first source speaker voice data group; inputting one of the training voice data groups selected from the first training data set, together with the first source speaker voice data group, into the network for training; forming a third training data set including a second source speaker voice data group and a personalized voice data group that are parallel corpora with respect to each other; and inputting the second source speaker voice data group and the personalized voice data group into the network for training.
-
3.
Publication No.: US20210201925A1
Publication date: 2021-07-01
Application No.: US17110323
Application date: 2020-12-03
Applicant: UBTECH ROBOTICS CORP LTD
Inventor: Jiebin Xie, Ruotong Wang, Dongyan Huang, Zhichao Tang, Yang Liu, Youjun Xiong
IPC: G10L21/013, G10L15/02, G10L25/03, G10L25/69, G10L15/04, G10L13/033
Abstract: The present disclosure provides a streaming voice conversion method, as well as an apparatus and a computer-readable storage medium using the same. The method includes: obtaining to-be-converted voice data; partitioning the to-be-converted voice data, in order of data acquisition time, into a plurality of to-be-converted partition voices, where each to-be-converted partition voice carries a partition mark; performing a voice conversion on each of the to-be-converted partition voices to obtain a converted partition voice, where the converted partition voice carries a partition mark; performing a partition restoration on each of the converted partition voices to obtain a restored partition voice, where the restored partition voice carries a partition mark; and outputting each of the restored partition voices according to the partition mark carried by the restored partition voice. In this manner, the response time is shortened and the conversion speed is improved.
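The partition, convert, restore, and output steps can be sketched as follows. This is an illustration under stated assumptions rather than the method claimed in the patent: the abstract does not say what partition restoration involves, so the sketch assumes each partition is padded with a few neighbouring context samples that are trimmed again after conversion. The names `Partition`, `partition`, `restore`, `stream_convert`, and the `context` parameter are hypothetical, and `convert_fn` stands in for a real conversion model that returns as many samples as it receives.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Samples = List[float]


@dataclass
class Partition:
    mark: int        # partition mark: index in acquisition order
    samples: Samples
    head: int = 0    # leading context samples (assumption, see lead-in)
    tail: int = 0    # trailing context samples (assumption)


def partition(samples: Sequence[float], size: int, context: int = 0) -> List[Partition]:
    """Split samples in time order; pad each part with neighbouring context."""
    parts = []
    for mark, start in enumerate(range(0, len(samples), size)):
        stop = min(len(samples), start + size)
        lo = max(0, start - context)
        hi = min(len(samples), stop + context)
        parts.append(Partition(mark, list(samples[lo:hi]), start - lo, hi - stop))
    return parts


def restore(part: Partition) -> Partition:
    """Assumed restoration step: drop the context padding, keep the mark."""
    end = len(part.samples) - part.tail
    return Partition(part.mark, part.samples[part.head:end])


def stream_convert(
    samples: Sequence[float],
    size: int,
    convert_fn: Callable[[Samples], Samples],
    context: int = 0,
) -> Samples:
    restored = []
    for part in partition(samples, size, context):
        converted = Partition(part.mark, convert_fn(part.samples), part.head, part.tail)
        restored.append(restore(converted))
    # Output strictly by partition mark, even if partitions finished out of order.
    output: Samples = []
    for part in sorted(restored, key=lambda p: p.mark):
        output.extend(part.samples)
    return output
```

Because every partition carries its mark through all three steps, the per-partition work could be handed to a worker pool and the results reassembled afterwards; the sequential loop here is only the simplest arrangement.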
-
4.
Publication No.: US20210193160A1
Publication date: 2021-06-24
Application No.: US17084672
Application date: 2020-10-30
Applicant: UBTECH ROBOTICS CORP LTD
Inventor: Ruotong Wang, Zhichao Tang, Dongyan Huang, Jiebin Xie, Zhiyuan Zhao, Yang Liu, Youjun Xiong
IPC: G10L21/013, G10L25/03, G10L25/27, G10L19/02, G06N20/00
Abstract: The present disclosure discloses a voice conversion method. The method includes: obtaining a to-be-converted voice, and extracting acoustic features of the to-be-converted voice; obtaining a source vector corresponding to the to-be-converted voice from a source vector pool, and selecting a target vector corresponding to a target voice from a target vector pool; inputting the acoustic features of the to-be-converted voice, the source vector corresponding to the to-be-converted voice, and the target vector corresponding to the target voice into a voice conversion model to obtain acoustic features of the target voice output by the model; and obtaining the target voice by converting the acoustic features of the target voice using a vocoder. A voice conversion apparatus and a storage medium are also provided.
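The data flow described in this abstract can be sketched as a short pipeline. Everything here is a placeholder chosen for illustration: `convert_voice` and its parameters are hypothetical names, the vector pools are modelled as plain dictionaries keyed by speaker ID, and the feature extractor, conversion model, and vocoder are passed in as callables because the abstract does not specify them.

```python
from typing import Callable, Dict, List

Vector = List[float]
Features = List[Vector]   # one feature vector per frame (assumption)
Waveform = List[float]


def convert_voice(
    waveform: Waveform,
    source_id: str,
    target_id: str,
    source_pool: Dict[str, Vector],
    target_pool: Dict[str, Vector],
    extract_features: Callable[[Waveform], Features],
    model: Callable[[Features, Vector, Vector], Features],
    vocoder: Callable[[Features], Waveform],
) -> Waveform:
    # 1. Extract acoustic features of the to-be-converted voice.
    features = extract_features(waveform)
    # 2. Look up the source vector and select the target vector.
    source_vec = source_pool[source_id]
    target_vec = target_pool[target_id]
    # 3. The model maps (features, source vector, target vector)
    #    to acoustic features of the target voice.
    target_features = model(features, source_vec, target_vec)
    # 4. The vocoder turns the target features back into audio.
    return vocoder(target_features)


# Toy stand-ins, used only to show that the pieces fit together.
def toy_extract(wave: Waveform) -> Features:
    return [[x] for x in wave]


def toy_model(feats: Features, src: Vector, tgt: Vector) -> Features:
    return [[f[0] - src[0] + tgt[0]] for f in feats]


def toy_vocoder(feats: Features) -> Waveform:
    return [f[0] for f in feats]
```

Keeping the speaker vectors in pools means a new target voice only requires a new pool entry, not a new pipeline, which is the design choice this sketch is meant to make visible.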
-
5.
Publication No.: US11996112B2
Publication date: 2024-05-28
Application No.: US17084672
Application date: 2020-10-30
Applicant: UBTECH ROBOTICS CORP LTD
Inventor: Ruotong Wang, Zhichao Tang, Dongyan Huang, Jiebin Xie, Zhiyuan Zhao, Yang Liu, Youjun Xiong
CPC classification number: G10L21/013, G06N20/00, G10L19/02, G10L25/03, G10L25/27, G10L2021/0135
Abstract: The present disclosure discloses a voice conversion method. The method includes: obtaining a to-be-converted voice, and extracting acoustic features of the to-be-converted voice; obtaining a source vector corresponding to the to-be-converted voice from a source vector pool, and selecting a target vector corresponding to a target voice from a target vector pool; inputting the acoustic features of the to-be-converted voice, the source vector corresponding to the to-be-converted voice, and the target vector corresponding to the target voice into a voice conversion model to obtain acoustic features of the target voice output by the model; and obtaining the target voice by converting the acoustic features of the target voice using a vocoder. A voice conversion apparatus and a storage medium are also provided.
-
6.
Publication No.: US11367456B2
Publication date: 2022-06-21
Application No.: US17110323
Application date: 2020-12-03
Applicant: UBTECH ROBOTICS CORP LTD
Inventor: Jiebin Xie, Ruotong Wang, Dongyan Huang, Zhichao Tang, Yang Liu, Youjun Xiong
IPC: G10L21/013, G10L13/033, G10L15/02, G10L15/04, G10L25/03, G10L25/69
Abstract: The present disclosure provides a streaming voice conversion method, as well as an apparatus and a computer-readable storage medium using the same. The method includes: obtaining to-be-converted voice data; partitioning the to-be-converted voice data, in order of data acquisition time, into a plurality of to-be-converted partition voices, where each to-be-converted partition voice carries a partition mark; performing a voice conversion on each of the to-be-converted partition voices to obtain a converted partition voice, where the converted partition voice carries a partition mark; performing a partition restoration on each of the converted partition voices to obtain a restored partition voice, where the restored partition voice carries a partition mark; and outputting each of the restored partition voices according to the partition mark carried by the restored partition voice. In this manner, the response time is shortened and the conversion speed is improved.
-