-
Publication No.: US12211485B2
Publication date: 2025-01-28
Application No.: US17820339
Filing date: 2022-08-17
Inventor: Zhengkun Gao, Junteng Zhang, Tao Sun, Lei Jia
IPC: G10L13/00, G10L13/047, G10L13/08
Abstract: The disclosure provides a speech synthesis method, and an electronic device. The technical solution is described as follows. A text to be synthesized and speech features of a target user are obtained. Predicted first acoustic features based on the text to be synthesized and the speech features are obtained. A target template audio is obtained from a template audio library based on the text to be synthesized. Second acoustic features of the target template audio are extracted. Target acoustic features are generated by splicing the first acoustic features and the second acoustic features. Speech synthesis is performed on the text to be synthesized based on the target acoustic features and the speech features, to generate a target speech of the text to be synthesized.
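Note: the splicing step in this abstract can be illustrated with a minimal sketch. It assumes acoustic features are lists of fixed-size frames and that splicing means concatenation along the time axis in text order. The abstract states neither, and `splice_features`, the segment labels, and the sample values are hypothetical.

```python
from typing import List, Tuple

Frame = List[float]  # one acoustic frame, e.g. a spectrogram column (assumed)


def splice_features(segments: List[Tuple[str, List[Frame]]]) -> List[Frame]:
    """Concatenate per-segment frames along the time axis, in text order.

    Each segment is (source_label, frames). The label only documents where
    the frames came from ("template" or "predicted"); it is hypothetical.
    """
    spliced: List[Frame] = []
    dim = None
    for source, frames in segments:
        for frame in frames:
            if dim is None:
                dim = len(frame)
            elif len(frame) != dim:
                # Frames from both sources must share one feature dimension.
                raise ValueError(f"frame size mismatch in {source!r} segment")
            spliced.append(frame)
    return spliced


# Hypothetical data: frames extracted from a template audio (second acoustic
# features) followed by frames predicted for the remaining text (first
# acoustic features).
template_part = [[0.1, 0.2], [0.3, 0.4]]
predicted_part = [[0.5, 0.6]]
target = splice_features([("template", template_part),
                          ("predicted", predicted_part)])
```

In this sketch the spliced sequence would then be passed, together with the speaker's speech features, to whatever vocoder or synthesis stage is in use.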
-
Publication No.: US20230131494A1
Publication date: 2023-04-27
Application No.: US18086004
Filing date: 2022-12-21
Inventor: Xinyong Zhou, Junteng Zhang, Tao Sun, Lei Jia
Abstract: A voice generating method and apparatus, an electronic device and a storage medium. The specific implementation solution includes: acquiring a text to be processed, and determining an associated text of the text to be processed; acquiring an associated prosodic feature of the associated text; determining an associated text feature of the associated text based on the text to be processed; determining a spectrum feature to be processed of the text to be processed based on the associated prosodic feature and the associated text feature; and generating a target voice corresponding to the text to be processed based on the spectrum feature to be processed.
-
Publication No.: US20220375453A1
Publication date: 2022-11-24
Application No.: US17875529
Filing date: 2022-07-28
Inventor: Junteng Zhang, Jianmin Wu, Tao Sun, Lei Jia
IPC: G10L13/10, G10L13/06, G10L13/047, G10L25/18, G10L13/08
Abstract: A method for speech synthesis includes obtaining text to be synthesized and an identifier of a speaker, the text being written in a first language; obtaining pronunciation information of each character in the text; generating linguistic features of the text by performing feature extraction on the pronunciation information of each character in the text based on the first language; and obtaining a target speech in a second language other than the first language, by performing speech synthesis based on the linguistic features and the identifier of the speaker.
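Note: the per-character feature extraction in this abstract can be sketched as follows. The sketch assumes the first language is Chinese and that pronunciation information is a pinyin string with a trailing tone digit. The abstract specifies neither, and `TOY_LEXICON` and `linguistic_features` are hypothetical names for illustration only.

```python
from typing import Dict, List

# Hypothetical toy lexicon: pronunciation information for each character.
TOY_LEXICON: Dict[str, str] = {"你": "ni3", "好": "hao3"}


def linguistic_features(text: str) -> List[dict]:
    """Look up each character's pronunciation and split it into features."""
    features = []
    for ch in text:
        pron = TOY_LEXICON.get(ch)
        if pron is None:
            raise KeyError(f"no pronunciation for {ch!r}")
        # Split "ni3" into the syllable "ni" and the tone number 3.
        features.append({"char": ch, "phones": pron[:-1], "tone": int(pron[-1])})
    return features


features = linguistic_features("你好")
speaker_id = 7  # hypothetical speaker identifier
# A synthesizer would receive both the features and the speaker identifier.
synth_input = {"speaker": speaker_id, "features": features}
```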
-
Publication No.: US12073822B2
Publication date: 2024-08-27
Application No.: US18086004
Filing date: 2022-12-21
Inventor: Xinyong Zhou, Junteng Zhang, Tao Sun, Lei Jia
CPC classification number: G10L13/10, G06F40/30, G10L13/06, G10L25/18, G10L13/047, G10L2013/105
Abstract: A voice generating method and apparatus, an electronic device and a storage medium. The specific implementation solution includes: acquiring a text to be processed, and determining an associated text of the text to be processed; acquiring an associated prosodic feature of the associated text; determining an associated text feature of the associated text based on the text to be processed; determining a spectrum feature to be processed of the text to be processed based on the associated prosodic feature and the associated text feature; and generating a target voice corresponding to the text to be processed based on the spectrum feature to be processed.
-
Publication No.: US11769482B2
Publication date: 2023-09-26
Application No.: US17489616
Filing date: 2021-09-29
Inventor: Wenfu Wang, Tao Sun, Xilei Wang, Junteng Zhang, Zhengkun Gao, Lei Jia
Abstract: The present disclosure provides a method and apparatus of synthesizing a speech, a method and apparatus of training a speech synthesis model, an electronic device, and a storage medium. The method of synthesizing a speech includes acquiring a style information of a speech to be synthesized, a tone information of the speech to be synthesized, and a content information of a text to be processed; generating an acoustic feature information of the text to be processed, by using a pre-trained speech synthesis model, based on the style information, the tone information, and the content information of the text to be processed; and synthesizing the speech for the text to be processed, based on the acoustic feature information of the text to be processed.
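Note: one common way to feed style, tone, and content information to a single model is to append conditioning vectors to every content vector. The sketch below shows only that input-building step. It is an assumption, not the method claimed in the patent; the embedding tables, the reading of "tone" as a per-speaker voice embedding, and `condition_content` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical embedding tables for style and tone (voice) information.
STYLE_EMB: Dict[str, List[float]] = {"news": [1.0, 0.0], "story": [0.0, 1.0]}
TONE_EMB: Dict[str, List[float]] = {"speaker_a": [0.5], "speaker_b": [-0.5]}


def condition_content(content: List[List[float]],
                      style: str, tone: str) -> List[List[float]]:
    """Append the style and tone embeddings to every content vector."""
    cond = STYLE_EMB[style] + TONE_EMB[tone]
    return [vec + cond for vec in content]


# Hypothetical content encoding: one vector per text unit.
content = [[0.1, 0.2], [0.3, 0.4]]
model_input = condition_content(content, "story", "speaker_a")
```

A pre-trained model would map `model_input` to acoustic features. That model is outside the scope of this sketch.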