METHOD OF TRAINING SPEECH SYNTHESIS MODEL AND METHOD OF SYNTHESIZING SPEECH

    Publication number: US20230178067A1

    Publication date: 2023-06-08

    Application number: US18074023

    Filing date: 2022-12-02

    CPC classification number: G10L13/047 G10L25/30

    Abstract: A method of training a speech synthesis model, a method of synthesizing speech, a device and a storage medium are provided, which relate to the field of artificial intelligence technology, in particular to the field of speech synthesis technology. The specific implementation scheme includes: processing training data by using the speech synthesis model, so as to determine a content encoding sequence, a style encoding sequence, a timbre encoding vector, a noise environment vector and a target Mel spectrum sequence corresponding to the training data; determining a total loss value according to the content encoding sequence, the style encoding sequence, the timbre encoding vector, the noise environment vector and the target Mel spectrum sequence; and adjusting a parameter of the speech synthesis model according to the total loss value.
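
The abstract does not say how the total loss value is formed from the five quantities or how the parameter is adjusted. The Python sketch below illustrates one common pattern only: a weighted sum of per-component loss terms followed by a single gradient-descent step. The component names, the weights, and the functions `total_loss` and `sgd_step` are hypothetical and are not taken from the patent.

```python
def total_loss(component_losses, weights=None):
    """Combine named loss terms into one scalar.

    Assumption: a plain weighted sum. The abstract only says the total
    loss is determined "according to" the five quantities.
    """
    if weights is None:
        weights = {name: 1.0 for name in component_losses}
    return sum(weights[name] * value for name, value in component_losses.items())


def sgd_step(param, grad, lr=0.01):
    """One gradient-descent update of a scalar parameter (illustrative only)."""
    return param - lr * grad


# Hypothetical per-component loss values for a single training batch.
losses = {"content": 0.5, "style": 0.25, "timbre": 0.125, "noise": 0.125, "mel": 1.0}
loss_value = total_loss(losses)

# A made-up gradient stands in for what automatic differentiation would supply.
updated_param = sgd_step(1.0, grad=2.0, lr=0.25)
```

In a real training loop the gradient would come from backpropagation through the model rather than from a hand-written constant.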

    SPEECH SYNTHESIS METHOD, AND ELECTRONIC DEVICE

    Publication number: US20230005466A1

    Publication date: 2023-01-05

    Application number: US17820339

    Filing date: 2022-08-17

    Abstract: The disclosure provides a speech synthesis method and an electronic device. The technical solution is as follows. A text to be synthesized and speech features of a target user are obtained. First acoustic features are predicted based on the text to be synthesized and the speech features. A target template audio is obtained from a template audio library based on the text to be synthesized. Second acoustic features of the target template audio are extracted. Target acoustic features are generated by splicing the first acoustic features and the second acoustic features. Speech synthesis is performed on the text to be synthesized based on the target acoustic features and the speech features, to generate a target speech for the text to be synthesized.
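
The splicing step can be pictured with a short Python sketch. It assumes that "splicing" means concatenating two frame sequences along the time axis and that every frame has the same feature dimension. The abstract does not fix the order or alignment of the spliced segments, so the function `splice_features` and its behavior are illustrative assumptions.

```python
def splice_features(first, second):
    """Concatenate two sequences of acoustic-feature frames in time order.

    Each frame is a list of floats. Assumption: splicing is a simple
    concatenation, with the predicted frames placed before the template frames.
    """
    frames = [list(frame) for frame in first] + [list(frame) for frame in second]
    dims = {len(frame) for frame in frames}
    if len(dims) > 1:
        raise ValueError("all frames must share one feature dimension")
    return frames


# Hypothetical 2-dimensional frames: one predicted frame, two template frames.
predicted = [[1.0, 2.0]]
template = [[3.0, 4.0], [5.0, 6.0]]
target = splice_features(predicted, template)
```

Checking the feature dimension up front keeps a mismatch between predicted and template features from surfacing later, at synthesis time.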

    METHOD OF PROCESSING AUDIO DATA, ELECTRONIC DEVICE AND STORAGE MEDIUM

    Publication number: US20230087531A1

    Publication date: 2023-03-23

    Application number: US18071187

    Filing date: 2022-11-29

    Abstract: A method of processing audio data, an electronic device, and a storage medium are provided, which relate to the field of artificial intelligence, in particular to the field of speech processing technology. The method includes: processing spectral data of the audio data to obtain first feature information; obtaining fundamental frequency indication information according to the first feature information, wherein the fundamental frequency indication information indicates valid audio data of the first feature information and invalid audio data of the first feature information; obtaining fundamental frequency information and spectral energy information according to the first feature information and the fundamental frequency indication information; and obtaining harmonic structure information of the audio data according to the fundamental frequency information and the spectral energy information.
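
As background for the last step, harmonics of a periodic sound lie at integer multiples of its fundamental frequency. The Python sketch below lists those multiples per frame and treats the indication information as a per-frame valid/invalid flag. That reading, the function `harmonic_frequencies`, and its parameters are assumptions for illustration. The sketch leaves out spectral energy and does not reproduce the patent's method.

```python
def harmonic_frequencies(f0_hz, valid, sample_rate_hz=16000, max_harmonics=None):
    """Return, for each frame, the harmonic frequencies k * f0 up to Nyquist.

    f0_hz: per-frame fundamental frequency in Hz.
    valid: per-frame flag (assumed meaning: the frame carries a usable f0).
    Frames flagged invalid, or with a non-positive f0, get an empty list.
    """
    nyquist = sample_rate_hz / 2
    result = []
    for f0, is_valid in zip(f0_hz, valid):
        if not is_valid or f0 <= 0:
            result.append([])
            continue
        count = int(nyquist // f0)  # number of harmonics at or below Nyquist
        if max_harmonics is not None:
            count = min(count, max_harmonics)
        result.append([k * f0 for k in range(1, count + 1)])
    return result


# Hypothetical three-frame input at a 1 kHz sample rate (Nyquist = 500 Hz).
harmonics = harmonic_frequencies([100.0, 0.0, 250.0], [True, False, True],
                                 sample_rate_hz=1000)
```

The second frame is flagged invalid, so it contributes no harmonics. The other two frames yield the multiples of 100 Hz and 250 Hz that do not exceed 500 Hz.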
