Method of processing audio data, electronic device and storage medium

    Publication No.: US11984134B2

    Publication date: 2024-05-14

    Application No.: US18071187

    Filing date: 2022-11-29

    CPC classification number: G10L25/18 G10L25/30

    Abstract: A method of processing audio data, an electronic device, and a storage medium, which relates to a field of artificial intelligence, in particular to a field of speech processing technology. The method includes: processing spectral data of the audio data to obtain a first feature information; obtaining a fundamental frequency indication information according to the first feature information, wherein the fundamental frequency indication information indicates valid audio data of the first feature information and invalid audio data of the first feature information; obtaining a fundamental frequency information and a spectral energy information according to the first feature information and the fundamental frequency indication information; and obtaining a harmonic structure information of the audio data according to the fundamental frequency information and the spectral energy information.
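The last two steps of the abstract above can be illustrated with a minimal sketch. It assumes that the fundamental frequency indication is a per-frame voiced/unvoiced flag and that the harmonic structure is the set of integer multiples of the fundamental frequency below the Nyquist frequency. The function names, the flag representation, and the omission of the spectral energy term are assumptions made for illustration only; the abstract does not specify them.

```python
from typing import List, Optional


def harmonic_frequencies(f0_hz: float, sample_rate: int) -> List[float]:
    """Return the integer multiples of f0_hz up to the Nyquist frequency."""
    nyquist = sample_rate / 2
    harmonics: List[float] = []
    k = 1
    while k * f0_hz <= nyquist:
        harmonics.append(k * f0_hz)
        k += 1
    return harmonics


def harmonic_structure(
    f0_track: List[float],
    voiced_flags: List[bool],
    sample_rate: int,
) -> List[Optional[List[float]]]:
    """Per frame: harmonic frequencies if the frame is valid (voiced), else None.

    voiced_flags is a hypothetical stand-in for the "fundamental frequency
    indication information" that marks valid and invalid audio data.
    """
    result: List[Optional[List[float]]] = []
    for f0, voiced in zip(f0_track, voiced_flags):
        if voiced and f0 > 0:
            result.append(harmonic_frequencies(f0, sample_rate))
        else:
            # Invalid (unvoiced) frames carry no harmonic structure here.
            result.append(None)
    return result
```

In a fuller version, each harmonic would presumably also be weighted by the frame's spectral energy; that weighting is left out because the abstract gives no formula for it.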

    METHOD AND APPARATUS FOR CONVERTING VOICE TIMBRE, METHOD AND APPARATUS FOR TRAINING MODEL, DEVICE AND MEDIUM

    Publication No.: US20230127787A1

    Publication date: 2023-04-27

    Application No.: US18145326

    Filing date: 2022-12-22

    Abstract: A method and an apparatus for converting a voice timbre, and a method for training a model. The solution includes: obtaining a target acoustic feature by encoding a sample audio using an encoding branch in a voice timbre conversion model; obtaining a target text feature by performing feature extraction on a real text sequence labeled by the sample audio; training the encoding branch based on a difference between the target acoustic feature and the target text feature; obtaining a first spectrum feature having an original timbre by decoding the target text feature using a decoding branch in the voice timbre conversion model based on the original timbre corresponding to the identification information carried in the sample audio; obtaining a second spectrum feature by performing spectrum feature extraction on the sample audio; and training the decoding branch based on a difference between the first spectrum feature and the second spectrum feature.

    Speech synthesis method, and electronic device

    Publication No.: US12211485B2

    Publication date: 2025-01-28

    Application No.: US17820339

    Filing date: 2022-08-17

    Abstract: The disclosure provides a speech synthesis method, and an electronic device. The technical solution is described as follows. A text to be synthesized and speech features of a target user are obtained. Predicted first acoustic features based on the text to be synthesized and the speech features are obtained. A target template audio is obtained from a template audio library based on the text to be synthesized. Second acoustic features of the target template audio are extracted. Target acoustic features are generated by splicing the first acoustic features and the second acoustic features. Speech synthesis is performed on the text to be synthesized based on the target acoustic features and the speech features, to generate a target speech of the text to be synthesized.
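The splicing step in the abstract above might look like the following sketch. It assumes that both sets of acoustic features are sequences of equal-width frames and that "splicing" means concatenation along the time axis; the abstract does not say which axis is used, and the name `splice_features` is hypothetical.

```python
from typing import List

Frame = List[float]


def splice_features(first: List[Frame], second: List[Frame]) -> List[Frame]:
    """Concatenate two frame sequences along the time axis.

    Assumption: every frame in both sequences has the same feature
    dimension. A mismatch between the two sequences raises ValueError.
    """
    if first and second and len(first[0]) != len(second[0]):
        raise ValueError("feature dimensions of the two sequences differ")
    # Copy each frame so the result does not alias the inputs.
    return [list(frame) for frame in first] + [list(frame) for frame in second]
```

If the method instead concatenates along the feature dimension, the check would compare frame counts rather than frame widths.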
