VOICE ACTIVITY DETECTION METHOD AND APPARATUS, AND STORAGE MEDIUM

    Publication number: US20230186943A1

    Publication date: 2023-06-15

    Application number: US17893895

    Filing date: 2022-08-23

    CPC classification number: G10L25/78 G10L25/93 G10L2025/937

    Abstract: Provided are a voice activity detection method and apparatus, an electronic device and a storage medium, which relate to the technical field of voice processing, for example, to the technical field of artificial intelligence and deep learning. The specific implementation solution is described below. A first audio signal is acquired, and a frequency domain feature of the first audio signal is extracted; and the frequency domain feature of the first audio signal is input into a voice activity detection model, and a voice presence detection result output by the voice activity detection model is obtained, where the voice activity detection model is configured to detect whether voice is present in the first audio signal.
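The flow in the abstract above (acquire a signal, extract a frequency domain feature, pass it to a detection model, read back a presence result) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the abstract does not specify the feature or the model, so `magnitude_spectrum`, `toy_vad_model`, `detect_voice`, and the threshold of 0.5 are all hypothetical stand-ins. In particular, a fixed threshold on mean spectral magnitude takes the place of the trained voice activity detection model.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2, used here as the frequency domain feature."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def toy_vad_model(feature, threshold=0.5):
    """Hypothetical stand-in for the detection model: mean magnitude above a threshold."""
    return sum(feature) / len(feature) > threshold


def detect_voice(frame, model=toy_vad_model):
    """Extract the feature from one frame and return the model's presence decision."""
    return model(magnitude_spectrum(frame))


# Assumed test signals: an all-zero frame and a pure tone in DFT bin 4.
silence = [0.0] * 64
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]

print(detect_voice(silence), detect_voice(tone))
```

Passing the model as a parameter keeps the feature extraction separate from the decision, so a learned classifier could replace `toy_vad_model` without touching the rest of the sketch.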

    VOICE NOISE REDUCTION METHOD, ELECTRONIC DEVICE, NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM

    Publication number: US20230186933A1

    Publication date: 2023-06-15

    Application number: US18077307

    Filing date: 2022-12-08

    CPC classification number: G10L21/0216 G10L25/18

    Abstract: Provided are a voice noise reduction method, an electronic device, and a non-transitory computer-readable storage medium. The specific implementation scheme includes determining a to-be-denoised voice spectrum of a to-be-denoised voice signal; performing feature extraction on the to-be-denoised voice spectrum to obtain a local voice spectral feature of the to-be-denoised voice spectrum; determining a global voice spectral feature of the to-be-denoised voice spectrum according to the local voice spectral feature of the to-be-denoised voice spectrum; and determining a masking matrix of an original voice signal in the to-be-denoised voice signal according to the local voice spectral feature and the global voice spectral feature, and determining the original voice signal according to the to-be-denoised voice spectrum and the masking matrix.
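The four steps in the abstract above (local feature, global feature derived from the local one, masking matrix from both, mask applied to the noisy spectrum) can be sketched as below. This is only an illustration under stated assumptions: the abstract does not say how the features or the mask are computed, so the moving average, the per-frame mean, and the sigmoid used here are hypothetical stand-ins for whatever learned layers the patent uses. The spectrum is assumed to be a list of frames of non-negative magnitudes.

```python
import math


def local_features(spec):
    """Stand-in local feature: 3-bin moving average along frequency within each frame."""
    out = []
    for frame in spec:
        n = len(frame)
        row = []
        for k in range(n):
            window = frame[max(0, k - 1):min(n, k + 2)]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def global_features(local):
    """Stand-in global feature: one mean value per frame, computed from the local features."""
    return [sum(frame) / len(frame) for frame in local]


def masking_matrix(local, glob):
    """Stand-in mask in (0, 1): sigmoid of local feature minus the frame's global feature."""
    return [
        [1.0 / (1.0 + math.exp(-(v - g))) for v in frame]
        for frame, g in zip(local, glob)
    ]


def denoise(spec):
    """Apply the mask element-wise to the to-be-denoised magnitude spectrum."""
    local = local_features(spec)
    mask = masking_matrix(local, global_features(local))
    return [[m * s for m, s in zip(mrow, srow)] for mrow, srow in zip(mask, spec)]


# Assumed toy spectrum: two frames of four frequency bins each.
noisy = [[0.1, 0.1, 5.0, 0.1], [0.2, 0.2, 0.2, 0.2]]
estimate = denoise(noisy)
```

Because every mask value lies strictly between 0 and 1, the estimate never exceeds the input magnitude in any bin, which is the usual behavior of a multiplicative spectral mask.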

    SPEECH ENHANCEMENT METHOD AND APPARATUS, AND STORAGE MEDIUM

    Publication number: US20230186930A1

    Publication date: 2023-06-15

    Application number: US17890638

    Filing date: 2022-08-18

    CPC classification number: G10L21/0208 G10L25/30

    Abstract: A speech enhancement method includes steps as follows. Subband decomposition processing is performed on at least two paths of target speech to obtain amplitude spectrums and phase spectrums of the at least two paths of target speech, where the at least two paths of target speech include: target mixed speech and target interference speech; a prediction probability of the target mixed speech including target clean speech in a feature domain is determined according to the amplitude spectrums of the at least two paths of target speech; and subband synthesis processing is performed according to the prediction probability and the amplitude spectrums and the phase spectrums of the at least two paths of target speech to obtain the target clean speech in the target mixed speech.
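The decompose, predict, and synthesize pipeline in the abstract above can be sketched for a single frame as follows. Several simplifications are assumptions, not details from the patent: a plain DFT stands in for the subband decomposition, an inverse DFT stands in for subband synthesis, `toy_probability` (one minus the ratio of interference amplitude to mixed amplitude) replaces the predicted probability, and only the phase of the mixed path is reused in synthesis.

```python
import cmath
import math


def dft(frame):
    """Naive DFT, used here as a stand-in for subband decomposition."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def idft(bins):
    """Naive inverse DFT (real part), used here as a stand-in for subband synthesis."""
    n = len(bins)
    return [
        (sum(b * cmath.exp(2j * math.pi * k * t / n) for k, b in enumerate(bins)) / n).real
        for t in range(n)
    ]


def amp_phase(frame):
    """Return the amplitude spectrum and phase spectrum of one frame."""
    bins = dft(frame)
    return [abs(b) for b in bins], [cmath.phase(b) for b in bins]


def toy_probability(mixed_amp, interf_amp, eps=1e-12):
    """Hypothetical stand-in for the prediction: 1 - interference/mixed, floored at 0."""
    return [max(0.0, 1.0 - i / (m + eps)) for m, i in zip(mixed_amp, interf_amp)]


def enhance(mixed, interference):
    """Weight the mixed amplitudes by the probability, reattach phase, and resynthesize."""
    mixed_amp, mixed_phase = amp_phase(mixed)
    interf_amp, _ = amp_phase(interference)
    prob = toy_probability(mixed_amp, interf_amp)
    bins = [p * a * cmath.exp(1j * ph) for p, a, ph in zip(prob, mixed_amp, mixed_phase)]
    return idft(bins)


# Assumed test signals: clean tone in bin 2, interfering tone in bin 7.
N = 32
clean = [math.sin(2 * math.pi * 2 * t / N) for t in range(N)]
interf = [math.sin(2 * math.pi * 7 * t / N) for t in range(N)]
mixed = [c + i for c, i in zip(clean, interf)]
recovered = enhance(mixed, interf)
```

In this toy case the two tones occupy different bins, so the weighting removes the interfering tone and leaves the clean one essentially unchanged. Real speech overlaps in frequency, which is why the patent predicts a probability rather than using a fixed rule like the one assumed here.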
