LOW-LATENCY NOISE SUPPRESSION
    Invention publication

    Publication No.: US20240331716A1

    Publication date: 2024-10-03

    Application No.: US18611308

    Filing date: 2024-03-20

    Abstract: A device includes one or more processors configured to obtain audio data representing one or more audio signals. The audio data includes a first segment and a second segment subsequent to the first segment. The one or more processors are configured to perform one or more transform operations on the first segment to generate frequency-domain audio data. The one or more processors are configured to provide input data based on the frequency-domain audio data as input to one or more machine-learning models to generate a noise-suppression output. The one or more processors are configured to perform one or more reverse transform operations on the noise-suppression output to generate time-domain filter coefficients. The one or more processors are configured to perform time-domain filtering of the second segment using the time-domain filter coefficients to generate a noise-suppressed output signal.
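The signal flow in this abstract (transform the first segment, derive a noise-suppression output, reverse-transform it into filter taps, filter the second segment in the time domain) can be sketched in NumPy as follows. This is a minimal illustration only, not the patented implementation: `estimate_gains` is a hypothetical Wiener-like stand-in for the machine-learning model, and the function names, the `noise_floor` parameter, the Hann windows, and the 33-tap length are all assumptions made for the example.

```python
import numpy as np


def estimate_gains(magnitude, noise_floor):
    """Hypothetical stand-in for the ML model: a Wiener-like gain in [0, 1] per bin."""
    power = magnitude ** 2
    return np.clip(1.0 - noise_floor / np.maximum(power, 1e-12), 0.0, 1.0)


def gains_to_fir(gains, num_taps):
    """Reverse transform: real per-bin gains -> symmetric (linear-phase) FIR taps."""
    impulse = np.fft.irfft(gains)               # zero-phase response centred on index 0
    impulse = np.roll(impulse, num_taps // 2)   # shift the centre so the filter is causal
    return impulse[:num_taps] * np.hanning(num_taps)  # truncate and taper


def suppress(first_segment, second_segment, noise_floor, num_taps=33):
    """Analyse the first segment, then filter the second one in the time domain."""
    windowed = first_segment * np.hanning(len(first_segment))
    spectrum = np.fft.rfft(windowed)                       # transform operation
    gains = estimate_gains(np.abs(spectrum), noise_floor)  # noise-suppression output
    taps = gains_to_fir(gains, num_taps)                   # time-domain coefficients
    # Plain convolution, trimmed to the input length (delay of num_taps // 2 samples).
    return np.convolve(second_segment, taps)[: len(second_segment)]
```

In this sketch the second segment never passes through a forward or inverse transform itself, so the only delay it sees is the `num_taps // 2` samples of the FIR filter. That is one plausible reading of why filtering in the time domain helps latency.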

    SYSTEM AND METHOD FOR MASK-BASED NEURAL BEAMFORMING FOR MULTI-CHANNEL SPEECH ENHANCEMENT

    Publication No.: US20240331715A1

    Publication date: 2024-10-03

    Application No.: US18457921

    Filing date: 2023-08-29

    IPC classification: G10L21/0224

    Abstract: A method includes receiving, during a first time window, a set of noisy audio signals from a plurality of audio input devices. The method also includes generating a noisy time-frequency representation based on the set of noisy audio signals. The method further includes providing the noisy time-frequency representation as an input to a mask estimation model trained to output a mask used to predict a clean time-frequency representation of clean speech audio from the noisy time-frequency representation. The method also includes determining beamforming filter weights based on the mask. The method further includes applying the beamforming filter weights to the noisy time-frequency representation to isolate the clean speech audio from the set of noisy audio signals. In addition, the method includes outputting the clean speech audio.
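One common way to turn a time-frequency mask into beamforming weights is a mask-weighted MVDR beamformer, sketched below in NumPy. The abstract does not say which beamformer is used, so MVDR is an assumption here, as are the function name `mvdr_from_mask`, the array layout, and the `ref_channel` and `eps` parameters. The mask is taken as given (it would come from the mask estimation model).

```python
import numpy as np


def mvdr_from_mask(noisy_stft, speech_mask, ref_channel=0, eps=1e-8):
    """Mask-based MVDR sketch.

    noisy_stft:  complex array, shape (channels, freqs, frames)
    speech_mask: real array in [0, 1], shape (freqs, frames)
    Returns (weights, enhanced) with shapes (freqs, channels) and (freqs, frames).
    """
    n_ch = noisy_stft.shape[0]
    noise_mask = 1.0 - speech_mask

    # Mask-weighted spatial covariance matrices, one (channels x channels) per frequency.
    phi_s = np.einsum("ft,cft,dft->fcd", speech_mask, noisy_stft, noisy_stft.conj())
    phi_n = np.einsum("ft,cft,dft->fcd", noise_mask, noisy_stft, noisy_stft.conj())
    phi_s /= speech_mask.sum(axis=1)[:, None, None] + eps
    phi_n /= noise_mask.sum(axis=1)[:, None, None] + eps
    phi_n += eps * np.eye(n_ch)[None]            # small diagonal loading for stability

    # MVDR weights: w = (Phi_n^-1 Phi_s) u_ref / trace(Phi_n^-1 Phi_s)
    numerator = np.linalg.solve(phi_n, phi_s)    # batched solve, shape (freqs, ch, ch)
    trace = np.trace(numerator, axis1=1, axis2=2)[:, None]
    weights = numerator[:, :, ref_channel] / (trace + eps)

    # Apply w^H x in every time-frequency bin.
    enhanced = np.einsum("fc,cft->ft", weights.conj(), noisy_stft)
    return weights, enhanced
```

In this sketch the output approximates the speech component as seen at the reference microphone. An inverse STFT (not shown) would convert `enhanced` back to a waveform.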

    AUDIO SEPARATION METHOD AND ELECTRONIC DEVICE FOR PERFORMING THE SAME

    Publication No.: US20240274147A1

    Publication date: 2024-08-15

    Application No.: US18432572

    Filing date: 2024-02-05

    Abstract: Provided is a method for separating one or more candidate audios in a sound source that includes the one or more candidate audios and a background sound, by using an audio separation system. The method includes: extracting a first audio feature from the sound source; extracting a background sound feature from the sound source, the background sound feature identifying a degree of association between the first audio feature and the background sound; generating a second audio feature based on the first audio feature, the background sound feature, and a background sound control parameter configured to control the background sound; and generating one or more separated audios based on target information corresponding to the one or more candidate audios, the first audio feature, and the second audio feature in which the background sound is adjusted.

    Method and apparatus for post-processing audio signal, storage medium, and electronic device

    Publication No.: US12002484B2

    Publication date: 2024-06-04

    Application No.: US17736797

    Filing date: 2022-05-04

    Inventor(s): Yang Yu Yu Chen

    Abstract: This application discloses a method and an apparatus for processing an audio signal. The method includes obtaining a first speech signal acquired by a first device; performing frame blocking on the first speech signal, to obtain multiple speech signal frames; converting the multiple speech signal frames into multiple first frequency domain signal frames; performing aliasing processing on a first sub-frequency domain signal frame among the multiple first frequency domain signal frames with a frequency lower than or equal to a target frequency threshold, and retaining a second sub-frequency domain signal frame among the multiple first frequency domain signal frames with a frequency higher than the target frequency threshold, to obtain multiple second frequency domain signal frames, the target frequency threshold being related to a sampling frequency of a second device; and performing frame fusion on the multiple second frequency domain signal frames, to obtain a second speech signal.

    Selection of quantization schemes for spatial audio parameter encoding

    Publication No.: US11996109B2

    Publication date: 2024-05-28

    Application No.: US18146151

    Filing date: 2022-12-23

    Inventor: Adriana Vasilache

    Abstract: There is disclosed inter alia an apparatus for spatial audio signal encoding comprising means for receiving for each time frequency block of a sub band of an audio frame a spatial audio parameter comprising an azimuth and an elevation; determining a first distortion measure for the audio frame by determining a first distance measure for each time frequency block and summing the first distance measure for each time frequency block; determining a second distortion measure for the audio frame by determining a second distance measure for each time frequency block and summing the second distance measure for each time frequency block, and selecting either the first quantization scheme or the second quantization scheme for quantising the elevation and the azimuth for all time frequency blocks of the sub band of the audio frame, wherein the selecting is dependent on the first and second distortion measures.
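The selection logic in this abstract (a per-block distance, summed into one distortion measure per scheme, then a choice between the two schemes for the whole sub band) can be sketched as below. The abstract does not define the two quantization schemes or the distance measures, so everything concrete here is an assumption: both schemes are hypothetical uniform grids with different step sizes, the distance is the great-circle angle between the original and quantized directions, and the names `angular_distance`, `quantize_uniform`, and `select_scheme` are illustrative.

```python
import numpy as np


def angular_distance(az, el, az_q, el_q):
    """Great-circle angle (radians) between original and quantized directions."""
    cos_angle = (np.sin(el) * np.sin(el_q)
                 + np.cos(el) * np.cos(el_q) * np.cos(az - az_q))
    return np.arccos(np.clip(cos_angle, -1.0, 1.0))


def quantize_uniform(az, el, az_step, el_step):
    """Round azimuth and elevation to the nearest point of a uniform grid."""
    return np.round(az / az_step) * az_step, np.round(el / el_step) * el_step


def select_scheme(az, el):
    """az, el: radians, one entry per time-frequency block of the sub band.

    Returns (scheme_index, (az_q, el_q), distortion) for the chosen scheme.
    """
    # Hypothetical scheme 1: fine azimuth grid, coarse elevation grid.
    q1 = quantize_uniform(az, el, np.radians(5.0), np.radians(30.0))
    # Hypothetical scheme 2: coarse azimuth grid, fine elevation grid.
    q2 = quantize_uniform(az, el, np.radians(20.0), np.radians(10.0))

    # Distortion measure per scheme: per-block distances summed over the sub band.
    d1 = angular_distance(az, el, *q1).sum()
    d2 = angular_distance(az, el, *q2).sum()

    # One scheme is applied to all blocks of the sub band.
    return (1, q1, d1) if d1 <= d2 else (2, q2, d2)
```

Choosing the scheme once per sub band, as the abstract describes, means only one selector needs to be signalled for all of its time-frequency blocks. The sketch simply picks the scheme with the lower summed distortion, which is one possible way for the selection to depend on the two measures.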