DYNAMIC RANGE COMPRESSION WITH REDUCED ARTIFACTS

    Publication No.: US20220322004A1

    Publication date: 2022-10-06

    Application No.: US17642211

    Filing date: 2020-09-10

    Abstract: Methods for performing dynamic range compression (DRC) on audio in a manner intended to produce output audio for playback by systems or devices with limited power handling capabilities, and preferably also to reduce or prevent undesirable artifacts (e.g., pumping and/or breathing) in the output audio. Some embodiments perform the DRC so as to maximize average loudness (while preventing loss of quieter elements) during playback, and also to reduce or prevent distortion. Other aspects are systems or devices configured to perform embodiments of the method. In some embodiments, reduced DRC is applied when the average loudness of the input audio approaches (or matches or exceeds) a target (e.g., a knee point for DRC, or a signal level near a maximum playback level of the intended playback system), since such input audio is assumed to have already been compressed; otherwise, full DRC is applied to the input audio.
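The loudness-dependent behavior described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the hard-knee compressor curve, the linear ramp, and every name and default value (`static_gain_db`, `drc_amount`, `knee_db`, `ratio`, `window_db`) are assumptions chosen only to show the idea of backing off DRC when the input's average loudness is already near the target.

```python
def static_gain_db(level_db, knee_db=-20.0, ratio=4.0):
    """Gain in dB of a plain hard-knee downward compressor (assumed curve)."""
    if level_db <= knee_db:
        return 0.0
    # Above the knee, the output level rises 1 dB for every `ratio` dB of input.
    return (knee_db + (level_db - knee_db) / ratio) - level_db


def drc_amount(avg_loudness_db, target_db=-20.0, window_db=12.0):
    """Return 1.0 for full DRC and 0.0 for none.

    The amount ramps down linearly as the average loudness approaches the
    target, on the assumption that such input was already compressed.
    The ramp shape and window width are hypothetical.
    """
    headroom_db = target_db - avg_loudness_db
    return min(1.0, max(0.0, headroom_db / window_db))


def applied_gain_db(level_db, avg_loudness_db):
    """Scale the static compressor gain by the loudness-dependent amount."""
    return drc_amount(avg_loudness_db) * static_gain_db(level_db)
```

With these assumed defaults, a peak at -8 dB inside quiet program material (average -32 dB) receives the full compressor gain, while the same peak in material whose average already sits at the target is left untouched.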

    DOUBLE TALK DETECTION USING CAPTURE UP-SAMPLING

    Publication No.: US20230115316A1

    Publication date: 2023-04-13

    Application No.: US17906415

    Filing date: 2021-03-19

    Inventor: Ning WANG

    Abstract: A method of double talk detection includes using up-sampling. Audio signals received from the far end are up-sampled prior to output by the loudspeaker at the near end. The microphone at the near end captures audio at the up-sampled rate, and the audio output by the loudspeaker is detectable because it has no energy in the up-sampled frequency bands. The double talk detector uses this information to generate a signal for suppressing the echo of the far end audio from the captured audio signal that is transmitted to the far end.
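The detection cue in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the claimed detector: it assumes a 2x up-sampling factor (so the loudspeaker signal occupies only the lower half of the captured spectrum), uses a naive DFT for clarity, and the function names and the `threshold` value are hypothetical.

```python
import cmath
import math


def band_energy(frame, lo_bin, hi_bin):
    """Sum of squared DFT magnitudes over bins lo_bin..hi_bin-1 (naive DFT)."""
    n = len(frame)
    total = 0.0
    for k in range(lo_bin, hi_bin):
        coeff = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(frame))
        total += abs(coeff) ** 2
    return total


def near_end_active(mic_frame, threshold=0.01):
    """True if the capture has energy where the up-sampled far end has none.

    Assumes 2x up-sampling: far-end echo lies below fs/4, so energy between
    fs/4 and fs/2 is attributed to the near-end talker.
    """
    n = len(mic_frame)
    low = band_energy(mic_frame, 0, n // 4)
    high = band_energy(mic_frame, n // 4, n // 2 + 1)
    return high > threshold * (low + high + 1e-12)


def double_talk(far_end_active, mic_frame):
    """Double talk: far end is playing while the near end is also talking."""
    return far_end_active and near_end_active(mic_frame)
```

For example, a 64-sample frame containing only a low-band tone (echo) is not flagged, while adding a tone in the upper half of the spectrum (near-end speech content) is.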

    MACHINE LEARNING ASSISTED SPATIAL NOISE ESTIMATION AND SUPPRESSION

    Publication No.: US20230410829A1

    Publication date: 2023-12-21

    Application No.: US18251876

    Filing date: 2021-11-04

    CPC classification number: G10L21/0232 G10L25/84 G10L25/30 G10L2021/02166

    Abstract: In an embodiment, a method comprises: receiving bands of power spectra of an input audio signal and a microphone covariance, and for each band: estimating, using a classifier, respective probabilities of speech and noise; estimating, using a directionality model, a set of means for speech and noise, or a set of means and covariances for speech and noise, based on the microphone covariance for the band and the probabilities; estimating, using a level model, a mean and covariance of noise power based on the probabilities and the power spectra; determining a first noise suppression gain based on the directionality model; determining a second noise suppression gain based on the level model; selecting the first or second noise suppression gain or their sum based on a signal-to-noise ratio of the input audio signal; and scaling a time-frequency representation of the input signal by the selected noise suppression gain.
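The final two steps of this abstract (selecting a gain based on SNR, then scaling the time-frequency representation) can be sketched as below. The abstract does not say which SNR range maps to which choice, so the thresholds and the mapping here are purely hypothetical, as are the function names and the assumption that gains are expressed in dB (which makes "their sum" a combined attenuation).

```python
def select_gain_db(g_dir_db, g_level_db, snr_db,
                   low_snr_db=0.0, high_snr_db=15.0):
    """Pick the directionality gain, the level gain, or their sum.

    Hypothetical rule: at low SNR combine both attenuations, at medium SNR
    use the directionality-based gain, at high SNR use the level-based gain.
    """
    if snr_db < low_snr_db:
        return g_dir_db + g_level_db
    if snr_db < high_snr_db:
        return g_dir_db
    return g_level_db


def apply_gain(tf_bins, gain_db):
    """Scale the time-frequency bins of one band by a gain given in dB."""
    scale = 10.0 ** (gain_db / 20.0)
    return [scale * x for x in tf_bins]


# Usage: one band with a -6 dB directionality gain and a -3 dB level gain.
band_bins = [1.0 + 0.0j, 0.5 - 0.5j]
gain_db = select_gain_db(-6.0, -3.0, snr_db=-5.0)  # low SNR: sum of both
suppressed = apply_gain(band_bins, gain_db)
```

In a full system the two input gains would come from the directionality model and the level model described above; here they are simply passed in as numbers.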
