OVER-SUPPRESSION MITIGATION FOR DEEP LEARNING BASED SPEECH ENHANCEMENT

    Publication number: US20240290341A1

    Publication date: 2024-08-29

    Application number: US18571963

    Filing date: 2022-06-28

    Abstract: A system for mitigating over-suppression of speech and other non-noise signals is disclosed. In some embodiments, a system is programmed to train a first machine learning model for speech detection or enhancement using a non-linear, asymmetric loss function that penalizes speech over-suppression more than speech under-suppression. The first machine learning model is configured to receive an audio signal and generate a mask indicating an amount of speech present in the audio signal. The mask can be adjusted to remedy sharp voice decay resulting from speech over-suppression. The system is also programmed to train a second machine learning model for laughter or applause detection. The system is further programmed to improve the quality of a new audio signal by applying an adjusted mask to the new audio signal except for the portions of the audio signal that have been identified as corresponding to laughter or applause.
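The abstract does not give the loss function itself, so the following is only a minimal sketch of one non-linear, asymmetric loss with the stated property. It assumes mask values in [0, 1], where a predicted value below the target means speech was over-suppressed. The function name, the `over_weight` factor, and the `power` exponent are illustrative assumptions, not taken from the patent.

```python
def asymmetric_mask_loss(predicted, target, over_weight=4.0, power=2.0):
    """Mean asymmetric loss between predicted and target mask values.

    Hypothetical formulation: errors where the predicted mask is lower than
    the target (speech over-suppressed) are scaled by `over_weight`, and
    `power` makes the penalty non-linear in the error magnitude.
    """
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    if not predicted:
        raise ValueError("masks must not be empty")

    total = 0.0
    for p, t in zip(predicted, target):
        err = p - t
        penalty = abs(err) ** power
        if err < 0:
            # Predicted mask below target: speech was over-suppressed.
            penalty *= over_weight
        total += penalty
    return total / len(predicted)


# The same absolute error costs more when it over-suppresses speech.
over = asymmetric_mask_loss([0.5], [0.8])
under = asymmetric_mask_loss([0.8], [0.5])
```

With these assumed defaults, an over-suppression error costs four times as much as an under-suppression error of the same size, which biases training toward preserving speech.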

    ADAPTING SIBILANCE DETECTION BASED ON DETECTING SPECIFIC SOUNDS IN AN AUDIO SIGNAL

    Publication number: US20220383889A1

    Publication date: 2022-12-01

    Application number: US17627116

    Filing date: 2020-07-16

    Abstract: A method is disclosed herein for adapting parameters of a sibilance detector. Time-frequency features are extracted from a received audio signal. Based on those time-frequency features, a determination is made of whether the audio signal includes a short-term feature or a long-term feature. In accordance with determining that the audio signal includes the short-term feature or the long-term feature, one or more parameters of a sibilance detector for detecting sibilance in the audio signal are adapted. Sibilance in the audio signal is detected using the sibilance detector with the one or more adapted parameters.
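The control flow described here (detect a feature, adapt the parameters, then run the detector) can be sketched as follows. The abstract does not say which parameters are adapted or in which direction, so everything concrete below is an assumption made for illustration: the single `threshold` parameter, the high-band energy ratio used as the sibilance test, and the choice to raise the threshold when a feature is present.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SibilanceParams:
    # Hypothetical parameter: minimum ratio of high-band energy to total
    # energy for a frame to count as sibilant.
    threshold: float = 0.6


def adapt_params(params, has_short_term, has_long_term, raise_by=0.2):
    """Return adapted parameters when either feature was detected.

    Raising the threshold is an assumed adaptation rule, not the patent's.
    """
    if has_short_term or has_long_term:
        return replace(params, threshold=min(1.0, params.threshold + raise_by))
    return params


def detect_sibilance(high_band_energy, total_energy, params):
    """Flag a frame as sibilant using a simple energy-ratio test."""
    if total_energy <= 0:
        return False
    return high_band_energy / total_energy >= params.threshold


base = SibilanceParams()
adapted = adapt_params(base, has_short_term=True, has_long_term=False)
```

In this sketch the same frame can be flagged with the base parameters and rejected with the adapted ones, which is the effect the adaptation step is meant to have on the detector.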

    CHANNEL IDENTIFICATION OF MULTI-CHANNEL AUDIO SIGNALS

    Publication number: US20220319526A1

    Publication date: 2022-10-06

    Application number: US17639286

    Filing date: 2020-08-27

    Inventors: Yanmeng Guo, Kai Li

    Abstract: A method for channel identification of a multi-channel audio signal comprising X>1 channels is provided. The method comprises the steps of: identifying, among the X channels, any empty channels, thus resulting in a subset of Y≤X non-empty channels; determining whether a low frequency effect (LFE) channel is present among the Y channels, and upon determining that an LFE channel is present, identifying the determined channel among the Y channels as the LFE channel; dividing the remaining channels among the Y channels not being identified as the LFE channel into any number of pairs of channels by matching symmetrical channels; and identifying any remaining unpaired channel among the Y channels not being identified as the LFE channel or divided into pairs as a center channel.
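The four steps of the method can be sketched in order as below. The abstract names the steps but not the tests used in each, so the concrete criteria are stand-in assumptions: a channel counts as empty when all samples are near zero, the LFE candidate is the channel with the lowest zero-crossing rate, and symmetric channels are matched greedily by normalized correlation. All function names and thresholds are hypothetical.

```python
import math


def is_empty(samples, eps=1e-6):
    """Assumed emptiness test: every sample is (near) zero."""
    return all(abs(s) < eps for s in samples)


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return crossings / max(len(samples) - 1, 1)


def correlation(a, b):
    """Normalized correlation of two equal-length channels."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def identify_channels(channels, lfe_zcr_max=0.2, pair_threshold=0.5):
    # Step 1: set aside empty channels, leaving Y <= X non-empty ones.
    empty = [i for i, ch in enumerate(channels) if is_empty(ch)]
    remaining = [i for i in range(len(channels)) if i not in empty]

    # Step 2: look for an LFE channel (stand-in heuristic: lowest ZCR).
    lfe = None
    if remaining:
        candidate = min(remaining, key=lambda i: zero_crossing_rate(channels[i]))
        if zero_crossing_rate(channels[candidate]) <= lfe_zcr_max:
            lfe = candidate
            remaining.remove(candidate)

    # Step 3: pair symmetric channels, most similar pairs first.
    scored = sorted(
        ((correlation(channels[i], channels[j]), i, j)
         for k, i in enumerate(remaining) for j in remaining[k + 1:]),
        reverse=True,
    )
    pairs, used = [], set()
    for score, i, j in scored:
        if score >= pair_threshold and i not in used and j not in used:
            pairs.append((i, j))
            used.update((i, j))

    # Step 4: any channel left unpaired is labeled as a center channel.
    center = [i for i in remaining if i not in used]
    return {"empty": empty, "lfe": lfe, "pairs": pairs, "center": center}


left = [1, -1, 1, -1, 1, -1, 1, -1]
right = [0.9, -1, 1, -0.9, 1, -1, 0.9, -1]
center = [1, 1, -1, -1, 1, 1, -1, -1]
lfe = [1, 1, 1, 1, -1, -1, -1, -1]
silent = [0.0] * 8
result = identify_channels([left, center, silent, right, lfe])
```

The order of the steps matters: removing empty channels and the LFE channel first keeps them from being matched as a spurious symmetric pair or mislabeled as a center channel.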
