METHOD AND APPARATUS FOR CONTROLLING ENHANCEMENT OF LOW-BITRATE CODED AUDIO

    公开(公告)号:US20210327445A1

    公开(公告)日:2021-10-21

    申请号:US17270053

    申请日:2019-08-29

    Abstract: Described herein is a method of low-bitrate coding of audio data and generating enhancement metadata for controlling audio enhancement of the low-bitrate coded audio data at a decoder side, including the steps of: (a) core encoding original audio data at a low bitrate to obtain encoded audio data; (b) generating enhancement metadata to be used for controlling a type and/or amount of audio enhancement at the decoder side after core decoding the encoded audio data; and (c) outputting the encoded audio data and the enhancement metadata. Described is further an encoder configured to perform said method. Described is moreover a method for generating enhanced audio data from low-bitrate coded audio data based on enhancement metadata and a decoder configured to perform said method.

    Method and apparatus for controlling enhancement of low-bitrate coded audio

    公开(公告)号:US11929085B2

    公开(公告)日:2024-03-12

    申请号:US17270053

    申请日:2019-08-29

    CPC classification number: G10L19/24

    Abstract: Described herein is a method of low-bitrate coding of audio data and generating enhancement metadata for controlling audio enhancement of the low-bitrate coded audio data at a decoder side, including the steps of: (a) core encoding original audio data at a low bitrate to obtain encoded audio data; (b) generating enhancement metadata to be used for controlling a type and/or amount of audio enhancement at the decoder side after core decoding the encoded audio data; and (c) outputting the encoded audio data and the enhancement metadata. Described is further an encoder configured to perform said method. Described is moreover a method for generating enhanced audio data from low-bitrate coded audio data based on enhancement metadata and a decoder configured to perform said method.

    OVER-SUPPRESSION MITIGATION FOR DEEP LEARNING BASED SPEECH ENHANCEMENT

    公开(公告)号:US20240290341A1

    公开(公告)日:2024-08-29

    申请号:US18571963

    申请日:2022-06-28

    Abstract: A system for mitigating over-suppression of speech and other non-noise signals is disclosed. In some embodiments, a system is programmed to train a first machine learning model for speech detection or enhancement using a non-linear, asymmetric loss function that penalizes speech over-suppression more than speech under-suppression. The first machine learning model is configured to receive an audio signal and generate a mask indicating an amount of speech present in the audio signal. The mask can be adjusted to remedy sharp voice decay resulting from speech over-suppression. The system is also programmed to train a second machine learning model for laughter or applause detection. The system is further programmed to improve the quality of a new audio signal by applying an adjusted mask to the new audio signal except for the portions of the audio signal that have been identified as corresponding to laughter or applause.

Patent Agency Ranking