HUM NOISE DETECTION AND REMOVAL FOR SPEECH AND MUSIC RECORDINGS

    Publication No.: WO2022023415A1

    Publication date: 2022-02-03

    Application No.: PCT/EP2021/071148

    Application date: 2021-07-28

    Inventor: YEH, Chunghsin

    Abstract: Described are methods of processing audio data for hum noise detection and/or removal. The audio data comprises a plurality of frames. One method includes: classifying frames of the audio data as either content frames or noise frames, using one or more content activity detectors; determining a noise spectrum from one or more frames of the audio data that are classified as noise frames; determining one or more hum noise frequencies based on the determined noise spectrum; generating an estimated hum noise signal based on the one or more hum noise frequencies; and removing hum noise from at least one frame of the audio data based on the estimated hum noise signal. Also described are apparatus for carrying out the methods, as well as corresponding programs and computer-readable storage media.
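The five steps named in the abstract can be sketched roughly as follows. This is a minimal illustration and not the method claimed in the application. It assumes equal-length, non-overlapping frames, a simple energy threshold standing in for the content activity detector, a stationary hum whose frequency falls on a DFT bin, and a least-squares sinusoid fit for the estimated hum signal. The names `remove_hum`, `dft_magnitudes`, `energy_threshold`, and `peak_factor` are hypothetical.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Normalized magnitudes of DFT bins 0..N/2-1 (naive O(N^2) transform)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))) / n
        for k in range(n // 2)
    ]


def remove_hum(frames, sample_rate, energy_threshold=0.05, peak_factor=5.0):
    """Return (cleaned_frames, hum_frequencies_hz)."""
    n = len(frames[0])
    # 1. Classify frames: low mean-square energy counts as a noise frame
    #    (an assumed stand-in for a real content activity detector).
    noise_idx = [i for i, f in enumerate(frames)
                 if sum(x * x for x in f) / n < energy_threshold]
    cleaned = [list(f) for f in frames]
    if not noise_idx:
        return cleaned, []
    # 2. Noise spectrum: mean magnitude spectrum over the noise frames.
    per_frame = [dft_magnitudes(frames[i]) for i in noise_idx]
    spectrum = [sum(col) / len(per_frame) for col in zip(*per_frame)]
    # 3. Hum frequencies: non-DC bins that stand well above the median level.
    floor = sorted(spectrum)[len(spectrum) // 2] + 1e-12
    hum_freqs = [k * sample_rate / n
                 for k in range(1, len(spectrum))
                 if spectrum[k] > peak_factor * floor]
    # 4. Estimated hum signal: fit a*cos + b*sin per frequency on noise frames.
    # 5. Removal: subtract the fitted sinusoid from every frame.
    m = len(noise_idx) * n
    for freq in hum_freqs:
        w = 2 * math.pi * freq / sample_rate
        a = 2 / m * sum(frames[i][j] * math.cos(w * (i * n + j))
                        for i in noise_idx for j in range(n))
        b = 2 / m * sum(frames[i][j] * math.sin(w * (i * n + j))
                        for i in noise_idx for j in range(n))
        for i, frame in enumerate(cleaned):
            for j in range(n):
                t = i * n + j  # absolute sample index keeps the phase continuous
                frame[j] -= a * math.cos(w * t) + b * math.sin(w * t)
    return cleaned, hum_freqs
```

Fitting the sinusoid only on noise frames keeps content energy from biasing the hum estimate, which is the apparent motivation for the classification step in the abstract.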

    ADAPTIVE NOISE ESTIMATION
    3.
    Invention application

    Publication No.: WO2022066590A1

    Publication date: 2022-03-31

    Application No.: PCT/US2021/051162

    Application date: 2021-09-21

    Abstract: In some embodiments, a method comprises: dividing, using at least one processor, an audio input into speech and non-speech segments; for each frame in each non-speech segment, estimating, using the at least one processor, a time-varying noise spectrum of the non-speech segment; for each frame in each speech segment, estimating, using the at least one processor, a speech spectrum of the speech segment; for each frame in each speech segment, identifying one or more non-speech frequency components in the speech spectrum; comparing the one or more non-speech frequency components with one or more corresponding frequency components in a plurality of estimated noise spectra; and selecting the estimated noise spectrum from the plurality of estimated noise spectra based on a result of the comparing.
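The final comparing and selecting steps can be illustrated with a short sketch. The abstract does not say how non-speech components are identified or which comparison measure is used, so this example assumes the non-speech bin indices are already known and uses a mean squared log-magnitude distance. The function name `select_noise_spectrum` and its parameters are hypothetical.

```python
import math


def select_noise_spectrum(speech_spectrum, non_speech_bins, candidates, eps=1e-12):
    """Return the index of the candidate noise spectrum closest to the speech
    frame's spectrum at the bins judged to contain no speech.

    The distance is a mean squared log-magnitude difference (an assumed
    measure, chosen so that level mismatches are compared as ratios).
    """
    if not non_speech_bins:
        raise ValueError("at least one non-speech bin is required")

    def distance(noise):
        return sum(
            (math.log(speech_spectrum[k] + eps) - math.log(noise[k] + eps)) ** 2
            for k in non_speech_bins
        ) / len(non_speech_bins)

    return min(range(len(candidates)), key=lambda i: distance(candidates[i]))
```

For instance, with a frame whose bins 0, 1, and 4 carry only background noise near 0.1, a candidate spectrum at level 0.1 would be preferred over candidates at 0.5 or 0.01.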

    AUTOMATIC LEVELING OF SPEECH CONTENT
    4.
    Invention application

    Publication No.: WO2021195429A1

    Publication date: 2021-09-30

    Application No.: PCT/US2021/024232

    Application date: 2021-03-25

    Abstract: Embodiments are disclosed for automatic leveling of speech content. In an embodiment, a method comprises: receiving, using one or more processors, frames of an audio recording including speech and non-speech content; for each frame: determining, using the one or more processors, a speech probability; analyzing, using the one or more processors, a perceptual loudness of the frame; obtaining, using the one or more processors, a target loudness range for the frame; computing, using the one or more processors, gains to apply to the frame based on the target loudness range and the perceptual loudness analysis, where the gains include dynamic gains that change frame-by-frame and that are scaled based on the speech probability; and applying the gains to the frame so that a resulting loudness range of the speech content in the audio recording fits within the target loudness range.
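The per-frame gain computation can be sketched as below. This is a simplified illustration under stated assumptions, not the application's algorithm: the perceptual loudness of each frame is taken as an already-measured value in dB, the dynamic gain simply moves the loudness to the nearest edge of the target range, and the speech probability scales that gain linearly. The names `leveling_gain_db` and `apply_gain` are hypothetical.

```python
def leveling_gain_db(loudness_db, speech_prob, target_range):
    """Dynamic gain in dB for one frame.

    Frames inside the target range get no gain; frames outside it are pulled
    to the nearest range edge. The gain is scaled by the speech probability,
    so non-speech frames (probability near 0) are left almost untouched.
    """
    low, high = target_range
    if loudness_db < low:
        gain = low - loudness_db
    elif loudness_db > high:
        gain = high - loudness_db
    else:
        gain = 0.0
    return speech_prob * gain


def apply_gain(frame, gain_db):
    """Apply a dB gain to the samples of one frame."""
    factor = 10 ** (gain_db / 20)
    return [x * factor for x in frame]
```

A quiet speech frame at -30 dB with a target range of (-25, -20) and a speech probability of 1.0 receives +5 dB, while the same frame with a speech probability of 0.5 receives +2.5 dB.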
