-
Publication No.: WO2022034139A1
Publication Date: 2022-02-17
Application No.: PCT/EP2021/072384
Filing Date: 2021-08-11
Applicant: DOLBY INTERNATIONAL AB
Inventor: YEH, Chunghsin , CENGARLE, Giulio , DE BURGH, Mark David
IPC: G10L15/04 , G10L21/0264 , G10L21/034 , G10L25/93 , G10L21/0308 , G10L25/09 , G10L25/21 , G10L25/24 , G10L25/84 , G10L21/0316
Abstract: Described is a method of performing automatic audio enhancement on an input audio signal including at least one speech-articulation noise event. The method comprises: segmenting the input audio signal into a number of audio frames; obtaining at least one feature parameter from the audio frames; and determining, based at least in part on the obtained feature parameter, a respective type of the speech-articulation noise event and a respective time-frequency range associated with the speech-articulation noise event within the input audio signal.
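The first two steps of this abstract (segmenting the signal into frames and obtaining a feature parameter per frame) can be illustrated with a minimal sketch. The abstract does not say which features are used; short-time energy and zero-crossing rate are assumptions chosen for illustration, and the names `segment_frames`, `frame_energy`, and `zero_crossing_rate` are hypothetical. Determining the type and time-frequency range of a noise event is not attempted here.

```python
from typing import List, Sequence


def segment_frames(signal: Sequence[float], frame_len: int, hop: int) -> List[List[float]]:
    """Split a signal into frames of frame_len samples, advancing by hop samples.

    A trailing partial frame is dropped.
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    return [list(signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, hop)]


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame (assumed feature, not from the abstract)."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ (assumed feature)."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)
```

A per-frame feature vector such as `(frame_energy(f), zero_crossing_rate(f))` could then feed whatever classifier decides the event type; that classifier is outside the scope of this sketch.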
-
Publication No.: WO2022023415A1
Publication Date: 2022-02-03
Application No.: PCT/EP2021/071148
Filing Date: 2021-07-28
Applicant: DOLBY INTERNATIONAL AB
Inventor: YEH, Chunghsin
IPC: G10L21/0216 , G10L25/78 , G10L21/0232 , G10L21/0208
Abstract: Described are methods of processing audio data for hum noise detection and/or removal. The audio data comprises a plurality of frames. One method includes: classifying frames of the audio data as either content frames or noise frames, using one or more content activity detectors; determining a noise spectrum from one or more frames of the audio data that are classified as noise frames; determining one or more hum noise frequencies based on the determined noise spectrum; generating an estimated hum noise signal based on the one or more hum noise frequencies; and removing hum noise from at least one frame of the audio data based on the estimated hum noise signal. Also described are apparatus for carrying out the methods, as well as corresponding programs and computer-readable storage media.
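The five steps listed in this abstract can be sketched end to end as follows. This is a simplified illustration under stated assumptions, not the method claimed in the application: the content activity detector is replaced by a plain energy threshold, only a single hum frequency is estimated, the hum is assumed to fall exactly on a DFT bin, and frames are assumed to have equal, even length of at least 4 samples. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def dft_magnitudes(frame: Sequence[float]) -> List[float]:
    """Magnitudes of DFT bins 0..N/2, computed naively in O(N^2)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        mags.append(math.hypot(re, im))
    return mags


def is_noise_frame(frame: Sequence[float], energy_threshold: float) -> bool:
    """Assumed stand-in for a content activity detector: low mean energy means noise."""
    return sum(x * x for x in frame) / len(frame) < energy_threshold


def estimate_hum_bin(frames: Sequence[Sequence[float]], energy_threshold: float) -> int:
    """Average the spectra of the noise frames and return the strongest bin.

    DC and the last (Nyquist) bin are excluded from the search.
    """
    noise_frames = [f for f in frames if is_noise_frame(f, energy_threshold)]
    if not noise_frames:
        raise ValueError("no frames were classified as noise")
    spectra = [dft_magnitudes(f) for f in noise_frames]
    avg = [sum(col) / len(col) for col in zip(*spectra)]
    return max(range(1, len(avg) - 1), key=lambda k: avg[k])


def bin_to_hz(k: int, frame_len: int, sample_rate: float) -> float:
    """Convert a DFT bin index to a frequency in Hz."""
    return k * sample_rate / frame_len


def remove_hum(frame: Sequence[float], k: int) -> List[float]:
    """Fit a sinusoid at bin k to the frame and subtract it.

    The fitted sinusoid plays the role of the estimated hum noise signal.
    """
    n = len(frame)
    cos_k = [math.cos(2 * math.pi * k * i / n) for i in range(n)]
    sin_k = [math.sin(2 * math.pi * k * i / n) for i in range(n)]
    a = 2.0 / n * sum(x * c for x, c in zip(frame, cos_k))
    b = 2.0 / n * sum(x * s for x, s in zip(frame, sin_k))
    return [x - (a * c + b * s) for x, c, s in zip(frame, cos_k, sin_k)]
```

Note that subtracting the fitted sinusoid also removes any wanted content at exactly that frequency; a practical system would need to handle harmonics, off-bin frequencies, and content overlap, none of which this sketch addresses.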
-
Publication No.: WO2022066590A1
Publication Date: 2022-03-31
Application No.: PCT/US2021/051162
Filing Date: 2021-09-21
Inventor: SCAINI, Davide , YEH, Chunghsin , CENGARLE, Giulio , DE BURGH, Mark David
IPC: G10L21/0232 , G10L25/78
Abstract: In some embodiments, a method comprises: dividing, using at least one processor, an audio input into speech and non-speech segments; for each frame in each non-speech segment, estimating, using the at least one processor, a time-varying noise spectrum of the non-speech segment; for each frame in each speech segment, estimating, using the at least one processor, a speech spectrum of the speech segment; for each frame in each speech segment, identifying one or more non-speech frequency components in the speech spectrum; comparing the one or more non-speech frequency components with one or more corresponding frequency components in a plurality of estimated noise spectra; and selecting the estimated noise spectrum from the plurality of estimated noise spectra based on a result of the comparing.
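The final compare-and-select step can be illustrated with a small sketch. The abstract does not state how the comparison is scored; a sum of squared differences over the non-speech bins is an assumption made here for illustration, and the function name `select_noise_spectrum` is hypothetical. Spectra are represented as plain lists of per-bin magnitudes.

```python
from typing import Sequence


def select_noise_spectrum(
    speech_spectrum: Sequence[float],
    non_speech_bins: Sequence[int],
    candidate_noise_spectra: Sequence[Sequence[float]],
) -> int:
    """Return the index of the candidate noise spectrum that best matches the
    speech-frame spectrum at the bins identified as non-speech.

    The squared-difference score is an assumed comparison, not one taken from
    the application.
    """
    if not candidate_noise_spectra:
        raise ValueError("at least one candidate noise spectrum is required")

    def mismatch(noise: Sequence[float]) -> float:
        return sum((speech_spectrum[k] - noise[k]) ** 2 for k in non_speech_bins)

    return min(range(len(candidate_noise_spectra)),
               key=lambda i: mismatch(candidate_noise_spectra[i]))
```

For example, with `speech_spectrum = [0.2, 5.0, 0.3, 4.0, 0.1]` and `non_speech_bins = [0, 2, 4]`, a candidate whose values at bins 0, 2, and 4 are closest to 0.2, 0.3, and 0.1 is selected, regardless of its values at the speech-dominated bins 1 and 3.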
-
Publication No.: WO2021195429A1
Publication Date: 2021-09-30
Application No.: PCT/US2021/024232
Filing Date: 2021-03-25
Inventor: YEH, Chunghsin , CENGARLE, Giulio , DE BURGH, Mark David
Abstract: Embodiments are disclosed for automatic leveling of speech content. In an embodiment, a method comprises: receiving, using one or more processors, frames of an audio recording including speech and non-speech content; for each frame: determining, using the one or more processors, a speech probability; analyzing, using the one or more processors, a perceptual loudness of the frame; obtaining, using the one or more processors, a target loudness range for the frame; computing, using the one or more processors, gains to apply to the frame based on the target loudness range and the perceptual loudness analysis, where the gains include dynamic gains that change frame-by-frame and that are scaled based on the speech probability; and applying the gains to the frame so that a resulting loudness range of the speech content in the audio recording fits within the target loudness range.
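The gain computation described in this abstract can be sketched per frame as follows. This is a minimal illustration under assumptions: the frame's perceptual loudness is taken as already measured and expressed in dB-like units, the dynamic gain is the smallest change that moves the loudness into the target range, and the speech probability scales that gain linearly in dB. The function names `frame_gain_db` and `apply_gain` are hypothetical.

```python
from typing import List, Sequence


def frame_gain_db(
    loudness_db: float,
    target_low_db: float,
    target_high_db: float,
    speech_prob: float,
) -> float:
    """Gain in dB for one frame, scaled by the frame's speech probability.

    Frames already inside the target range get 0 dB; frames with low speech
    probability are changed proportionally less.
    """
    if not 0.0 <= speech_prob <= 1.0:
        raise ValueError("speech_prob must be in [0, 1]")
    if target_low_db > target_high_db:
        raise ValueError("target_low_db must not exceed target_high_db")
    if loudness_db < target_low_db:
        needed = target_low_db - loudness_db    # boost quiet frames
    elif loudness_db > target_high_db:
        needed = target_high_db - loudness_db   # attenuate loud frames
    else:
        needed = 0.0
    return speech_prob * needed


def apply_gain(frame: Sequence[float], gain_db: float) -> List[float]:
    """Apply a dB gain to the samples of one frame."""
    factor = 10.0 ** (gain_db / 20.0)
    return [x * factor for x in frame]
```

Under these assumptions, a frame measured at -30 with a target range of -23 to -18 and a speech probability of 1.0 would receive +7 dB, while a non-speech frame (probability 0.0) would be left unchanged. A practical leveler would also smooth gains across frames to avoid audible pumping, which this sketch omits.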
-