-
Publication No.: US20220223144A1
Publication date: 2022-07-14
Application No.: US17611121
Filing date: 2020-05-13
Inventors: Jundai SUN, Zhiwei SHUANG, Lie LU, Shaofan YANG, Jia DAI
Abstract: Described herein is a method for Convolutional Neural Network (CNN) based speech source separation, wherein the method includes the steps of: (a) providing multiple frames of a time-frequency transform of an original noisy speech signal; (b) inputting the time-frequency transform of said multiple frames into an aggregated multi-scale CNN having a plurality of parallel convolution paths; (c) extracting and outputting, by each parallel convolution path, features from the input time-frequency transform of said multiple frames; (d) obtaining an aggregated output of the outputs of the parallel convolution paths; and (e) generating an output mask for extracting speech from the original noisy speech signal based on the aggregated output. Also described herein are an apparatus for CNN based speech source separation and a corresponding computer program product comprising a computer-readable storage medium with instructions adapted to carry out said method when executed by a device having processing capability.
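The data flow of steps (b) to (e) can be illustrated with a toy sketch. This is not the patented network: the paths below use fixed, untrained box filters, the kernel sizes (1, 3, 5), the averaging aggregation, and the sigmoid mask are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def conv_same(x, size):
    """Box-filter x (frames x bins) with a size x size kernel and zero padding.

    Stands in for one convolution path; a real path would use learned weights.
    `size` is assumed to be odd.
    """
    rows, cols = len(x), len(x[0])
    half = size // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += x[rr][cc]
            out[r][c] = total / (size * size)
    return out


def multi_scale_mask(spec, kernel_sizes=(1, 3, 5)):
    """Run parallel paths at several scales, average them, and emit a mask in (0, 1)."""
    paths = [conv_same(spec, k) for k in kernel_sizes]  # step (c): one output per path
    rows, cols = len(spec), len(spec[0])
    aggregated = [
        [sum(p[r][c] for p in paths) / len(paths) for c in range(cols)]
        for r in range(rows)
    ]  # step (d): aggregate by averaging (an assumption)
    # step (e): squash the aggregated features into a per-bin mask
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in aggregated]


def apply_mask(spec, mask):
    """Multiply each time-frequency bin by its mask value."""
    return [[s * m for s, m in zip(srow, mrow)] for srow, mrow in zip(spec, mask)]
```

Calling `apply_mask(spec, multi_scale_mask(spec))` on a magnitude spectrogram stored as a list of frames returns an attenuated spectrogram of the same shape.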
-
Publication No.: US20210056984A1
Publication date: 2021-02-25
Application No.: US17050786
Filing date: 2019-04-24
Inventors: Chunmao ZHANG, Lianwu CHEN, Ziyu YANG, Joshua Brandon LANDO, David Matthew FISCHER, Lie LU
Abstract: An apparatus and method of blind detection of binauralized audio. If the input content is detected as binaural, a second binauralization may be avoided. In this manner, the user experience avoids audio artifacts introduced by multiple binauralizations.
-
Publication No.: US20190222951A1
Publication date: 2019-07-18
Application No.: US16368574
Filing date: 2019-03-28
Inventors: Alan J. SEEFELDT, Lie LU, Chen ZHANG
Abstract: An audio processing system and method that calculate, based on spatial metadata of each audio object, a panning coefficient for each of a plurality of audio objects in relation to each of a plurality of predefined channel coverage zones. The audio signal is converted into submixes in relation to the predefined channel coverage zones based on the calculated panning coefficients and the audio objects, each submix indicating a sum of components of the plurality of audio objects in relation to one of the predefined channel coverage zones. A submix gain is generated by applying audio processing to each submix, and an object gain applied to each audio object is controlled, the object gain being a function of the panning coefficients for that audio object and the submix gains in relation to each of the predefined channel coverage zones.
-
Publication No.: US20180295464A1
Publication date: 2018-10-11
Application No.: US16009164
Filing date: 2018-06-14
IPC classification: H04S7/00, G10L19/20, G10L19/018, G10L19/008, G10L19/00, H04S3/00
Abstract: Diffuse or spatially large audio objects may be identified for special processing. A decorrelation process may be performed on audio signals corresponding to the large audio objects to produce decorrelated large audio object audio signals. These decorrelated large audio object audio signals may be associated with object locations, which may be stationary or time-varying locations. For example, the decorrelated large audio object audio signals may be rendered to virtual or actual speaker locations. The output of such a rendering process may be input to a scene simplification process. The decorrelation, associating and/or scene simplification processes may be performed prior to a process of encoding the audio data.
-
Publication No.: US20180262856A1
Publication date: 2018-09-13
Application No.: US15538892
Filing date: 2016-02-09
Inventors: Jun WANG, Lie LU, Lianwu CHEN, Mingqing HU
Abstract: Example embodiments disclosed herein relate to upmixing of audio signals. A method of upmixing an audio signal is described. The method includes decomposing the audio signal into a diffuse signal and a direct signal; generating an audio bed at least in part based on the diffuse signal, the audio bed including a height channel; extracting an audio object from the direct signal; estimating metadata of the audio object, the metadata including height information of the audio object; and rendering the audio bed and the audio object as an upmixed audio signal, wherein the audio bed is rendered to a predefined position and the audio object is rendered according to the metadata. A corresponding system and computer program product are described as well.
-
Publication No.: US20180152803A1
Publication date: 2018-05-31
Application No.: US15577510
Filing date: 2016-05-26
Inventors: Alan J. SEEFELDT, Lie LU, Chen ZHANG
CPC classification: H04S7/302, G10L19/008, H04S3/008, H04S7/30, H04S2400/11
Abstract: An audio processing system and method that calculate, based on spatial metadata of each audio object, a panning coefficient for each of a plurality of audio objects in relation to each of a plurality of predefined channel coverage zones. The audio signal is converted into submixes in relation to the predefined channel coverage zones based on the calculated panning coefficients and the audio objects, each submix indicating a sum of components of the plurality of audio objects in relation to one of the predefined channel coverage zones. A submix gain is generated by applying audio processing to each submix, and an object gain applied to each audio object is controlled, the object gain being a function of the panning coefficients for that audio object and the submix gains in relation to each of the predefined channel coverage zones.
-
Publication No.: US20180144759A1
Publication date: 2018-05-24
Application No.: US15572067
Filing date: 2016-05-12
Inventors: Lie LU, Mingqing HU
IPC classification: G10L21/0308, G10L25/18, G10L19/008
CPC classification: G10L21/0308, G10L19/008, G10L21/0264, G10L21/0272, G10L25/18
Abstract: Example embodiments disclosed herein relate to audio source separation with source direction determined based on iterative weighted component analysis. A method of separating audio sources in audio content is disclosed. The audio content includes a plurality of channels. The method includes obtaining multiple data samples from multiple time-frequency tiles of the audio content. The method also includes analyzing the data samples to generate multiple components in a plurality of iterations, wherein each of the components indicates a direction with a variance of the data samples, and wherein in each of the plurality of iterations, each of the data samples is weighted with a weight that is determined based on a selected component from the multiple components. The method further includes determining a source direction of the audio content based on the selected component for separating an audio source from the audio content. A corresponding system and computer program product for separating audio sources in audio content are also disclosed.
-
Publication No.: US20170230024A1
Publication date: 2017-08-10
Application No.: US15433486
Filing date: 2017-02-15
Inventors: Lie LU, Jun WANG, Alan J. SEEFELDT, Mingqing HU
Abstract: An equalizer controller and controlling method are disclosed. In one embodiment, an equalizer controller includes an audio classifier for identifying the audio type of an audio signal in real time; and an adjusting unit for adjusting an equalizer in a continuous manner based on the confidence value of the audio type as identified.
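One way to read "adjusting an equalizer in a continuous manner based on the confidence value" is sketched below. It is only an illustration under assumptions not stated in the abstract: the preset curve is scaled linearly by the classifier confidence, and a one-pole smoother avoids abrupt gain jumps. The function name, the `smoothing` parameter, and the band layout are hypothetical.

```python
def blend_equalizer(preset_gains_db, confidence, previous_gains_db=None, smoothing=0.8):
    """Scale a preset EQ curve by classifier confidence, then smooth over time.

    preset_gains_db: per-band gains (dB) desired when the audio type is certain.
    confidence: classifier confidence in [0, 1] for that audio type.
    previous_gains_db: gains applied in the previous block, or None on the first call.
    smoothing: weight of the previous gains (assumed value, not from the source).
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    # Confidence 0 gives a flat response; confidence 1 gives the full preset.
    target = [confidence * g for g in preset_gains_db]
    if previous_gains_db is None:
        return target
    # Move only part of the way toward the target so the gains change gradually.
    return [
        smoothing * prev + (1.0 - smoothing) * tgt
        for prev, tgt in zip(previous_gains_db, target)
    ]
```

Feeding each returned gain list back in as `previous_gains_db` on the next block makes the equalizer drift toward the preset while confidence stays high and relax toward flat when it drops.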
-
Publication No.: US20160150343A1
Publication date: 2016-05-26
Application No.: US14900117
Filing date: 2014-06-17
Inventors: Jun WANG, Lie LU, Mingqing HU, Dirk Jeroen BREEBAART, Nicolas R. TSINGOS
IPC classification: H04S7/00, G10L19/008, G10L19/02
CPC classification: H04S7/30, G10L19/008, G10L19/0204, G10L19/20, G10L21/0272, H04S3/002, H04S5/005, H04S2400/11, H04S2400/13, H04S2400/15, H04S2420/07
Abstract: Embodiments of the present invention relate to adaptive audio content generation. Specifically, a method for generating adaptive audio content is provided. The method comprises extracting at least one audio object from channel-based source audio content, and generating the adaptive audio content at least partially based on the at least one audio object. A corresponding system and computer program product are also disclosed.
-
Publication No.: US20230215423A1
Publication date: 2023-07-06
Application No.: US17921564
Filing date: 2021-05-03
IPC classification: G10L15/08, G10L21/0272
CPC classification: G10L15/08, G10L21/0272
Abstract: Computer-implemented methods and devices for combined audio separation and classification are provided. An estimated separated signal is time gated based on a determination of an audio classifier of, at least in part, the original mix of signals before separation. Combined separation, classification, and time gating of both the estimated signal and a residual signal are also provided.
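The time-gating idea can be sketched as follows. This is a minimal illustration, assuming the classifier emits one probability per frame of the original mix and that gating is a hard on/off decision at a fixed threshold; the function name, the frame layout, and the threshold value are hypothetical and not taken from the application.

```python
def time_gate(separated_frames, mix_class_probs, threshold=0.5):
    """Mute frames of a separated signal where the classifier rejects the original mix.

    separated_frames: list of frames, each a list of samples from the separator.
    mix_class_probs: per-frame probability that the target class is present,
        computed by a classifier on the original mix (assumed to be given).
    threshold: probability below which a frame is gated off (assumed value).
    """
    if len(separated_frames) != len(mix_class_probs):
        raise ValueError("need one classifier probability per frame")
    gated = []
    for frame, prob in zip(separated_frames, mix_class_probs):
        gain = 1.0 if prob >= threshold else 0.0  # hard gate; a soft gate is also possible
        gated.append([gain * sample for sample in frame])
    return gated
```

Because the decision comes from the mix rather than from the separator output, separation artifacts in frames where the target is absent are silenced instead of passed through.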