-
1.
Publication number: US12143797B2
Publication date: 2024-11-12
Application number: US18309145
Application date: 2023-04-28
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Louis D. Fielder, Zhiwei Shuang, Grant A. Davidson, Xiguang Zheng, Mark S. Vinton
Abstract: The present disclosure relates to reverberation generation for headphone virtualization. A method of generating one or more components of a binaural room impulse response (BRIR) for headphone virtualization is described. In the method, directionally-controlled reflections are generated, wherein directionally-controlled reflections impart a desired perceptual cue to an audio input signal corresponding to a sound source location. Then at least the generated reflections are combined to obtain the one or more components of the BRIR. Corresponding system and computer program products are described as well.
-
2.
Publication number: US12073828B2
Publication date: 2024-08-27
Application number: US17611121
Application date: 2020-05-13
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Jundai Sun, Zhiwei Shuang, Lie Lu, Shaofan Yang, Jia Dai
Abstract: Described herein is a method for Convolutional Neural Network (CNN) based speech source separation, wherein the method includes the steps of: (a) providing multiple frames of a time-frequency transform of an original noisy speech signal; (b) inputting the time-frequency transform of said multiple frames into an aggregated multi-scale CNN having a plurality of parallel convolution paths; (c) extracting and outputting, by each parallel convolution path, features from the input time-frequency transform of said multiple frames; (d) obtaining an aggregated output of the outputs of the parallel convolution paths; and (e) generating an output mask for extracting speech from the original noisy speech signal based on the aggregated output. Described herein are further an apparatus for CNN based speech source separation as well as a respective computer program product comprising a computer-readable storage medium with instructions adapted to carry out said method when executed by a device having processing capability.
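Steps (b) through (e) of this abstract can be illustrated with a minimal sketch. It is not the patented network: it assumes untrained, hand-picked kernels, convolves each frame along the frequency axis only, aggregates the parallel paths by averaging, and uses a sigmoid to form the mask. The names `conv1d_same`, `aggregated_mask`, and `apply_mask` are hypothetical.

```python
import math


def conv1d_same(x, kernel):
    """Zero-padded 'same' 1-D correlation; kernel length is assumed odd."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(x):
                acc += w * x[j]
        out.append(acc)
    return out


def aggregated_mask(frames, kernels):
    """frames: list of magnitude spectra; kernels: one kernel per parallel path."""
    masks = []
    for frame in frames:
        # (c) each parallel path extracts features at its own scale
        path_outputs = [conv1d_same(frame, k) for k in kernels]
        # (d) aggregate the path outputs (averaging is an assumption)
        aggregated = [sum(vals) / len(vals) for vals in zip(*path_outputs)]
        # (e) squash to (0, 1) so the result can act as a mask
        masks.append([1.0 / (1.0 + math.exp(-v)) for v in aggregated])
    return masks


def apply_mask(frames, masks):
    """Element-wise masking of the noisy spectra."""
    return [[m * v for m, v in zip(mask, frame)]
            for mask, frame in zip(masks, frames)]
```

Kernels of different lengths stand in for the different receptive fields of the parallel convolution paths; a real system would learn them from data.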
-
3.
Publication number: US20230386500A1
Publication date: 2023-11-30
Application number: US18032325
Application date: 2021-10-19
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: Jundai Sun, Lie Lu, Zhiwei Shuang
IPC: G10L25/30, G06N3/0464, G10L25/84
CPC classification number: G10L25/30, G06N3/0464, G10L25/84
Abstract: Systems, methods, and computer program products for audio processing based on convolutional neural network (CNN) are described. The CNN architecture may comprise a multi-scale input block and a multi-scale nested block. The multi-scale input block may be configured to receive input data and to generate a first downsampled input data set by downsampling the input data. The multi-scale nested block may comprise a first encoding layer configured to generate a first encoded data set by performing a convolution based on the input data. The multi-scale nested block may comprise a second encoding layer configured to generate a second encoded data set by performing a convolution based on the first downsampled input data set. Furthermore, the multi-scale nested block may comprise a first convolutional layer configured to generate a first output data set by upsampling the second encoded data set, concatenating the first encoded data set and the upsampled second encoded data set, and performing a convolution. The first convolutional layer may be nested between the encoding layers and decoding layers, thereby increasing the number of communication channels with the CNN and simplifying the underlying optimization problem.
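The data flow through the nested layer described in this abstract (downsample, encode at two scales, upsample, concatenate, convolve) can be sketched in one dimension. This is an illustration under assumptions, not the claimed architecture: downsampling is taken as pair averaging, upsampling as sample repetition, and the final convolution as a 1x1 mix over the two concatenated channels. All function names and weights are hypothetical.

```python
def conv1d_same(x, kernel):
    """Zero-padded 'same' 1-D correlation; kernel length is assumed odd."""
    half = len(kernel) // 2
    return [
        sum(w * x[i + k - half]
            for k, w in enumerate(kernel)
            if 0 <= i + k - half < len(x))
        for i in range(len(x))
    ]


def downsample2(x):
    """Halve the resolution by averaging neighbouring pairs (assumption)."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x) - 1, 2)]


def upsample2(x, length):
    """Double the resolution by repetition, then pad or trim to `length`."""
    out = []
    for v in x:
        out.extend([v, v])
    while len(out) < length:
        out.append(out[-1])
    return out[:length]


def nested_block(x, k_enc1, k_enc2, w_mix):
    x_down = downsample2(x)              # multi-scale input block
    e1 = conv1d_same(x, k_enc1)          # first encoding layer
    e2 = conv1d_same(x_down, k_enc2)     # second encoding layer
    e2_up = upsample2(e2, len(e1))       # upsample the second encoded set
    concat = list(zip(e1, e2_up))        # concatenate along the channel axis
    # 1x1 convolution over the two concatenated channels
    return [w_mix[0] * a + w_mix[1] * b for a, b in concat]
```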
-
4.
Publication number: US10560393B2
Publication date: 2020-02-11
Application number: US14654346
Application date: 2013-12-19
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: Xuejing Sun, Zhiwei Shuang
IPC: H04L12/875, H04L12/841
Abstract: Apparatus and methods for controlling a jitter buffer are described. In one embodiment, the apparatus for controlling a jitter buffer includes an inter-talkspurt delay jitter estimator for estimating an offset value of the delay of a first frame in the current talkspurt with respect to the delay of a latest anchor frame in a previous talkspurt, and a jitter buffer controller for adjusting a length of the jitter buffer based on a long term length of the jitter buffer for each frame and the offset value.
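The two components named in this abstract can be sketched as two small functions. The abstract does not say how the long-term length and the offset are combined, so the sketch assumes a simple sum clamped to a configurable range; the function names, the sign convention, and the `min_ms`/`max_ms` bounds are all hypothetical.

```python
def inter_talkspurt_offset(first_frame_delay_ms, anchor_frame_delay_ms):
    """Offset of the first frame of the current talkspurt relative to the
    latest anchor frame of the previous talkspurt (sign is an assumption)."""
    return first_frame_delay_ms - anchor_frame_delay_ms


def adjust_buffer_length(long_term_length_ms, offset_ms,
                         min_ms=0.0, max_ms=500.0):
    """Combine the long-term length with the offset (assumed additive)
    and clamp the result to [min_ms, max_ms]."""
    target = long_term_length_ms + offset_ms
    return max(min_ms, min(max_ms, target))
```

For example, with a long-term length of 100 ms and a talkspurt that starts 20 ms later than the anchor frame, this sketch yields a 120 ms buffer.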
-
5.
Publication number: US20240355348A1
Publication date: 2024-10-24
Application number: US18685656
Application date: 2022-08-23
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: Ziyu Yang, Zhiwei Shuang, Lie Lu
IPC: G10L21/0216, G10L21/0264
CPC classification number: G10L21/0216, G10L21/0264
Abstract: A method of audio processing includes classifying an audio signal as noise or as non-noise using a first model. For a noise signal, the audio signal is classified as user-generated content (UGC) noise or as professionally-generated content (PGC) noise using a second model. For a non-noise signal or PGC noise, the audio signal is processed using a first audio processing process. For UGC noise, the audio signal is processed using a second audio processing process.
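The routing logic in this abstract reduces to a two-stage decision. The sketch below only shows that control flow; the two models and the two processing paths are passed in as callables, and every name (`route_audio`, `is_noise`, `is_ugc_noise`, `first_process`, `second_process`) is a placeholder rather than anything defined by the patent.

```python
def route_audio(signal, is_noise, is_ugc_noise, first_process, second_process):
    """Send UGC noise to the second process and everything else
    (non-noise or PGC noise) to the first process."""
    # The second model is consulted only when the first model reports noise,
    # because `and` short-circuits.
    if is_noise(signal) and is_ugc_noise(signal):
        return second_process(signal)
    return first_process(signal)
```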
-
6.
Publication number: US20240161766A1
Publication date: 2024-05-16
Application number: US18282311
Application date: 2022-03-17
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: Jundai Sun, Lie Lu, Zhiwei Shuang
IPC: G10L21/0232, G10L25/30
CPC classification number: G10L21/0232, G10L25/30
Abstract: Described is a method of processing an audio signal. The method includes a first step for applying enhancement to a first component of the audio signal and/or applying suppression to a second component of the audio signal relative to the first component, and a second step of modifying an output of the first step by applying a deep learning based model to the output of the first step, for perceptually improving the first component of the audio signal. Also described is an apparatus for carrying out the method, as well as corresponding programs and computer-readable storage media.
-
7.
Publication number: US20230328469A1
Publication date: 2023-10-12
Application number: US18309145
Application date: 2023-04-28
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Louis D. Fielder, Zhiwei Shuang, Grant A. Davidson, Xiguang Zheng, Mark S. Vinton
CPC classification number: H04S3/004, H04S7/302, G10K15/08, H04S2420/01, H04S7/304, H04S2400/01, H04S5/005
Abstract: The present disclosure relates to reverberation generation for headphone virtualization. A method of generating one or more components of a binaural room impulse response (BRIR) for headphone virtualization is described. In the method, directionally-controlled reflections are generated, wherein directionally-controlled reflections impart a desired perceptual cue to an audio input signal corresponding to a sound source location. Then at least the generated reflections are combined to obtain the one or more components of the BRIR. Corresponding system and computer program products are described as well.
-
8.
Publication number: US10425763B2
Publication date: 2019-09-24
Application number: US15109541
Application date: 2014-12-18
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Kuan-Chieh Yen, Dirk Jeroen Breebaart, Grant A. Davidson, Rhonda Wilson, David M. Cooper, Zhiwei Shuang
IPC: H04S7/00, H04S3/00, G10L19/008
Abstract: In some embodiments, virtualization methods for generating a binaural signal in response to channels of a multi-channel audio signal, which apply a binaural room impulse response (BRIR) to each channel including by using at least one feed-back delay network (FDN) to apply a common late reverberation to a downmix of the channels. In some embodiments, input signal channels are processed in a first processing path to apply to each channel a direct response and early reflection portion of a single-channel BRIR for the channel, and the downmix of the channels is processed in a second processing path including at least one FDN which applies the common late reverberation. Typically, the common late reverberation emulates collective macro attributes of late reverberation portions of at least some of the single-channel BRIRs. Other aspects are headphone virtualizers configured to perform any embodiment of the method.
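The two processing paths in this abstract can be sketched for a single ear. This is a simplified illustration, not the patented virtualizer: it assumes a four-line feedback delay network with a normalized Hadamard feedback matrix, an averaging downmix, and short FIR filters standing in for the direct and early-reflection part of each channel's BRIR. The class and function names, delay lengths, and gain are hypothetical.

```python
class FeedbackDelayNetwork:
    """Four delay lines mixed through an orthogonal (Hadamard / 2) matrix."""

    def __init__(self, delays=(149, 211, 263, 293), gain=0.8):
        self.buffers = [[0.0] * d for d in delays]
        self.pos = [0] * len(delays)
        self.gain = gain  # must stay below 1.0 for the loop to decay

    def process(self, x):
        # Read the oldest sample of each delay line.
        a, b, c, d = [buf[p] for buf, p in zip(self.buffers, self.pos)]
        mixed = [0.5 * (a + b + c + d), 0.5 * (a - b + c - d),
                 0.5 * (a + b - c - d), 0.5 * (a - b - c + d)]
        # Write input plus attenuated feedback, then advance each line.
        for i, buf in enumerate(self.buffers):
            buf[self.pos[i]] = x + self.gain * mixed[i]
            self.pos[i] = (self.pos[i] + 1) % len(buf)
        return (a + b + c + d) / 4.0


def fir(x, h):
    """Direct-form FIR filtering of x with impulse response h."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def virtualize_one_ear(channels, early_responses, fdn):
    n = len(channels[0])
    # Second path: common late reverberation applied to a downmix.
    downmix = [sum(ch[i] for ch in channels) / len(channels) for i in range(n)]
    out = [fdn.process(s) for s in downmix]
    # First path: per-channel direct response and early reflections.
    for ch, h in zip(channels, early_responses):
        for i, v in enumerate(fir(ch, h)):
            out[i] += v
    return out
```

A binaural output would run this once per ear with ear-specific responses; sharing one FDN across all channels is what keeps the late-reverberation cost independent of the channel count.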
-
9.
Publication number: US10149082B2
Publication date: 2018-12-04
Application number: US15550424
Application date: 2016-02-11
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Louis D. Fielder, Zhiwei Shuang, Grant A. Davidson, Xiguang Zheng, Mark S. Vinton
Abstract: The present disclosure relates to reverberation generation for headphone virtualization. A method of generating one or more components of a binaural room impulse response (BRIR) for headphone virtualization is described. In the method, directionally-controlled reflections are generated, wherein directionally-controlled reflections impart a desired perceptual cue to an audio input signal corresponding to a sound source location. Then at least the generated reflections are combined to obtain the one or more components of the BRIR. Corresponding system and computer program products are described as well.
-
10.
Publication number: US09668080B2
Publication date: 2017-05-30
Application number: US14899505
Application date: 2014-06-17
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Xuejing Sun, Bin Cheng, Sen Xu, Zhiwei Shuang, Jun Wang
CPC classification number: H04S7/301, H04R29/002, H04R29/005, H04R2430/20, H04S3/02, H04S7/308, H04S2400/03, H04S2400/15, H04S2420/01, H04S2420/11
Abstract: Embodiments of the present invention relate to adaptive audio content generation. Specifically, a method for generating adaptive audio content is provided. The method comprises extracting at least one audio object from channel-based source audio content, and generating the adaptive audio content at least partially based on the at least one audio object. Corresponding system and computer program product are also disclosed.