Patent search ap:("Dolby Laboratories Licensing Corporation") AND inv:"Zhiwei Shuang" Page 1

1.

Granted invention patent
Reverberation generation for headphone virtualization (in force)

Publication (announcement) number: US12143797B2

Publication (announcement) date: 2024-11-12

Application number: US18309145

Application date: 2023-04-28

Applicant: Dolby Laboratories Licensing Corporation

Inventor: Louis D. Fielder, Zhiwei Shuang, Grant A. Davidson, Xiguang Zheng, Mark S. Vinton

IPC: H04S3/00, G10K15/08, H04S5/00, H04S7/00

Abstract: The present disclosure relates to reverberation generation for headphone virtualization. A method of generating one or more components of a binaural room impulse response (BRIR) for headphone virtualization is described. In the method, directionally-controlled reflections are generated, wherein directionally-controlled reflections impart a desired perceptual cue to an audio input signal corresponding to a sound source location. Then at least the generated reflections are combined to obtain the one or more components of the BRIR. Corresponding system and computer program products are described as well.

2.

Granted invention patent
Method and apparatus for speech source separation based on a convolutional neural network (in force)

Publication (announcement) number: US12073828B2

Publication (announcement) date: 2024-08-27

Application number: US17611121

Application date: 2020-05-13

Applicant: Dolby Laboratories Licensing Corporation

Inventor: Jundai Sun, Zhiwei Shuang, Lie Lu, Shaofan Yang, Jia Dai

IPC: G10L15/20, G06N3/08, G10L15/16, G10L15/22, G10L21/0308, G10L25/18

CPC classification number: G10L15/20, G06N3/08, G10L15/16, G10L15/22, G10L21/0308, G10L25/18

Abstract: Described herein is a method for Convolutional Neural Network (CNN) based speech source separation, wherein the method includes the steps of: (a) providing multiple frames of a time-frequency transform of an original noisy speech signal; (b) inputting the time-frequency transform of said multiple frames into an aggregated multi-scale CNN having a plurality of parallel convolution paths; (c) extracting and outputting, by each parallel convolution path, features from the input time-frequency transform of said multiple frames; (d) obtaining an aggregated output of the outputs of the parallel convolution paths; and (e) generating an output mask for extracting speech from the original noisy speech signal based on the aggregated output. Described herein are further an apparatus for CNN based speech source separation as well as a respective computer program product comprising a computer-readable storage medium with instructions adapted to carry out said method when executed by a device having processing capability.

3.

Invention publication
METHOD AND APPARATUS FOR AUDIO PROCESSING USING A NESTED CONVOLUTIONAL NEURAL NETWORK ARCHITECTURE (under examination, published)

Publication (announcement) number: US20230386500A1

Publication (announcement) date: 2023-11-30

Application number: US18032325

Application date: 2021-10-19

Applicant: DOLBY LABORATORIES LICENSING CORPORATION

Inventor: Jundai Sun, Lie Lu, Zhiwei Shuang

IPC: G10L25/30, G06N3/0464, G10L25/84

CPC classification number: G10L25/30, G06N3/0464, G10L25/84

Abstract: Systems, methods, and computer program products for audio processing based on convolutional neural network (CNN) are described. The CNN architecture may comprise a multi-scale input block and a multi-scale nested block. The multi-scale input block may be configured to receive input data and to generate a first downsampled input data set by downsampling the input data. The multi-scale nested block may comprise a first encoding layer configured to generate a first encoded data set by performing a convolution based on the input data. The multi-scale nested block may comprise a second encoding layer configured to generate a second encoded data set by performing a convolution based on the first downsampled input data set. Furthermore, the multi-scale nested block may comprise a first convolutional layer configured to generate a first output data set by upsampling the second encoded data set, concatenating the first encoded data set and the upsampled second encoded data set, and performing a convolution. The first convolutional layer may be nested between the encoding layers and decoding layers, thereby increasing the number of communication channels with the CNN and simplifying the underlying optimization problem.

4.

Granted invention patent
Controlling a jitter buffer (in force)

Publication (announcement) number: US10560393B2

Publication (announcement) date: 2020-02-11

Application number: US14654346

Application date: 2013-12-19

Applicant: DOLBY LABORATORIES LICENSING CORPORATION

Inventor: Xuejing Sun, Zhiwei Shuang

IPC: H04L12/875, H04L12/841

Abstract: Apparatus and methods for controlling a jitter buffer are described. In one embodiment, the apparatus for controlling a jitter buffer includes an inter-talkspurt delay jitter estimator for estimating an offset value of the delay of a first frame in the current talkspurt with respect to the delay of a latest anchor frame in a previous talkspurt, and a jitter buffer controller for adjusting a length of the jitter buffer based on a long term length of the jitter buffer for each frame and the offset value.
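The control idea in this abstract can be illustrated with a minimal sketch. It is not the patented method: the additive combination of long-term length and offset, the 20 ms frame size, the clamping limits, and all names (`inter_talkspurt_offset`, `JitterBufferController`, `target_length`) are assumptions made for illustration.

```python
from dataclasses import dataclass


def inter_talkspurt_offset(first_frame_delay_ms: float,
                           anchor_delay_ms: float,
                           frame_ms: float = 20.0) -> float:
    """Offset (in frames) of the first frame of the current talkspurt
    relative to the latest anchor frame of the previous talkspurt.
    The 20 ms frame size is an assumed default."""
    return (first_frame_delay_ms - anchor_delay_ms) / frame_ms


@dataclass
class JitterBufferController:
    """Hypothetical controller that combines a long-term buffer length
    with the inter-talkspurt offset (simple addition is an assumption)."""
    min_frames: int = 1
    max_frames: int = 50

    def target_length(self, long_term_length: float, offset: float) -> int:
        raw = long_term_length + offset
        # Clamp to the allowed range so the buffer never collapses or explodes.
        return max(self.min_frames, min(self.max_frames, round(raw)))


if __name__ == "__main__":
    offset = inter_talkspurt_offset(first_frame_delay_ms=140.0,
                                    anchor_delay_ms=100.0)
    controller = JitterBufferController()
    print(controller.target_length(long_term_length=4.0, offset=offset))
```

With these assumed numbers, a first frame arriving 40 ms later than the anchor frame adds two frames to a long-term length of four.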

5.

Invention publication
DETECTING ENVIRONMENTAL NOISE IN USER-GENERATED CONTENT (under examination, published)

Publication (announcement) number: US20240355348A1

Publication (announcement) date: 2024-10-24

Application number: US18685656

Application date: 2022-08-23

Applicant: DOLBY LABORATORIES LICENSING CORPORATION

Inventor: Ziyu Yang, Zhiwei Shuang, Lie Lu

IPC: G10L21/0216, G10L21/0264

CPC classification number: G10L21/0216, G10L21/0264

Abstract: A method of audio processing includes classifying an audio signal as noise or as non-noise using a first model. For a noise signal, the audio signal is classified as user-generated content (UGC) noise or as professionally-generated content (PGC) noise using a second model. For a non-noise signal or PGC noise, the audio signal is processed using a first audio processing process. For UGC noise, the audio signal is processed using a second audio processing process.
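The two-stage routing described in this abstract can be sketched as follows. The function `route_audio` and its callable parameters are hypothetical placeholders. The abstract does not say what the two models or the two processing chains actually are, so they are passed in as plain functions.

```python
from typing import Callable, Sequence, TypeVar

Audio = Sequence[float]
T = TypeVar("T")


def route_audio(audio: Audio,
                is_noise: Callable[[Audio], bool],
                is_ugc_noise: Callable[[Audio], bool],
                first_process: Callable[[Audio], T],
                second_process: Callable[[Audio], T]) -> T:
    """Route a signal through one of two processing chains.

    is_noise stands in for the first model (noise vs. non-noise);
    is_ugc_noise stands in for the second model (UGC vs. PGC noise)
    and is only consulted when the first model reports noise.
    """
    if is_noise(audio) and is_ugc_noise(audio):
        return second_process(audio)   # UGC noise
    return first_process(audio)        # non-noise or PGC noise


if __name__ == "__main__":
    result = route_audio(
        [0.0, 0.1, -0.1],
        is_noise=lambda a: True,        # stub model outputs
        is_ugc_noise=lambda a: True,
        first_process=lambda a: "first",
        second_process=lambda a: "second",
    )
    print(result)
```

Because `and` short-circuits, the second model runs only on signals the first model has already labeled as noise, which matches the order of steps in the abstract.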

6.

Invention publication
ROBUSTNESS/PERFORMANCE IMPROVEMENT FOR DEEP LEARNING BASED SPEECH ENHANCEMENT AGAINST ARTIFACTS AND DISTORTION (under examination, published)

Publication (announcement) number: US20240161766A1

Publication (announcement) date: 2024-05-16

Application number: US18282311

Application date: 2022-03-17

Applicant: DOLBY LABORATORIES LICENSING CORPORATION

Inventor: Jundai Sun, Lie Lu, Zhiwei Shuang

IPC: G10L21/0232, G10L25/30

CPC classification number: G10L21/0232, G10L25/30

Abstract: Described is a method of processing an audio signal. The method includes a first step for applying enhancement to a first component of the audio signal and/or applying suppression to a second component of the audio signal relative to the first component, and a second step of modifying an output of the first step by applying a deep learning based model to the output of the first step, for perceptually improving the first component of the audio signal. Also described is an apparatus for carrying out the method, as well as corresponding programs and computer-readable storage media.

7.

Invention publication
REVERBERATION GENERATION FOR HEADPHONE VIRTUALIZATION (under examination, published)

Publication (announcement) number: US20230328469A1

Publication (announcement) date: 2023-10-12

Application number: US18309145

Application date: 2023-04-28

Applicant: Dolby Laboratories Licensing Corporation

Inventor: Louis D. Fielder, Zhiwei Shuang, Grant A. Davidson, Xiguang Zheng, Mark S. Vinton

IPC: H04S3/00, H04S7/00, G10K15/08, H04S5/00

CPC classification number: H04S3/004, H04S7/302, G10K15/08, H04S2420/01, H04S7/304, H04S2400/01, H04S5/005

Abstract: The present disclosure relates to reverberation generation for headphone virtualization. A method of generating one or more components of a binaural room impulse response (BRIR) for headphone virtualization is described. In the method, directionally-controlled reflections are generated, wherein directionally-controlled reflections impart a desired perceptual cue to an audio input signal corresponding to a sound source location. Then at least the generated reflections are combined to obtain the one or more components of the BRIR. Corresponding system and computer program products are described as well.

8.

Granted invention patent
Generating binaural audio in response to multi-channel audio using at least one feedback delay network (in force)

Publication (announcement) number: US10425763B2

Publication (announcement) date: 2019-09-24

Application number: US15109541

Application date: 2014-12-18

Applicant: Dolby Laboratories Licensing Corporation

Inventor: Kuan-Chieh Yen, Dirk Jeroen Breebaart, Grant A. Davidson, Rhonda Wilson, David M. Cooper, Zhiwei Shuang

IPC: H04S7/00, H04S3/00, G10L19/008

Abstract: In some embodiments, virtualization methods for generating a binaural signal in response to channels of a multi-channel audio signal, which apply a binaural room impulse response (BRIR) to each channel including by using at least one feed-back delay network (FDN) to apply a common late reverberation to a downmix of the channels. In some embodiments, input signal channels are processed in a first processing path to apply to each channel a direct response and early reflection portion of a single-channel BRIR for the channel, and the downmix of the channels is processed in a second processing path including at least one FDN which applies the common late reverberation. Typically, the common late reverberation emulates collective macro attributes of late reverberation portions of at least some of the single-channel BRIRs. Other aspects are headphone virtualizers configured to perform any embodiment of the method.
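As a rough illustration of the second processing path in this abstract, the sketch below downmixes the input channels and runs the downmix through a small feedback delay network (FDN). It is a generic textbook-style FDN, not the patented design: the four delay lengths, the scaled Hadamard feedback matrix, the feedback gain of 0.7, the mono output, and the names `downmix` and `fdn_reverb` are all assumptions.

```python
from typing import List, Optional, Sequence


def downmix(channels: Sequence[Sequence[float]]) -> List[float]:
    """Sample-wise mean of equal-length channels (an assumed downmix rule)."""
    n = len(channels)
    return [sum(samples) / n for samples in zip(*channels)]


def fdn_reverb(x: Sequence[float],
               delays: Sequence[int] = (149, 211, 263, 293),
               feedback_gain: float = 0.7,
               n_out: Optional[int] = None) -> List[float]:
    """Run a mono signal through a 4-line FDN and return the mono tail."""
    # 4x4 Hadamard matrix; dividing by 2 makes it orthogonal, so a
    # feedback gain below 1 keeps the loop stable.
    hadamard = [[1, 1, 1, 1],
                [1, -1, 1, -1],
                [1, 1, -1, -1],
                [1, -1, -1, 1]]
    scale = feedback_gain / 2.0
    lines = [[0.0] * d for d in delays]   # one circular buffer per delay line
    pos = [0] * len(delays)
    total = len(x) if n_out is None else n_out
    out = []
    for t in range(total):
        x_t = x[t] if t < len(x) else 0.0
        # Oldest sample in each line, i.e. the input delayed by delays[i].
        taps = [lines[i][pos[i]] for i in range(len(delays))]
        out.append(sum(taps))
        for i in range(len(delays)):
            fb = scale * sum(hadamard[i][j] * taps[j] for j in range(len(taps)))
            lines[i][pos[i]] = x_t + fb
            pos[i] = (pos[i] + 1) % delays[i]
    return out


if __name__ == "__main__":
    mix = downmix([[1.0, 0.0], [1.0, 0.0]])   # two channels, impulse in each
    tail = fdn_reverb(mix, n_out=8000)
    print(len(tail), tail[149])
```

A real virtualizer would produce separate left-ear and right-ear outputs and add the per-channel direct and early-reflection path. This sketch covers only the shared late-reverberation idea.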

9.

Granted invention patent
Reverberation generation for headphone virtualization (in force)

Publication (announcement) number: US10149082B2

Publication (announcement) date: 2018-12-04

Application number: US15550424

Application date: 2016-02-11

Applicant: Dolby Laboratories Licensing Corporation

Inventor: Louis D. Fielder, Zhiwei Shuang, Grant A. Davidson, Xiguang Zheng, Mark S. Vinton

IPC: H04S3/00, H04S7/00, H04S5/00, G10K15/08

Abstract: The present disclosure relates to reverberation generation for headphone virtualization. A method of generating one or more components of a binaural room impulse response (BRIR) for headphone virtualization is described. In the method, directionally-controlled reflections are generated, wherein directionally-controlled reflections impart a desired perceptual cue to an audio input signal corresponding to a sound source location. Then at least the generated reflections are combined to obtain the one or more components of the BRIR. Corresponding system and computer program products are described as well.

10.

Granted invention patent
Method for generating a surround sound field, apparatus and computer program product thereof (in force)

Publication (announcement) number: US09668080B2

Publication (announcement) date: 2017-05-30

Application number: US14899505

Application date: 2014-06-17

Applicant: Dolby Laboratories Licensing Corporation

Inventor: Xuejing Sun, Bin Cheng, Sen Xu, Zhiwei Shuang, Jun Wang

IPC: H04R5/02, H04S7/00, H04R29/00, H04S3/02

CPC classification number: H04S7/301, H04R29/002, H04R29/005, H04R2430/20, H04S3/02, H04S7/308, H04S2400/03, H04S2400/15, H04S2420/01, H04S2420/11

Abstract: Embodiments of the present invention relate to adaptive audio content generation. Specifically, a method for generating adaptive audio content is provided. The method comprises extracting at least one audio object from channel-based source audio content, and generating the adaptive audio content at least partially based on the at least one audio object. Corresponding system and computer program product are also disclosed.
