Patent search: ap:("DOLBY LABORATORIES LICENSING CORPORATION") AND inv:"Cong ZHOU" (page 1)

1.

Invention application
SOUND AND VIDEO OBJECT TRACKING (status: under examination, published)

Publication number: US20170364752A1

Publication date: 2017-12-21

Application number: US15624475

Filing date: 2017-06-15

Applicant: DOLBY LABORATORIES LICENSING CORPORATION

Inventors: Cong ZHOU, Timo KUNKEL, Cristina Michel VASCO

IPC classification: G06K9/00, H04N13/00, H04N5/92, G06K9/46, G11B27/031, G11B27/10, G10K11/34, H04R1/32, H04H60/04

CPC classification: G06K9/00718, G06K9/00255, G06K9/00288, G06K9/4671, G10K1/38, G10K11/34, G10K2200/10, G11B27/031, G11B27/10, H04H60/04, H04H60/07, H04H60/48, H04H60/58, H04H60/59, H04H60/66, H04N9/802, H04N13/161, H04N13/189, H04R1/326, H04R3/005, H04S7/30, H04S2400/11

Abstract: Image data relating to real-world objects or persons is collected from a scene while collecting audio data relating to the real-world objects or persons from the same scene. The audio data is used to derive sound objects corresponding to the real-world objects or persons. The image data is used to derive video objects corresponding to the real-world objects or persons. Based on the sound objects and the video objects, candidate salient objects are generated. A salient object is selected from among the candidate salient objects. Perceptual enhancement operations are performed on the selected salient object.

2.

Invention application
Audio Capture for Aerial Devices (status: under examination, published)

Publication number: US20180234612A1

Publication date: 2018-08-16

Application number: US15785977

Filing date: 2017-10-17

Applicant: Dolby Laboratories Licensing Corporation

Inventors: Timo KUNKEL, Cong ZHOU, Vivek KUMAR, Remi S. AUDFRAY

IPC classification: H04N5/232, G06T7/70, H04N5/04, B64C39/02

CPC classification: H04N5/23203, B64C39/024, B64C2201/027, B64C2201/108, B64C2201/127, G06T7/70, G06T2207/10016, H04N5/04, H04N5/23206, H04N5/23216, H04N5/23222, H04N5/23293

Abstract: Methods, systems, and computer program products for automatically positioning a content capturing device are disclosed. A vehicle, e.g., an UAV, carries the content capturing device, e.g., a camcorder. The UAV can position the content capturing device at a best location for viewing a subject based on one or more audio or visual cues. The UAV can follow movement of the subject to achieve best audio or visual effect. In some implementations, a controller device carried by the subject can generate one or more signals for the UAV to follow. The controller device may be coupled to a microphone that records audio. The signals can be used to temporally synchronize video captured at the UAV and audio captured by the microphone.

3.

Invention publication
ADAPTIVE BLOCK SWITCHING WITH DEEP NEURAL NETWORKS (status: under examination, published)

Publication number: US20230386486A1

Publication date: 2023-11-30

Application number: US18248294

Filing date: 2021-10-15

Applicant: Dolby Laboratories Licensing Corporation

Inventors: Cong ZHOU, Grant A. DAVIDSON, Mark S. VINTON

IPC classification: G10L19/022, G10L25/30, G10L19/032, G10L19/04

CPC classification: G10L19/022, G10L19/04, G10L19/032, G10L25/30

Abstract: The present invention relates to a method for predicting transform coefficients representing frequency content of an adaptive block length media signal, by receiving a frame and receiving block length information indicating a number of quantized transform coefficients for each block in the frame, the number of quantized transform coefficients being one of a first or second number, wherein the first number is greater than the second number, determining a first block has the second number of quantized transform coefficients, converting the first block into a converted block having the first number of quantized transform coefficients, conditioning a main neural network trained to predict at least one output variable given at least one conditioning variable, the at least one conditioning variable being based on information regarding the converted block and block length information for the first block, providing at least one predicted transform coefficients from an output stage of the main neural network.

4.

Invention publication
METHOD AND APPARATUS FOR PROCESSING OF AUDIO USING A NEURAL NETWORK (status: under examination, published)

Publication number: US20230395086A1

Publication date: 2023-12-07

Application number: US18031790

Filing date: 2021-10-14

Applicant: DOLBY LABORATORIES LICENSING CORPORATION

Inventors: Mark S. VINTON, Cong ZHOU, Roy M. FEJGIN, Grant A. DAVIDSON

IPC classification: G10L19/032, G10L19/06

CPC classification: G10L19/032, G10L19/06

Abstract: Described herein is a method of processing an audio signal using a neural network or using a first and a second neural network. Described is further a method of training said neural network or of jointly training a set of said first and said second neural network. Moreover, described is a method of obtaining and transmitting a latent feature space representation of a perceptual domain audio signal using a neural network and a method of obtaining an audio signal from a latent feature space representation of a perceptual domain audio signal using a neural network. Described are also respective apparatuses and computer program products.

5.

Invention application
SYSTEMS AND METHODS FOR ADAPTING HUMAN SPEAKER EMBEDDINGS IN SPEECH SYNTHESIS (status: granted)

Publication number: US20220335925A1

Publication date: 2022-10-20

Application number: US17636851

Filing date: 2020-08-18

Applicant: DOLBY LABORATORIES LICENSING CORPORATION

Inventors: Cong ZHOU, Xiaoyu LIU, Michael Getty HORGAN, Vivek KUMAR

IPC classification: G10L13/033, G10L13/047

Abstract: Novel methods and systems for adapting a voice cloning synthesizer for a new speaker using real speech data are disclosed. Utterances from one or more target speakers are parameterized and are used to initialize an embedding vector for use with a voice synthesizer, by means of clustering the utterance data and determining the centroid of the data, using a speaker identification neural network, and/or by finding the closest stored embedded vector to the utterance data.
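
Illustrative note: the abstract of record 5 names two ways to initialize a speaker embedding, taking the centroid of parameterized utterance data and finding the closest stored embedding. The sketch below shows one plausible reading of those two steps in plain Python. It is not the patented method: the function names, the use of Euclidean distance, the two-dimensional toy vectors, and the treatment of all utterances as a single cluster are assumptions made for illustration only.

```python
from math import sqrt
from typing import Dict, List, Sequence


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equally sized utterance feature vectors.

    Assumption: all utterances belong to one cluster, so the centroid
    is simply the mean of every vector passed in.
    """
    if not vectors:
        raise ValueError("need at least one utterance vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance (an assumed metric; the abstract names none)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_stored_embedding(
    query: Sequence[float], stored: Dict[str, Sequence[float]]
) -> str:
    """Return the key of the stored embedding nearest to the query."""
    if not stored:
        raise ValueError("no stored embeddings to compare against")
    return min(stored, key=lambda name: euclidean(query, stored[name]))


# Hypothetical parameterized utterances from one target speaker.
utterances = [[0.0, 2.0], [2.0, 2.0], [1.0, 5.0]]
initial_embedding = centroid(utterances)

# Hypothetical library of previously stored speaker embeddings.
stored_embeddings = {"speaker_a": [0.0, 0.0], "speaker_b": [1.0, 4.0]}
nearest = closest_stored_embedding(initial_embedding, stored_embeddings)

print(initial_embedding, nearest)
```

Either result could seed the embedding vector before adaptation. The abstract also lists a speaker identification neural network as a third option, which this sketch does not cover.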
