Patent search: cpc:"G10L25/81", page 1

1.

Invention publication
SYSTEMS AND METHODS FOR AUTOMATICALLY GENERATING SOUND EVENT SUBTITLES [Pending, published]

Publication No.: US20230412760A1

Publication date: 2023-12-21

Application No.: US17841564

Filing date: 2022-06-15

Applicant: Netflix, Inc.

Inventors: Yadong Wang, Shilpa Jois Rao

IPC classification: H04N5/93, G10L15/00, G10L15/04, G10L15/26, G10L25/57, G10L25/81, G10L15/22, H04N5/278

CPC classification: H04N5/9305, G10L15/005, G10L15/04, G10L15/26, G10L25/57, G10L25/81, G10L15/22, H04N5/278

Abstract: The disclosed computer-implemented method may include systems and methods for automatically generating sound event subtitles for digital videos. For example, the systems and methods described herein can automatically generate subtitles for sound events within a digital video soundtrack that includes sounds other than speech. Additionally, the systems and methods described herein can automatically generate sound event subtitles as part of an automatic and comprehensive approach that generates subtitles for all sounds within a soundtrack of a digital video, thereby avoiding the need for any manual inputs as part of the subtitling process.

2.

Invention publication
VOCAL TRACK REMOVAL BY CONVOLUTIONAL NEURAL NETWORK EMBEDDED VOICE FINGER PRINTING ON STANDARD ARM EMBEDDED PLATFORM [Pending, published]

Publication No.: US20230306943A1

Publication date: 2023-09-28

Application No.: US18249913

Filing date: 2020-10-22

Applicant: HARMAN INTERNATIONAL INDUSTRIES, INCORPORATED

Inventors: Jianwen ZHENG, Shao-Fu SHIH, Kai LI, Cheng CHI

IPC classification: G10H1/36, G10L21/028, G10L25/81, G10L21/06, G10L25/30

CPC classification: G10H1/366, G10L21/028, G10L25/81, G10L21/06, G10L25/30

Abstract: A vocal removal method and a system thereof are provided. In the vocal removal method, a voice separation model is generated and trained to process real-time input music to separate the voice from the accompaniment. The vocal removal method further comprises the steps of feature extraction and reconstruction to obtain the voice-minimized music.

3.

Granted invention
Classification of audio signal as speech or music based on energy fluctuation of frequency spectrum [In force]

Publication No.: US11756576B2

Publication date: 2023-09-12

Application No.: US17692640

Filing date: 2022-03-11

Applicant: Huawei Technologies Co., Ltd.

Inventor: Zhe Wang

IPC classification: G10L25/81, G10L25/78, G10L25/18, G10L19/06, G10L19/12

CPC classification: G10L25/81, G10L19/06, G10L19/12, G10L25/18, G10L25/78, G10L2025/783

Abstract: An audio signal classification method includes: determining, according to voice activity of a current audio frame, whether to obtain a frequency spectrum fluctuation of the current audio frame and store the frequency spectrum fluctuation in a frequency spectrum fluctuation memory; updating, according to whether the audio frame is percussive music or activity of a historical audio frame, frequency spectrum fluctuations stored in the frequency spectrum fluctuation memory; and classifying the current audio frame as a speech frame or a music frame according to statistics of a part or all of effective data of the frequency spectrum fluctuations stored in the frequency spectrum fluctuation memory.
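The flow in this abstract (store a per-frame spectrum fluctuation, then classify from statistics of the stored values) can be illustrated with a small sketch. This is not the claimed method. The class and function names, the memory length, the threshold, the use of the mean as the statistic, and the rule that a high mean fluctuation indicates speech are all assumptions made for illustration. The percussive-music update step is not modeled.

```python
from collections import deque
from statistics import mean


def spectrum_fluctuation(prev_spectrum, cur_spectrum):
    """Sum of absolute magnitude differences between two consecutive frames."""
    return sum(abs(c - p) for p, c in zip(prev_spectrum, cur_spectrum))


class FluctuationClassifier:
    """Hypothetical speech/music classifier driven by a fluctuation memory."""

    def __init__(self, memory_size=8, threshold=1.0):
        self.memory = deque(maxlen=memory_size)  # bounded fluctuation memory
        self.threshold = threshold               # assumed decision threshold
        self.prev = None                         # previous frame's spectrum

    def push(self, spectrum, voice_active=True):
        # Assumption: only frames flagged as active store a fluctuation value.
        if voice_active and self.prev is not None:
            self.memory.append(spectrum_fluctuation(self.prev, spectrum))
        self.prev = spectrum
        if not self.memory:
            return None  # nothing stored yet, so no decision
        # Assumed rule: speech spectra change faster than music spectra.
        return "speech" if mean(self.memory) > self.threshold else "music"
```

Feeding the same spectrum repeatedly stores zero fluctuations and yields "music" under the assumed rule, while frames that alternate between very different spectra yield "speech".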

4.

Invention publication
ROBUST AUDIO IDENTIFICATION WITH INTERFERENCE CANCELLATION [Pending, published]

Publication No.: US20230196809A1

Publication date: 2023-06-22

Application No.: US18172755

Filing date: 2023-02-22

Applicant: Roku, Inc.

Inventors: Jose Pio PEREIRA, Sunil Suresh KULKARNI, Mihailo M. STOJANCIC, Shashank MERCHANT, Peter WENDT

IPC classification: G06V30/18, G06T7/246, G06T7/215, G06F16/00, G06T7/254, G06F16/45, G06F16/48, G06V20/40, G06F18/22, G06V20/62, G10L15/02, G10L15/06, G10L15/10, G10L15/14, G10L15/20, G10L21/0232, G10L25/81

CPC classification: G06V30/18086, G06T7/248, G06T7/215, G06F16/00, G06T7/254, G06F16/45, G06F16/48, G06V20/41, G06F18/22, G06V20/46, G06V20/49, G06V20/635, G10L15/02, G10L15/063, G10L15/10, G10L15/142, G10L15/20, G10L21/0232, G10L25/81, G06T2207/20004, G06T2207/10016, G06T2207/20224, G06F16/906

Abstract: Audio distortion compensation methods to improve accuracy and efficiency of audio content identification are described. The method is also applicable to speech recognition. Methods to detect the interference from speakers and sources, and distortion to audio from environment and devices, are discussed. Additional methods to detect distortion to the content after performing search and correlation are illustrated. The causes of actual distortion at each client are measured, registered, and learnt to generate rules for determining likely distortion and interference sources. The learnt rules are applied at the client, and likely distortions that are detected are compensated, or heavily distorted sections are ignored, at audio level or at signature and feature level, based on the compute resources available. Further methods to subtract the likely distortions in the query, both at audio level and after processing at signature and feature level, are described.

5.

Invention application
AUDIO CLASSIFIER THAT INCLUDES A FIRST PROCESSOR AND A SECOND PROCESSOR [Pending, published]

Publication No.: US20180025732A1

Publication date: 2018-01-25

Application No.: US15215259

Filing date: 2016-07-20

Applicant: NXP B.V.

Inventors: Ludovick Dominique Joel Lepauloux, Laurent Le Faucheur

IPC classification: G10L17/22, G10L25/51, G10L25/81, G10L25/84

CPC classification: G10L17/22, G10L25/09, G10L25/21, G10L25/51, G10L25/81, G10L25/84, G10L2025/937

Abstract: The disclosure relates to an audio classifier comprising: a first processor having hard-wired logic configured to receive an audio signal and detect audio activity from the audio signal; and a second processor having reconfigurable logic configured to classify the audio signal as a type of audio signal in response to the first processor detecting audio activity.

6.

Granted invention
Apparatuses and methods for audio classifying and processing [In force]

Publication No.: US09842605B2

Publication date: 2017-12-12

Application No.: US14779322

Filing date: 2014-03-25

Applicant: DOLBY LABORATORIES LICENSING CORPORATION

Inventors: Lie Lu, Alan J. Seefeldt, Jun Wang

IPC classification: G10L21/02, G10L25/81, G10L17/06

CPC classification: G10L21/02, G10L17/06, G10L25/81

Abstract: Apparatus and methods for audio classifying and processing are disclosed. In one embodiment, an audio processing apparatus includes an audio classifier for classifying an audio signal into at least one audio type in real time; an audio improving device for improving the experience of the audience; and an adjusting unit for adjusting at least one parameter of the audio improving device in a continuous manner based on the confidence value of the at least one audio type.
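One common way to adjust a parameter "in a continuous manner" from a classifier confidence is to interpolate a target value from the confidence and then smooth toward it frame by frame. The sketch below shows that pattern only. The function name, the linear interpolation, and the one-pole smoothing are assumptions and are not taken from the patent.

```python
def steer_parameter(current, confidence, low, high, smoothing=0.9):
    """Move `current` one step toward a confidence-weighted target.

    `low` is the parameter value used at confidence 0 and `high` the value
    used at confidence 1 (both hypothetical tuning values).
    """
    confidence = min(1.0, max(0.0, confidence))  # clamp to [0, 1]
    target = low + confidence * (high - low)     # linear interpolation
    # One-pole smoothing avoids audible jumps when the confidence changes.
    return smoothing * current + (1.0 - smoothing) * target
```

Calling this once per frame with the latest confidence makes the parameter drift smoothly rather than switch abruptly when the detected audio type changes.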

7.

Invention application
VOICE PROCESSING DEVICE [Pending, published]

Publication No.: US20170352349A1

Publication date: 2017-12-07

Application No.: US15536827

Filing date: 2015-12-24

Applicant: AISIN SEIKI KABUSHIKI KAISHA

Inventor: Sacha VRAZIC

IPC classification: G10L15/20, H04R3/00, G10L15/28, G10L25/84, G10L25/81, G10L21/0232, H04R1/40, G10L21/0216

CPC classification: G10L15/20, G10L15/00, G10L15/28, G10L21/0208, G10L21/0232, G10L25/81, G10L25/84, G10L2021/02085, G10L2021/02087, G10L2021/02166, H04R1/406, H04R3/005, H04R2499/13

Abstract: A voice processing device includes plural microphones 22 disposed in a vehicle, a voice source direction determination portion 16 determining a direction of a voice source by handling a sound reception signal as a spherical wave in a case where the voice source serving as a source of a voice included in the sound reception signal obtained by each of the plural microphones is disposed at a near field, the voice source direction determination portion determining the direction of the voice source by handling the sound reception signal as a plane wave in a case where the voice source is disposed at a far field, and a beamforming processing portion 12 performing beamforming so as to suppress a sound arriving from a direction range other than a direction range including the direction of the voice source.
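The difference between the two wavefront models named in this abstract can be shown with per-microphone arrival delays. The sketch below is a generic 2-D illustration, not the patented device. The function names and the speed-of-sound constant are assumptions, and the abstract does not say how near field and far field are told apart or how the direction is estimated.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature


def spherical_delays(mics, source):
    """Near-field model: delay is the exact source-to-microphone distance over c."""
    return [math.dist(m, source) / SPEED_OF_SOUND for m in mics]


def plane_wave_delays(mics, azimuth_rad):
    """Far-field model: only the arrival direction matters.

    Delays are measured relative to a wavefront passing through the origin.
    """
    ux, uy = math.cos(azimuth_rad), math.sin(azimuth_rad)
    return [-(mx * ux + my * uy) / SPEED_OF_SOUND for mx, my in mics]


def relative(delays):
    """Express delays relative to the first microphone."""
    first = delays[0]
    return [d - first for d in delays]
```

For a source far from the array the two models give nearly identical relative delays, while for a source a few tens of centimeters away they diverge, which is why a near-field source is better treated as a spherical wave.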

8.

Invention application
ENHANCING AUDIO CONTENT FOR VOICE ISOLATION AND BIOMETRIC IDENTIFICATION [Pending, published]

Publication No.: US20170316791A1

Publication date: 2017-11-02

Application No.: US15603922

Filing date: 2017-05-24

Applicant: Yobe, Inc

Inventors: James Christopher Fairey, Kenneth Sutton

IPC classification: G10L21/0364, G10L21/0208, G10L25/81, H03G5/00

CPC classification: G10L21/0364, G06F21/32, G10L17/02, G10L21/0208, G10L25/81, H03G3/32, H03G5/005, H03G5/22, H04R1/1008, H04R1/1041, H04R5/033

Abstract: Systems and methods for isolating audio content and biometric authentication include receiving, with an audio receiver, an audio signal spanning a plurality of frequency bands, identifying a speech signal carried by a voice frequency band selected from the plurality of frequency bands, enhancing the speech signal relative to other audio content within the audio signal, and extracting a voice profile key that uniquely identifies the speech signal.

9.

Invention application
APPARATUS, PROCESS, AND PROGRAM FOR COMBINING SPEECH AND AUDIO DATA [Pending, published]

Publication No.: US20170229114A1

Publication date: 2017-08-10

Application No.: US15491468

Filing date: 2017-04-19

Applicant: Sony Corporation

Inventors: Tetsuo Ikeda, Ken Miyashita, Tatsushi Nashida

IPC classification: G10L13/08, G10L21/055, G10L13/04

CPC classification: G10L13/08, G10L13/043, G10L21/02, G10L21/055, G10L25/81

Abstract: There is provided a speech processing apparatus including: a data obtaining unit which obtains music progression data defining a property of one or more time points or one or more time periods along progression of music; a determining unit which determines an output time point at which a speech is to be output during reproduction of the music by utilizing the music progression data obtained by the data obtaining unit; and an audio output unit which outputs the speech at the output time point determined by the determining unit during reproduction of the music.
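As a rough illustration of choosing a speech output time from music progression data, the sketch below scans time periods of a track and returns the earliest point at which a spoken message of a given length fits into a period without vocals. The `Section` type, the `has_vocals` property, and the selection rule are hypothetical. The abstract only states that the progression data defines some property of time points or periods.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Section:
    start: float      # seconds from the beginning of the track
    end: float        # seconds from the beginning of the track
    has_vocals: bool  # hypothetical property carried by the progression data


def pick_output_time(sections: List[Section], now: float,
                     speech_len: float) -> Optional[float]:
    """Return the earliest time >= now at which the speech fits in a vocal-free section."""
    for s in sorted(sections, key=lambda sec: sec.start):
        begin = max(s.start, now)  # cannot start before the current playback time
        if not s.has_vocals and s.end - begin >= speech_len:
            return begin
    return None  # no suitable gap in the remaining sections
```

With sections for a sung verse (0 to 10 s), an instrumental break (10 to 20 s), and another verse (20 to 30 s), a 5-second message requested at playback time 0 would be scheduled at the start of the break.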

10.

Granted invention
Voice activity detection method and method used for voice activity detection and apparatus thereof [In force]

Publication No.: US09672841B2

Publication date: 2017-06-06

Application No.: US14754714

Filing date: 2015-06-30

Applicant: ZTE Corporation

Inventors: Dongping Jiang, Hao Yuan, Changbao Zhu

IPC classification: G10L15/00, G10L21/00, G10L21/02, G10L25/84, G10L25/81, G10L25/18, G10L15/02, G10L25/21, G10L25/48, G10L21/0224, G10L21/0232

CPC classification: G10L21/0205, G10L15/02, G10L21/0224, G10L21/0232, G10L25/18, G10L25/21, G10L25/48, G10L25/78, G10L25/81, G10L25/84

Abstract: The present document relates to a voice activity detection (VAD) method, methods used for voice activity detection, and apparatus thereof. The VAD method includes: obtaining sub-band signals and spectrum amplitudes of a current frame; computing values of an energy feature and a spectral centroid feature of the current frame according to the sub-band signals; computing a signal to noise ratio parameter of the current frame according to a background noise energy estimated from a previous frame, an energy of SNR sub-bands and an energy feature of the current frame; computing a VAD decision result according to a tonality signal flag, a signal to noise ratio parameter, a spectral centroid feature, and a frame energy feature. The methods and apparatus of the present document can improve the accuracy of detecting non-stationary noise (such as office noise) and music.
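The features listed in this abstract (frame energy, spectral centroid, signal-to-noise ratio, tonality flag) can be combined into a toy decision as sketched below. The feature formulas are standard textbook definitions. The thresholds, the way the tonality flag relaxes the SNR threshold, and all function names are assumptions and do not reproduce the patented decision logic.

```python
import math


def frame_energy(subband_energies):
    """Total frame energy as the sum of sub-band energies."""
    return sum(subband_energies)


def spectral_centroid(amplitudes):
    """Amplitude-weighted mean bin index; 0.0 for an all-zero spectrum."""
    total = sum(amplitudes)
    if total == 0.0:
        return 0.0
    return sum(k * a for k, a in enumerate(amplitudes)) / total


def snr_db(energy, noise_energy, eps=1e-12):
    """SNR in dB against a background noise energy estimated from earlier frames."""
    return 10.0 * math.log10((energy + eps) / (noise_energy + eps))


def vad_decision(subband_energies, amplitudes, noise_energy, tonal=False,
                 snr_threshold_db=6.0, centroid_threshold=1.0):
    """Toy VAD decision; thresholds are illustrative, not from the patent."""
    snr = snr_db(frame_energy(subband_energies), noise_energy)
    if tonal:
        # Assumption: tonal (music-like) frames get a more lenient SNR threshold.
        snr_threshold_db -= 3.0
    return snr > snr_threshold_db and spectral_centroid(amplitudes) > centroid_threshold
```

In this sketch `noise_energy` is passed in by the caller. A real detector would update that estimate from frames previously judged to be inactive.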
