Patent search cpc:"G10L21/0224" Page 1

1.

Patent application
DYNAMIC NOISE SUPPRESSION AND OPERATIONS FOR NOISY SPEECH SIGNALS (Status: pending, published)

Publication number: US20190206420A1

Publication date: 2019-07-04

Application number: US16226383

Filing date: 2018-12-19

Applicant: Harman Becker Automotive Systems GmbH

Inventors: Vasudev KANDADE RAJAN, Markus E. CHRISTOPH, Juergen Heinrich ZOLLNER

IPC classification: G10L21/0264, G10L21/0224, G10L25/15, G10L15/20

CPC classification: G10L21/0264, G10L15/20, G10L21/0224, G10L25/15

Abstract: Systems and methods for noise reduction are provided including operations for noisy speech signals, such as speech signals that are subject to speech processing, speech recognition and speech transmission for voice communication purposes. In one embodiment, a system for noise suppression includes an input smoothing filter to smooth magnitudes of the input spectrum, a desired noise shape determination block configured to determine a desired noise shape of the noise spectrum dependent on the smoothed-magnitude input spectrum, and a suppression factors determination block configured to determine a set of suppression factors based on the desired noise shape and the smoothed-magnitude input spectrum. In one embodiment, a filter coefficient determination block is configured to determine noise suppression filter coefficients from the desired noise shape of the noise spectrum. Embodiments are also directed to systems and methods for noise reduction. System configurations and processes are provided for formant detection.
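The three blocks named in this abstract (input smoothing, desired noise shape, suppression factors) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the patent: the moving-average smoother, the flat target shape set to the lowest smoothed magnitude, and the function names `smooth_magnitudes`, `desired_noise_shape`, and `suppression_factors`.

```python
def smooth_magnitudes(mags, width=3):
    """Moving average across frequency bins (assumed smoother; edges use a shorter window)."""
    half = width // 2
    out = []
    for k in range(len(mags)):
        window = mags[max(0, k - half):k + half + 1]
        out.append(sum(window) / len(window))
    return out


def desired_noise_shape(smoothed):
    """Assumption: a flat target shape at the lowest smoothed magnitude."""
    floor = min(smoothed)
    return [floor] * len(smoothed)


def suppression_factors(desired, smoothed, eps=1e-12):
    """Per-bin gain that pulls the smoothed spectrum down to the desired shape, capped at 1."""
    return [min(1.0, d / max(s, eps)) for d, s in zip(desired, smoothed)]


mags = [1.0, 1.0, 4.0, 1.0, 1.0]          # toy magnitude spectrum with one peak
smoothed = smooth_magnitudes(mags)
factors = suppression_factors(desired_noise_shape(smoothed), smoothed)
print(factors)
```

Bins whose smoothed magnitude already sits at the target keep a gain of 1, while bins around the peak are attenuated.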

2.

Patent application
AUDIO PROCESSING SYSTEM, AUDIO PROCESSING DEVICE, AND AUDIO PROCESSING METHOD (Status: pending, published)

Publication number: US20190149916A1

Publication date: 2019-05-16

Application number: US16097935

Filing date: 2017-04-19

Applicant: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.

Inventor: Shinichi TAKAYAMA

IPC classification: H04R3/00, H04R3/04, H04R29/00, G10L21/0224, H04R5/04, H04R5/027

CPC classification: H04R3/005, G10K11/17835, G10L21/0224, G10L2021/02166, H04R3/00, H04R3/04, H04R5/027, H04R5/04, H04R29/00, H04R29/002, H04R29/005, H04R2410/05, H04R2420/05, H04R2499/13

Abstract: An audio processing system is provided with a speaker, a plurality of microphones, and an audio processing device. The audio processing device includes a plurality of filters that allow audio signals of audio collected by the plurality of microphones to pass any respective first bands included in a band of the audio output from the speaker, a plurality of delayers that delay the audio signals passed through the plurality of filters by delay times corresponding to the first bands respectively, a correlation value calculator that calculates a correlation value of a plurality of audio signals delayed respectively by the plurality of delayers and an audio signal of the audio output from the speaker, and a determinator that determines presence or absence of abnormality in the plurality of microphones and the speaker based on the correlation value.

3.

Granted patent
Audio noise estimation and audio noise reduction using multiple microphones (Status: in force)

Publication number: US09966067B2

Publication date: 2018-05-08

Application number: US13911915

Filing date: 2013-06-06

Applicant: Apple Inc.

Inventors: Vasu Iyengar, Sorin V. Dusan

IPC classification: G10L21/00, G10L15/20, G10L21/0232, G10L21/0224, G10L25/78, G10L21/0216

CPC classification: G10L15/20, G10L21/0224, G10L21/0232, G10L2021/02165, G10L2025/783

Abstract: Digital signal processing techniques for automatically reducing audible noise from a sound recording that contains speech. A noise suppression system uses two types of noise estimators, including a more aggressive one and a less aggressive one. Decisions are made on how to select or combine their outputs into a usable noise estimate in different speech and noise conditions. A 2-channel noise estimator is described. Other embodiments are also described and claimed.

4.

Granted patent
Adaptive audio enhancement for multichannel speech recognition (Status: in force)

Publication number: US09886949B2

Publication date: 2018-02-06

Application number: US15392122

Filing date: 2016-12-28

Applicant: Google Inc.

Inventors: Bo Li, Ron J. Weiss, Michiel A. U. Bacchiani, Tara N. Sainath, Kevin William Wilson

IPC classification: G10L15/00, G10L15/16, G10L21/0224, G10L21/0216, G10L15/26

CPC classification: G10L15/16, G10L15/20, G10L15/26, G10L21/0224, G10L2021/02166

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for neural network adaptive beamforming for multichannel speech recognition are disclosed. In one aspect, a method includes the actions of receiving a first channel of audio data corresponding to an utterance and a second channel of audio data corresponding to the utterance. The actions further include generating a first set of filter parameters for a first filter based on the first channel of audio data and the second channel of audio data and a second set of filter parameters for a second filter based on the first channel of audio data and the second channel of audio data. The actions further include generating a single combined channel of audio data. The actions further include inputting the audio data to a neural network. The actions further include providing a transcription for the utterance.
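The step of combining two filtered channels into a single channel can be sketched as a plain filter-and-sum operation. This is only an illustration under stated assumptions: the abstract describes filter parameters generated from the audio itself, whereas the sketch below uses fixed, hand-picked FIR taps, and the helper names `fir_filter` and `filter_and_sum` are hypothetical.

```python
def fir_filter(signal, taps):
    """Causal FIR filter: y[n] = sum_k taps[k] * signal[n - k]."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def filter_and_sum(ch1, ch2, taps1, taps2):
    """Filter each channel with its own taps, then add the results sample by sample."""
    y1 = fir_filter(ch1, taps1)
    y2 = fir_filter(ch2, taps2)
    return [a + b for a, b in zip(y1, y2)]


ch1 = [1.0, 2.0, 3.0, 4.0]
ch2 = [0.0, 1.0, 2.0, 3.0]   # toy data: ch1 arriving one sample later
taps1 = [0.0, 0.5]           # assumed taps: delay ch1 by one sample, weight 0.5
taps2 = [0.5]                # assumed taps: weight 0.5, no delay
combined = filter_and_sum(ch1, ch2, taps1, taps2)
print(combined)
```

With these taps the two channels are time-aligned before summation, so the combined channel reproduces the delayed waveform.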

5.

Patent application
SYSTEM AND METHOD FOR IDENTIFYING LANGUAGE REGISTER (Status: pending, published)

Publication number: US20180018987A1

Publication date: 2018-01-18

Application number: US15650942

Filing date: 2017-07-16

Applicant: Ron Zass

Inventor: Ron Zass

IPC classification: G10L21/028, G10L15/18, H04R3/00, H04R25/00

CPC classification: H04R1/406, A61B5/02055, A61B5/1114, A61B5/1128, A61B5/16, A61B2562/0204, A61B2562/0219, A61N1/36082, G01N2800/28, G06F17/20, G06F17/21, G06K9/00228, G06K9/00275, G06K9/00362, G06K9/00369, G06K9/46, G10L15/1822, G10L17/005, G10L17/26, G10L21/0205, G10L21/0224, G10L21/028, G10L25/63, G10L25/72, G16H50/70, H04R1/265, H04R3/005, H04R5/0335, H04R25/407, H04R2201/023, H04R2225/43

Abstract: System and method for analyzing audio data are provided. The audio data may be analyzed to identify language register. For example, the audio data may be analyzed to identify language register of a selected speaker, such as the language register of a wearer of a wearable audio sensor, of a speaker engaged in conversation with the wearer of the wearable audio sensor, and so forth. For example, the audio data may be analyzed to obtain textual information, and the textual information may be analyzed to identify the language register. Feedback and reports may be provided based on the identified language register.

6.

Patent application
SYSTEM AND METHOD FOR DETECTING ARTICULATION ERRORS (Status: pending, published)

Publication number: US20180018963A1

Publication date: 2018-01-18

Application number: US15650945

Filing date: 2017-07-16

Applicant: Ron Zass

Inventor: Ron Zass

IPC classification: G10L15/18, G06F17/21, G10L21/028, H04R3/00

CPC classification: H04R1/406, A61B5/02055, A61B5/1114, A61B5/1128, A61B5/16, A61B2562/0204, A61B2562/0219, A61N1/36082, G01N2800/28, G06F17/20, G06F17/21, G06K9/00228, G06K9/00275, G06K9/00362, G06K9/00369, G06K9/46, G10L15/1822, G10L17/005, G10L17/26, G10L21/0205, G10L21/0224, G10L21/028, G10L25/63, G10L25/72, G16H50/70, H04R1/265, H04R3/005, H04R5/0335, H04R25/407, H04R2201/023, H04R2225/43

Abstract: System and method for analyzing audio data are provided. The audio data may be analyzed to detect articulation errors. For example, the audio data may be analyzed to detect articulation errors of a selected speaker, such as articulation errors of a wearer of a wearable audio sensor, of a speaker engaged in conversation with the wearer of the wearable audio sensor, and so forth. Feedback and reports may be provided based on the detected articulation errors.

7.

Granted patent
Noise reduction apparatus, noise reduction method, and noise reduction program (Status: in force)

Publication number: US09691407B2

Publication date: 2017-06-27

Application number: US14461311

Filing date: 2014-08-15

Applicant: JVC KENWOOD Corporation

Inventors: Keisuke Oda, Takaaki Yamabe

IPC classification: G10L15/00, G10L21/00, G10L21/02, G10L19/00, H04B15/00, G10L21/0216, G10L21/0224, G10L21/0264

CPC classification: G10L21/0216, G10L21/0224, G10L21/0264

Abstract: A noise reduction apparatus according to the present invention includes: a sudden sound information storage unit that stores, as sudden sound information, an input signal that is input before a current input signal is input, the input signal having a signal level of voice components equal to or smaller than a predetermined threshold and including a sudden sound to be suppressed; a phase difference calculation unit that calculates a phase difference between the sudden sound information and a sudden sound in the current input signal based on a maximum value of a correlation value between the sudden sound information and the current input signal; an addition signal generation unit that shifts a phase of the sudden sound information based on the phase difference to generate an addition signal; and a sudden sound suppression unit that adds the addition signal and the current input signal to output an output signal.
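The correlate, shift, and add sequence in this abstract can be sketched in a few lines. The sketch makes assumptions the abstract does not state: the phase difference is treated as an integer sample lag, the addition signal is the stored sudden sound with inverted sign so that the sum cancels it, and the names `best_lag` and `suppress` are hypothetical.

```python
def best_lag(template, current):
    """Return the sample offset at which the stored sudden sound correlates most with the input."""
    best, best_score = 0, float("-inf")
    for lag in range(len(current) - len(template) + 1):
        score = sum(t * current[lag + i] for i, t in enumerate(template))
        if score > best_score:
            best, best_score = lag, score
    return best


def suppress(template, current):
    """Shift the stored sudden sound to the best lag, invert it, and add it to the input."""
    lag = best_lag(template, current)
    addition = [0.0] * len(current)
    for i, t in enumerate(template):
        addition[lag + i] = -t  # assumption: sign inversion makes the addition cancel the sound
    return [c + a for c, a in zip(current, addition)]


template = [1.0, -2.0, 1.0]                  # stored sudden sound (toy click)
current = [0.0, 0.0, 1.0, -2.0, 1.0, 0.0]    # same click, two samples into the frame
print(best_lag(template, current))
print(suppress(template, current))
```

In this toy case the click in the current frame matches the stored one exactly, so the output is silent; with real signals the cancellation would only be partial.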

8.

Granted patent
Method and apparatus for enhancing the modulation index of speech sounds passed through a digital vocoder (Status: in force)

Publication number: US09640185B2

Publication date: 2017-05-02

Application number: US14104777

Filing date: 2013-12-12

Applicant: MOTOROLA SOLUTIONS, INC

Inventors: William M Kushner, Robert J Novorita

IPC classification: G10L21/02, G10L21/00, G10L19/00, G10L19/02, G10L21/0232, G10L21/0224, G10L19/26

CPC classification: G10L19/0019, G10L19/02, G10L19/26, G10L21/02, G10L21/0224, G10L21/0232

Abstract: A method and apparatus for enhancing modulation of certain speech sounds, such as trill sounds, are provided for radios which utilize digital vocoders. A digitized speech stream is sampled and the sampling is adjusted to determine, detect and enhance trill nulls in the digitized voice stream by one or more of: frame shifting the digitized speech input stream prior to vocoding, time expanding a digitized speech stream prior to vocoding, time compressing a digitized speech output stream after vocoding, and/or modulation enhancement and filtering of the digitized speech output stream after vocoding.

9.

Granted patent
Source separation using nonnegative matrix factorization with an automatically determined number of bases (Status: in force)

Publication number: US09553681B2

Publication date: 2017-01-24

Application number: US14624220

Filing date: 2015-02-17

Applicant: Adobe Systems Incorporated

Inventor: Matthew Douglas Hoffman

IPC classification: H04B15/00, G10L21/02, G10L21/0224, G10L21/0232, G10L21/0208

CPC classification: H04B15/00, G10L21/02, G10L21/0208, G10L21/0224, G10L21/0232, G10L21/0272

Abstract: Methods and systems for source separation based on determining a number of bases for a nonnegative matrix factorization (NMF) model are disclosed. A method includes receiving, at a computing device, a mixed signal including a combination of first signal data and second signal data. The method also includes generating, by the computing device, a time-frequency representation of the mixed signal. The method further includes determining, by applying a structured stochastic variational inference (SSVI) algorithm to the NMF model, a number of bases for a dictionary of signal-related components of the mixed signal. The method uses the number of bases and the time-frequency representation to construct the dictionary and an activation matrix of weights, the weights indicating how active each one of the signal-related components is at a given time. The method then uses the dictionary and the activation matrix to separate the first signal data from the second signal data.
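The dictionary and activation matrix mentioned in this abstract can be illustrated with a basic NMF sketch. It deliberately does not implement the SSVI step that determines the number of bases: the number of bases is fixed by hand, and the factorization uses the standard multiplicative updates for squared error instead. The function names and the toy matrix `V` are assumptions made for illustration.

```python
import random


def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frob_error(V, W, H):
    """Squared Frobenius distance between V and the product W H."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx))


def nmf(V, num_bases, iters=200, seed=0, eps=1e-9):
    """Factor V (freq x time) into dictionary W and activations H, both nonnegative."""
    rng = random.Random(seed)
    F, T = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(num_bases)] for _ in range(F)]
    H = [[rng.random() + 0.1 for _ in range(T)] for _ in range(num_bases)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
             for rh, rn, rd in zip(H, num, den)]
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * n / (d + eps) for w, n, d in zip(rw, rn, rd)]
             for rw, rn, rd in zip(W, num, den)]
    return W, H


# Toy nonnegative "spectrogram" built from two components.
V = [[1.0, 0.0, 1.0],
     [0.0, 2.0, 2.0],
     [1.0, 2.0, 3.0]]
W, H = nmf(V, num_bases=2)
print(frob_error(V, W, H))
```

Each column of `W` plays the role of one dictionary component, and the matching row of `H` indicates how active that component is at each time step; a source estimate would be the product of a chosen subset of columns and rows.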

10.

Patent application
Voice Activity Detection Method and Method Used for Voice Activity Detection and Apparatus Thereof (Status: in force)

Publication number: US20170004840A1

Publication date: 2017-01-05

Application number: US14754714

Filing date: 2015-06-30

Applicant: ZTE CORPORATION

Inventors: Dongping JIANG, Hao YUAN, Changbao ZHU

IPC classification: G10L21/02, G10L25/81, G10L25/18, G10L21/0232, G10L25/21, G10L25/48, G10L21/0224, G10L25/84, G10L15/02

CPC classification: G10L21/0205, G10L15/02, G10L21/0224, G10L21/0232, G10L25/18, G10L25/21, G10L25/48, G10L25/78, G10L25/81, G10L25/84

Abstract: The present document relates to a voice activity detection (VAD) method, methods used for voice activity detection, and apparatus thereof. The VAD method includes: obtaining sub-band signals and spectrum amplitudes of a current frame; computing values of an energy feature and a spectral centroid feature of the current frame according to the sub-band signals; computing a signal to noise ratio parameter of the current frame according to a background noise energy estimated from a previous frame, an energy of SNR sub-bands and an energy feature of the current frame; and computing a VAD decision result according to a tonality signal flag, a signal to noise ratio parameter, a spectral centroid feature, and a frame energy feature. The methods and apparatus of the present document can improve the accuracy of non-stationary noise (such as office noise) and music detection.
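Three of the quantities named in this abstract (frame energy, spectral centroid, and a signal to noise ratio against a background noise energy) can be sketched as follows. The definitions are common textbook forms and are assumptions, not the patent's formulas; the tonality flag is omitted, the decision uses only an assumed 6 dB SNR threshold, and all function names are hypothetical.

```python
import math


def frame_energy(subband_energies):
    """Total energy of the frame across all sub-bands."""
    return sum(subband_energies)


def spectral_centroid(subband_energies):
    """Energy-weighted mean sub-band index (0-based); 0.0 for a silent frame."""
    total = sum(subband_energies)
    if total == 0:
        return 0.0
    return sum(k * e for k, e in enumerate(subband_energies)) / total


def snr_db(energy, noise_energy, eps=1e-12):
    """Frame energy relative to the background noise energy, in decibels."""
    return 10.0 * math.log10((energy + eps) / (noise_energy + eps))


def vad_decision(subband_energies, noise_energy, snr_threshold_db=6.0):
    """Assumed rule: flag the frame as active when its SNR exceeds the threshold."""
    return snr_db(frame_energy(subband_energies), noise_energy) > snr_threshold_db


speech_like = [1.0, 1.0, 2.0]
noise_like = [0.05, 0.05, 0.0]
background = 0.1  # stands in for a noise energy estimated from a previous frame
print(spectral_centroid(speech_like))
print(vad_decision(speech_like, background), vad_decision(noise_like, background))
```

A fuller decision rule would also weigh the centroid and a tonality flag, as the abstract lists, which is what helps separate music and non-stationary noise from speech.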