DYNAMIC NOISE SUPPRESSION AND OPERATIONS FOR NOISY SPEECH SIGNALS

    Publication No.: US20190206420A1

    Publication date: 2019-07-04

    Application No.: US16226383

    Filing date: 2018-12-19

    Abstract: Systems and methods for noise reduction are provided including operations for noisy speech signals, such as speech signals that are subject to speech processing, speech recognition and speech transmission for voice communication purposes. In one embodiment, a system for noise suppression includes an input smoothing filter to smooth magnitudes of the input spectrum, a desired noise shape determination block configured to determine a desired noise shape of the noise spectrum dependent on the smoothed-magnitude input spectrum, and a suppression factors determination block configured to determine a set of suppression factors based on the desired noise shape and the smoothed-magnitude input spectrum. In one embodiment, a filter coefficient determination block is configured to determine noise suppression filter coefficients from the desired noise shape of the noise spectrum. Embodiments are also directed to systems and methods for noise reduction. System configurations and processes are provided for formant detection.
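
The three blocks named in this abstract (input smoothing, desired noise shape, suppression factors) can be illustrated with a minimal sketch. The abstract does not give formulas, so everything concrete here is an assumption: the moving-average smoother, the flat noise shape set to the smallest smoothed magnitude, and the gain rule `shape / smoothed` capped at 1. All function names are hypothetical.

```python
def smooth(mags, width=3):
    """Moving-average smoothing across frequency bins (assumed smoother).

    Edge bins use a shorter window.
    """
    half = width // 2
    out = []
    for k in range(len(mags)):
        window = mags[max(0, k - half):k + half + 1]
        out.append(sum(window) / len(window))
    return out


def desired_noise_shape(smoothed, floor_scale=1.0):
    """Hypothetical rule: a flat target at the smallest smoothed magnitude."""
    target = floor_scale * min(smoothed)
    return [target] * len(smoothed)


def suppression_factors(smoothed, shape, eps=1e-12):
    """Per-bin gain pulling the smoothed magnitude toward the shape.

    The gain is capped at 1 so that no bin is amplified (assumption).
    """
    return [min(1.0, s / (m + eps)) for m, s in zip(smoothed, shape)]


# Toy magnitude spectrum with one strong peak at bin 3.
input_mags = [1.0, 1.2, 0.9, 5.0, 1.1, 1.0]
smoothed = smooth(input_mags)
shape = desired_noise_shape(smoothed)
gains = suppression_factors(smoothed, shape)
output_mags = [g * m for g, m in zip(gains, input_mags)]
```

In this toy rule the gains depend only on the smoothed spectrum, so an isolated peak is attenuated together with its neighbours; a real system would presumably treat speech bins differently.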

    Adaptive audio enhancement for multichannel speech recognition

    Publication No.: US09886949B2

    Publication date: 2018-02-06

    Application No.: US15392122

    Filing date: 2016-12-28

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for neural network adaptive beamforming for multichannel speech recognition are disclosed. In one aspect, a method includes the actions of receiving a first channel of audio data corresponding to an utterance and a second channel of audio data corresponding to the utterance. The actions further include generating a first set of filter parameters for a first filter based on the first channel of audio data and the second channel of audio data and a second set of filter parameters for a second filter based on the first channel of audio data and the second channel of audio data. The actions further include generating a single combined channel of audio data. The actions further include inputting the audio data to a neural network. The actions further include providing a transcription for the utterance.
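
One common way to turn two filtered channels into a single combined channel is filter-and-sum beamforming; the abstract does not say how the combination is done, so that choice is an assumption. In the sketch below the filter taps are fixed by hand, whereas the abstract describes generating them from both channels. The names `fir_filter` and `filter_and_sum` are hypothetical.

```python
def fir_filter(signal, taps):
    """Causal FIR convolution, truncated to the input length."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def filter_and_sum(channel_1, channel_2, taps_1, taps_2):
    """Filter each channel with its own taps, then add the results."""
    y1 = fir_filter(channel_1, taps_1)
    y2 = fir_filter(channel_2, taps_2)
    return [a + b for a, b in zip(y1, y2)]


# Channel 2 is channel 1 delayed by one sample (toy two-microphone case).
channel_1 = [0.0, 1.0, 0.0, -1.0, 0.0]
channel_2 = [0.0, 0.0, 1.0, 0.0, -1.0]

# Hand-picked taps (assumption): delay channel 1 by one sample so the two
# channels line up, and weight each channel by 0.5.
taps_1 = [0.0, 0.5]
taps_2 = [0.5]

combined = filter_and_sum(channel_1, channel_2, taps_1, taps_2)
```

With these taps the two channels add coherently, so `combined` reproduces the aligned waveform at full amplitude.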

    Noise reduction apparatus, noise reduction method, and noise reduction program

    Publication No.: US09691407B2

    Publication date: 2017-06-27

    Application No.: US14461311

    Filing date: 2014-08-15

    Abstract: A noise reduction apparatus according to the present invention includes: a sudden sound information storage unit that stores, as sudden sound information, an input signal that was input before a current input signal, the stored input signal having a signal level of voice components equal to or smaller than a predetermined threshold and including a sudden sound to be suppressed; a phase difference calculation unit that calculates a phase difference between the sudden sound information and a sudden sound in the current input signal based on a maximum value of a correlation value between the sudden sound information and the current input signal; an addition signal generation unit that shifts a phase of the sudden sound information based on the phase difference to generate an addition signal; and a sudden sound suppression unit that adds the addition signal and the current input signal to output an output signal.
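
The correlate, shift, and add sequence in this abstract can be sketched in the time domain. Two assumptions are made here that the abstract does not state: the phase difference is treated as a whole-sample lag, and the addition signal is the sign-inverted stored sound, so that adding it cancels the sudden sound. Function names are hypothetical.

```python
def best_lag(template, signal):
    """Lag (in samples) where the template correlates most with the signal."""
    best, best_score = 0, float("-inf")
    for lag in range(len(signal) - len(template) + 1):
        score = sum(t * signal[lag + i] for i, t in enumerate(template))
        if score > best_score:
            best, best_score = lag, score
    return best


def make_addition_signal(template, lag, length):
    """Shift the template by `lag` and invert it (assumed anti-phase)."""
    addition = [0.0] * length
    for i, t in enumerate(template):
        addition[lag + i] = -t
    return addition


# Stored sudden sound, captured earlier when little voice was present.
stored_sudden_sound = [1.0, -2.0, 1.0]
# Current input: the same sudden sound starting at sample 2, plus residue.
current_input = [0.1, 0.0, 1.0, -2.0, 1.0, 0.1]

lag = best_lag(stored_sudden_sound, current_input)
addition = make_addition_signal(stored_sudden_sound, lag, len(current_input))
output = [x + a for x, a in zip(current_input, addition)]
```

Here the correlation peaks at a lag of 2 samples, and adding the shifted, inverted template removes the sudden sound while leaving the other samples untouched.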

    Source separation using nonnegative matrix factorization with an automatically determined number of bases
    Type: Granted invention patent (status: in force)

    Publication No.: US09553681B2

    Publication date: 2017-01-24

    Application No.: US14624220

    Filing date: 2015-02-17

    Abstract: Methods and systems for source separation based on determining a number of bases for a nonnegative matrix factorization (NMF) model are disclosed. A method includes receiving, at a computing device, a mixed signal including a combination of first signal data and second signal data. The method also includes generating, by the computing device, a time-frequency representation of the mixed signal. The method further includes determining, by applying a structured stochastic variational inference (SSVI) algorithm to the NMF model, a number of bases for a dictionary of signal-related components of the mixed signal. The method uses the number of bases and the time-frequency representation to construct the dictionary and an activation matrix of weights, the weights indicating how active each one of the signal-related components is at a given time. The method then uses the dictionary and the activation matrix to separate the first signal data from the second signal data.
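
The dictionary and activation matrix described here can be illustrated with plain NMF. This sketch does not implement the SSVI step: the number of bases is fixed by hand, and the factorization uses the standard multiplicative updates for the squared-error objective, which is a swapped-in technique rather than the patent's method. The toy spectrogram and all names are hypothetical.

```python
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error between V and W @ H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, num_bases, iterations=200, seed=0, eps=1e-9):
    """Factor V (freq x time) into dictionary W and activations H."""
    rng = random.Random(seed)
    rows, cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(num_bases)] for _ in range(rows)]
    h = [[rng.random() + 0.1 for _ in range(cols)] for _ in range(num_bases)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(cols)]
             for k in range(num_bases)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps)
              for k in range(num_bases)] for i in range(rows)]
    return w, h


# Toy magnitude spectrogram: one source in rows 0-1, another in rows 2-3,
# both active in the third time frame.
V = [[2.0, 0.0, 2.0, 0.0],
     [2.0, 0.0, 2.0, 0.0],
     [0.0, 3.0, 3.0, 0.0],
     [0.0, 3.0, 3.0, 0.0]]

w, h = nmf(V, num_bases=2)  # number of bases chosen by hand (assumption)

# One estimated component per basis: outer product of column k of W
# with row k of H.
components = [[[w[i][k] * h[k][j] for j in range(len(V[0]))]
               for i in range(len(V))] for k in range(2)]
```

Each row of `h` shows how active one dictionary column is over time. Separating real sources would also require grouping bases by source, which is omitted here.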

    Voice Activity Detection Method and Method Used for Voice Activity Detection and Apparatus Thereof
    Type: Invention application (status: in force)

    Publication No.: US20170004840A1

    Publication date: 2017-01-05

    Application No.: US14754714

    Filing date: 2015-06-30

    Applicant: ZTE CORPORATION

    Abstract: The present document relates to a voice activity detection (VAD) method, methods used for voice activity detection, and apparatus thereof. The VAD method includes: obtaining sub-band signals and spectrum amplitudes of a current frame; computing values of an energy feature and a spectral centroid feature of the current frame according to the sub-band signals; computing a signal to noise ratio parameter of the current frame according to a background noise energy estimated from a previous frame, an energy of SNR sub-bands and an energy feature of the current frame; and computing a VAD decision result according to a tonality signal flag, a signal to noise ratio parameter, a spectral centroid feature, and a frame energy feature. The methods and apparatus of the present document can improve the accuracy of detecting non-stationary noise (such as office noise) and music.
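
The features listed in this abstract (frame energy, spectral centroid, signal to noise ratio) can be illustrated with a minimal sketch. The abstract gives no formulas or thresholds, so the feature definitions, the threshold values, and the decision rule below are all assumptions, including the choice to treat a set tonality flag as "active". Function names are hypothetical.

```python
import math


def frame_energy(subband_energies):
    """Total energy of the frame as the sum of sub-band energies."""
    return sum(subband_energies)


def spectral_centroid(spectrum_amplitudes):
    """Amplitude-weighted mean bin index (one common centroid definition)."""
    total = sum(spectrum_amplitudes)
    if total == 0:
        return 0.0
    return sum(k * a for k, a in enumerate(spectrum_amplitudes)) / total


def snr_db(energy, noise_energy, eps=1e-12):
    """Frame energy relative to the background noise estimate, in dB."""
    return 10.0 * math.log10((energy + eps) / (noise_energy + eps))


def vad_decision(spectrum_amplitudes, noise_energy, tonal_flag=False,
                 snr_threshold_db=6.0, centroid_threshold=1.0):
    """Hypothetical rule: active if tonal, or if SNR and centroid are high."""
    subband_energies = [a * a for a in spectrum_amplitudes]
    energy = frame_energy(subband_energies)
    snr = snr_db(energy, noise_energy)
    centroid = spectral_centroid(spectrum_amplitudes)
    return tonal_flag or (snr > snr_threshold_db
                          and centroid > centroid_threshold)


# Background noise energy as it might be estimated from a previous frame.
noise_energy = 0.01
speech_frame = [0.1, 0.8, 1.0, 0.6, 0.2]
noise_frame = [0.05, 0.04, 0.05, 0.04, 0.05]

speech_active = vad_decision(speech_frame, noise_energy)
noise_active = vad_decision(noise_frame, noise_energy)
```

With these toy values the speech-like frame is marked active and the low-level frame is not; the thresholds would need tuning on real data.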
