-
Publication number: US10013997B2
Publication date: 2018-07-03
Application number: US14938816
Application date: 2015-11-11
Applicant: Cirrus Logic Inc.
Inventor: Erik Sherwood, Carl Grundstrom
IPC: G10L21/0232, G10L21/0208, G10L21/0216, G10L25/84
CPC classification number: G10L21/0232, G10L21/0208, G10L25/84, G10L2021/02165
Abstract: A method for adjusting a degree of filtering applied to an audio signal includes modeling a probability density function (PDF) of a fast Fourier transform (FFT) coefficient of a primary channel and a reference channel of the audio signal, and maximizing at least one of the PDFs to provide a discriminative relevance difference (DRD) between a noise magnitude estimate of the reference channel and a noise magnitude estimate of the primary channel. The method further includes emphasizing the primary channel when the spectral magnitude of the primary channel is stronger than the spectral magnitude of the reference channel, and deemphasizing the primary channel when the spectral magnitude of the reference channel is stronger than the spectral magnitude of the primary channel. The emphasizing and deemphasizing include computing a multiplicative rescaling factor and either applying that factor to a gain computed in a prior stage of a speech enhancement filter chain, when there is a prior stage, or directly applying a gain when there is no prior stage.
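Illustrative sketch (not taken from the patent): the emphasize/deemphasize step described in the abstract can be pictured as a per-bin factor that exceeds 1 when the primary channel is stronger and falls below 1 when the reference channel is stronger. The abstract does not give the PDF model or the DRD formula, so the sketch below substitutes a plain clipped power-law magnitude ratio for them. The function names, the `exponent`, `lo`, and `hi` parameters, and their defaults are all assumptions made for illustration.

```python
def rescale_factor(primary_mag, reference_mag,
                   exponent=0.5, lo=0.25, hi=2.0, eps=1e-12):
    """Hypothetical per-bin rescaling factor.

    Assumption: a clipped power of the magnitude ratio stands in for the
    patent's PDF-based DRD, which the abstract does not specify.
    """
    ratio = (primary_mag + eps) / (reference_mag + eps)
    # > 1 when the primary channel is stronger, < 1 when the reference is.
    return min(hi, max(lo, ratio ** exponent))


def apply_stage(primary_mags, reference_mags, prior_gains=None):
    """Combine the factor with a prior-stage gain, or use it directly."""
    factors = [rescale_factor(p, r)
               for p, r in zip(primary_mags, reference_mags)]
    if prior_gains is None:
        # No prior stage in the filter chain: the factor acts as the gain.
        return factors
    # Prior stage present: multiplicatively rescale its gain.
    return [g * f for g, f in zip(prior_gains, factors)]


if __name__ == "__main__":
    primary = [4.0, 1.0, 1.0]     # spectral magnitudes, primary channel
    reference = [1.0, 1.0, 4.0]   # spectral magnitudes, reference channel
    print(apply_stage(primary, reference))
    print(apply_stage(primary, reference, prior_gains=[0.5, 0.5, 0.5]))
```

Clipping the factor to a fixed interval is a design choice of this sketch, intended to keep a single noisy bin from dominating the output; the patent may bound the factor differently or not at all.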
-
Publication number: US20180240472A1
Publication date: 2018-08-23
Application number: US15960140
Application date: 2018-04-23
Applicant: Cirrus Logic, Inc.
Inventor: Earl Vickers, Fredrick D. Geiger, Erik Sherwood
IPC: G10L21/0264, G10L25/60, G10L21/0224, G10L25/84, G10L25/78, G10L25/30, G10L15/06
CPC classification number: G10L21/0264, G10L21/0224, G10L25/30, G10L25/60, G10L25/78, G10L25/84, G10L2015/0636
Abstract: A “running range normalization” method includes computing running estimates of the range of values of features useful for voice activity detection (VAD) and normalizing the features by mapping them to a desired range. Running range normalization includes computing running estimates of the minimum and maximum values of VAD features and normalizing the feature values by mapping the original range to a desired range. Smoothing coefficients are optionally selected to directionally bias a rate of change of at least one of the running estimates of the minimum and maximum values. The normalized VAD feature parameters are used to train a machine learning algorithm to detect voice activity, and the trained machine learning algorithm is used to isolate or enhance the speech component of the audio data.
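Illustrative sketch (not taken from the patent): running range normalization as described in the abstract can be approximated with two exponentially smoothed trackers, one for the minimum and one for the maximum, each using a different smoothing coefficient depending on the direction of change. The class name, the `fast` and `slow` coefficients, their default values, and the choice to clip the output are assumptions made for illustration; the abstract does not specify them.

```python
class RunningRangeNormalizer:
    """Hypothetical running min/max tracker with directional smoothing."""

    def __init__(self, fast=0.5, slow=0.01, lo=0.0, hi=1.0, eps=1e-9):
        self.fast = fast    # coefficient when a new value extends the range
        self.slow = slow    # coefficient when the estimate relaxes inward
        self.lo, self.hi = lo, hi   # desired output range
        self.eps = eps
        self.min_est = None
        self.max_est = None

    def update(self, x):
        """Update the running estimates with x and return x normalized."""
        if self.min_est is None:
            self.min_est = self.max_est = x
        else:
            # Directional bias: follow new extremes quickly, forget slowly.
            a_min = self.fast if x < self.min_est else self.slow
            self.min_est += a_min * (x - self.min_est)
            a_max = self.fast if x > self.max_est else self.slow
            self.max_est += a_max * (x - self.max_est)
        span = self.max_est - self.min_est
        if span < self.eps:
            # Range not yet established: return the midpoint.
            return 0.5 * (self.lo + self.hi)
        t = (x - self.min_est) / span
        t = min(1.0, max(0.0, t))   # clip values outside the tracked range
        return self.lo + t * (self.hi - self.lo)


if __name__ == "__main__":
    norm = RunningRangeNormalizer()
    for value in [0.0, 10.0, 5.0, 0.0, 10.0]:
        print(norm.update(value))
```

In a VAD pipeline of the kind the abstract describes, one such normalizer would presumably run per feature, and the normalized values would feed the machine learning stage; that wiring is not shown here.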
-