Abstract:
Techniques for processing audio signals include removing noise from the audio signals or otherwise clarifying the audio signals before they are output. The disclosed techniques may apply minimum mean squared error (MMSE) analyses to audio signals received from a primary microphone and at least one reference microphone, and may use those analyses to reduce or eliminate noise from the audio signals received by the primary microphone. Optionally, confidence intervals may be assigned to different frequency bands of an audio signal, with each confidence interval corresponding to a likelihood that its respective frequency band includes targeted audio, and each confidence interval representing the contribution of its respective frequency band to a reconstructed audio signal from which noise has been removed.
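As a rough illustration of the two ideas in this abstract, the Python sketch below estimates a single MMSE (least-squares) coefficient relating the reference microphone to the primary microphone, subtracts the scaled reference from the primary signal, and then scales per-band values by a confidence weight. It is a minimal sketch under stated assumptions, not the disclosed implementation: the names `mmse_coefficient`, `cancel_noise`, and `weight_bands` are hypothetical, the single-coefficient time-domain model is a deliberate simplification, and each confidence is modeled as one number between 0 and 1.

```python
def mmse_coefficient(primary, reference):
    """Return the scalar w that minimizes sum((p - w * r) ** 2).

    Assumption: the noise in the primary signal is a scaled copy of the
    reference signal. This is a simplification chosen for illustration.
    """
    energy = sum(r * r for r in reference)
    if energy == 0.0:
        return 0.0  # a silent reference carries no information about the noise
    return sum(p * r for p, r in zip(primary, reference)) / energy


def cancel_noise(primary, reference):
    """Subtract the MMSE-scaled reference from the primary signal."""
    w = mmse_coefficient(primary, reference)
    return [p - w * r for p, r in zip(primary, reference)]


def weight_bands(band_values, confidences):
    """Scale each frequency band by its confidence, clamped to [0, 1].

    Assumption: a confidence is a single weight per band, where 1 keeps the
    band unchanged and 0 removes it from the reconstructed signal.
    """
    return [min(max(c, 0.0), 1.0) * v for v, c in zip(band_values, confidences)]


# Toy data: the "speech" and "noise" sequences are uncorrelated by construction.
speech = [1.0, -1.0, 1.0, -1.0]
noise = [1.0, 1.0, -1.0, -1.0]
primary = [s + 0.5 * n for s, n in zip(speech, noise)]

cleaned = cancel_noise(primary, noise)
print(cleaned)
print(weight_bands([2.0, 4.0], [1.0, 0.25]))
```

In this toy case the reference contains only noise and is uncorrelated with the speech, so the subtraction recovers the speech samples. With real microphones the reference also picks up some speech, and the coefficient would normally be estimated per frequency band and updated over time.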
Abstract:
A “running range normalization” method includes computing running estimates of the range of values of features useful for voice activity detection (VAD) and normalizing the features by mapping them to a desired range. Specifically, the method computes running estimates of the minimum and maximum values of the VAD features and normalizes the feature values by mapping the estimated range to the desired range. Smoothing coefficients are optionally selected to directionally bias the rate of change of at least one of the running estimates of the minimum and maximum values. The normalized VAD feature parameters are used to train a machine learning algorithm to detect voice activity, and the trained algorithm is then used to isolate or enhance the speech component of the audio data.
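The running estimates and the directional bias described in this abstract can be sketched in Python as follows. This is a minimal illustration under assumptions, not the disclosed method: the class name `RunningRangeNormalizer`, the exponential smoothing form, the parameter names `fast` and `slow`, and their default values are all hypothetical. Here the minimum estimate follows new lows quickly and drifts upward slowly, and the maximum estimate does the reverse.

```python
class RunningRangeNormalizer:
    """Map a stream of feature values into [target_min, target_max]."""

    def __init__(self, target_min=0.0, target_max=1.0, fast=0.5, slow=0.01):
        # Assumption: exponential smoothing with two coefficients, where
        # `fast` applies when an estimate is pushed outward and `slow`
        # applies when it relaxes back toward the data.
        if not 0.0 < slow <= fast <= 1.0:
            raise ValueError("require 0 < slow <= fast <= 1")
        self.target_min = target_min
        self.target_max = target_max
        self.fast = fast
        self.slow = slow
        self.run_min = None
        self.run_max = None

    def update(self, x):
        """Update the running range with x and return the normalized x."""
        x = float(x)
        if self.run_min is None:
            # The first sample initializes both estimates.
            self.run_min = self.run_max = x
        else:
            # Directional bias: pick the coefficient by direction of change.
            a_min = self.fast if x < self.run_min else self.slow
            a_max = self.fast if x > self.run_max else self.slow
            self.run_min += a_min * (x - self.run_min)
            self.run_max += a_max * (x - self.run_max)

        span = self.run_max - self.run_min
        if span <= 1e-12:
            # The range is not yet known, so return the middle of the target.
            return 0.5 * (self.target_min + self.target_max)

        # Map the running range onto the target range, clamping outliers.
        unit = min(max((x - self.run_min) / span, 0.0), 1.0)
        return self.target_min + unit * (self.target_max - self.target_min)


norm = RunningRangeNormalizer()
for value in [2.0, 4.0, 3.0, 0.0, 5.0]:
    print(round(norm.update(value), 3))
```

Clamping keeps the output inside the target range even while the estimates are still catching up with the data. One normalizer instance would be kept per VAD feature, so that each feature is scaled by its own running range.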
Abstract:
A data interpretation and separation system identifies data elements within a data set that have common features and separates those data elements from other data elements that do not share such features. Commonalities relative to methods and/or rates of change within a data set may be used to determine which elements share common features. The commonalities may be determined autonomously by referencing data elements within the data set, and need not be matched against algorithmic or predetermined definitions. Interpreted and separated data may be used to reconstruct an output that includes only separated data. Such reconstruction may be non-destructive. Interpreted and separated data may also be used to retroactively build on existing element sets associated with a particular source.