Abstract:
A method and an apparatus for processing a sound signal in which a useful signal and an interference signal are specified, the sound signal being transformed into the frequency domain and a change in the profile of the frequency being represented by an envelope for at least one frequency over time. By segmenting the envelope, a maximum is obtained for each segment, and the smallest maximum, weighted by a factor, is subtracted from the sound signal. It is also possible to take account of the minimum for the purpose of reducing the interference signal.
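The segment-and-subtract step can be illustrated with a minimal sketch. It assumes the envelope of one frequency is already available as a list of magnitudes per time frame. The function names, the fixed segment length, and the clamping of negative results to zero are assumptions made for illustration and are not taken from the abstract.

```python
def smallest_segment_maximum(envelope, segment_len):
    """Split the envelope into segments and return the smallest per-segment maximum."""
    segments = [envelope[i:i + segment_len]
                for i in range(0, len(envelope), segment_len)]
    return min(max(segment) for segment in segments)


def reduce_interference(envelope, segment_len, weight):
    """Subtract the weighted smallest maximum from every envelope value.

    Clamping at zero is an assumption; the abstract does not say how
    negative results are handled.
    """
    offset = weight * smallest_segment_maximum(envelope, segment_len)
    return [max(value - offset, 0.0) for value in envelope]


# Hypothetical envelope of one frequency over six frames.
envelope = [2.0, 3.0, 9.0, 8.0, 2.0, 1.0]
cleaned = reduce_interference(envelope, segment_len=2, weight=0.5)
```

Here the segment maxima are 3.0, 9.0 and 2.0, so the smallest maximum is 2.0 and, with a weight of 0.5, an offset of 1.0 is subtracted from each frame.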
Abstract:
An automated speech recognition filter is disclosed. The automated speech recognition filter device provides a speech signal to an automated speech platform that approximates an original speech signal as spoken into a transceiver by a user. In providing the speech signal, the automated speech recognition filter determines various models representative of a cumulative signal degradation of the original speech signal from various devices along a transmission signal path and a reception signal path between the transceiver and a device housing the filter. The automated speech platform can thereby provide an audio signal corresponding to a context of the original speech signal.
Abstract:
A voice recognition system for use with a communication system having an incoming line carrying an incoming signal from a first end to a second end operably attached to a speaker and an outgoing line carrying an outgoing signal from a microphone near the speaker. A first speech recognition unit (SRU) detects selected incoming words and a second SRU detects outgoing words. A comparator/signal generator compares the outgoing word with the incoming word and outputs the outgoing word when the outgoing word does not match the incoming word. The first SRU may be delayed relative to the second SRU. The SRUs may also search only for selected words in a template, or may ignore words which are first detected by the other SRU. A signaler may also provide a signal indicating inclusion of one of the selected words in a known incoming signal, with an SRU being responsive to that signal to ignore the included command word in the template for a selected period of time.
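The comparator behavior can be sketched as follows. The sketch assumes each SRU reports at most one detected word per time slot, with None meaning that nothing was detected. The function names and the slot-based pairing are hypothetical, and the relative delay between the SRUs is not modeled.

```python
def compare_words(incoming_word, outgoing_word):
    """Pass the outgoing word through unless it matches the incoming word."""
    if outgoing_word is None:
        return None  # nothing detected on the outgoing line
    if outgoing_word == incoming_word:
        return None  # likely the speaker's output picked up by the microphone
    return outgoing_word


def filter_outgoing(incoming_slots, outgoing_slots):
    """Apply the comparator slot by slot and keep only the accepted words."""
    accepted = []
    for incoming_word, outgoing_word in zip(incoming_slots, outgoing_slots):
        word = compare_words(incoming_word, outgoing_word)
        if word is not None:
            accepted.append(word)
    return accepted


# Hypothetical detections: "call" appears on both lines and is suppressed.
accepted = filter_outgoing(["call", None, "stop"], ["call", "dial", None])
```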
Abstract:
A recognition system (10) incorporates a filterbank analyser (16) producing successive data vectors of energy values for twenty-six frequency intervals in a speech signal. A unit (18) compensates for spectral distortion in each vector. Compensated vectors undergo a transformation into feature vectors with twelve dimensions and are matched with hidden Markov model states in a computer (24). Each matched model state has a mean value which is an estimate of the speech feature vector. A match inverter (28) produces an estimate of the speech data vector in frequency space by a pseudo-inverse transformation. It includes information which will be lost in a later transformation to frequency space. The estimated data vector is compared with its associated speech signal data vector, and infinite impulse response filters (44) average their difference with others. Averaged difference vectors so produced are used by the unit (18) in compensation of speech signal data vectors.
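The averaging of difference vectors can be illustrated with a first-order recursive (infinite impulse response) average applied per frequency channel. The filter order, the smoothing constant alpha, the sign convention of the difference, and the function name are all assumptions; the abstract states only that infinite impulse response filters average the differences.

```python
def update_average(average, estimated, observed, alpha=0.5):
    """One step of a first-order IIR average of (estimated - observed) per channel."""
    return [alpha * a + (1.0 - alpha) * (e - o)
            for a, e, o in zip(average, estimated, observed)]


# Hypothetical two-channel example with two successive frames.
average = [0.0, 0.0]
average = update_average(average, estimated=[3.0, 5.0], observed=[1.0, 1.0])
first_step = list(average)
average = update_average(average, estimated=[5.0, 1.0], observed=[1.0, 1.0])
```

A larger alpha makes the average respond more slowly to new frames, which suits distortion that changes gradually.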
Abstract:
An adaptive noise suppression system includes an input A/D converter, an analyzer, a filter, and an output D/A converter. The analyzer includes both feed-forward and feedback signal paths that allow it to compute a filtering coefficient, which is input to the filter. In these paths, feed-forward signals are processed by a signal to noise ratio estimator, a normalized coherence estimator, and a coherence mask. Also, feedback signals are processed by an auditory mask estimator. These two signal paths are coupled together via a noise suppression filter estimator. A method according to the present invention includes active signal processing to preserve speech-like signals and suppress incoherent noise signals. After a signal is processed in the feed-forward and feedback paths, the noise suppression filter estimator then outputs a filtering coefficient signal to the filter for filtering the noise out of the speech and noise digital signal.
Abstract:
A process for removing additive noise due to the influence of ambient circumstances in a real-time manner in order to improve the precision of speech recognition which is performed in a real-time manner includes a converting process for converting a selected speech model distribution into a representative distribution, combining a noise model with the converted speech model to generate a noise superimposed speech model, performing a first likelihood calculation to recognize an input speech by using the noise superimposed speech model, converting the noise superimposed speech model to a noise adapted distribution that retains the relationship of the selected speech model, and performing a second likelihood calculation to recognize the input speech by using the noise adapted distribution.
Abstract:
Improving voice recognition when there exist interference noises in a configuration with an electrically operated appliance, a voice input unit, and a voice processing unit that derives control signals for controlling functions of the appliance from spoken input instructions includes an operating status detection unit detecting the operating status of the household appliance or other noise sources and signaling such detection results to the voice processing unit, the voice processing unit performing an interference noise correction only if a noise source is switched on.
Abstract:
A system and method for voice activity detection, in accordance with the invention, includes the steps of inputting data including frames of speech and noise, and deciding if the frames of the input data include speech or noise by employing a log-likelihood ratio test statistic and pitch. The frames of the input data are tagged based on the log-likelihood ratio test statistic and pitch characteristics of the input data as being most likely noise or most likely speech. The tags are counted in a plurality of frames to determine if the input data is speech or noise.
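The tag-and-count decision can be sketched as follows, assuming a log-likelihood ratio and a pitch flag have already been computed for each frame. How the two cues are combined (a logical OR here), the threshold values, and the function names are assumptions for illustration; the abstract does not specify them.

```python
def tag_frame(llr, has_pitch, llr_threshold=0.0):
    """Tag one frame as most likely speech or most likely noise.

    Combining the cues with OR is an assumption, not taken from the abstract.
    """
    return "speech" if (llr > llr_threshold or has_pitch) else "noise"


def decide(frames, min_speech_fraction=0.5):
    """Count the tags over a block of frames and decide for the whole block."""
    tags = [tag_frame(llr, has_pitch) for llr, has_pitch in frames]
    speech_count = tags.count("speech")
    if speech_count >= min_speech_fraction * len(tags):
        return "speech"
    return "noise"


# Hypothetical block of four frames given as (llr, has_pitch) pairs.
frames = [(1.2, True), (-0.5, False), (0.3, False), (-2.0, False)]
decision = decide(frames)
```

Counting tags over several frames smooths out isolated misclassifications of single frames.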
Abstract:
Speech level measurement is particularly significant for successful echo compensation in telecommunications systems, for noise suppression in a noisy environment, for example in military vehicles, or in speech recognition and in speech coding and decoding systems. A method is indicated which permits speech level measurement only if features of speech are recognized and interferences and speech pauses are filtered out for the measurement. To this end, speech and pause detectors and a mean value generator are utilized, the time behavior of which is largely adapted to the perception capability of the human ear. Briefly spoken vowels thus are well detected, while nasal sounds or consonants are suppressed in the case of falling levels. A speech level measuring device is indicated which provides very accurate results in a short adaptation period.
Abstract:
An audio processing device includes an analyzer and a filter. The analyzer extracts an envelope of a noise signal and derives therefrom noise envelope parameters. The filter has coefficients which vary in response to noise envelope parameters and filters a useful signal to form a filtered signal. The coefficients are varied so that the filter enhances frequency bands of the useful signal that correspond to frequency bands of the noise signal having a higher energy than a predetermined value.
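The coefficient rule can be illustrated per frequency band, assuming the noise envelope parameters are reduced to one energy value per band. The boost factor, the band representation, and the function names are hypothetical; the abstract states only that bands whose noise energy exceeds a predetermined value are enhanced in the useful signal.

```python
def band_gains(noise_band_energy, threshold, boost=2.0):
    """Return a gain per band: boosted where noise energy exceeds the threshold."""
    return [boost if energy > threshold else 1.0
            for energy in noise_band_energy]


def filter_useful_signal(useful_band_levels, gains):
    """Scale each band of the useful signal by its gain."""
    return [level * gain for level, gain in zip(useful_band_levels, gains)]


# Hypothetical three-band example: only the middle band is noisy.
gains = band_gains([0.1, 0.8, 0.3], threshold=0.5)
filtered = filter_useful_signal([1.0, 1.0, 4.0], gains)
```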