Abstract:
A system comprising a pick-up device (5) connected to an acoustic/phonetic decoding device (11) which is in turn connected to a recognition supervisor device (6) and a voice recognition device (3) itself linked to a dictionary (12), to the recognition supervisor (6) and to a syntax describing device (7) which is linked to a dialogue storing device (8) connected to said supervisor (6).
Abstract:
A speech analysis system (10) incorporates a filterbank analyser (18) producing successive frequency data vectors for a speech signal from two speakers. From each data vector, units (22A and 22B) produce a set of modified data vectors compensated for differing forms of distortion associated with respective speakers. A computer (24) matches modified data vectors to hidden Markov model states. It identifies the modified data vector in each set exhibiting the greatest matching probability, the model state matched with it, the form of distortion with which it is associated, and the model class, i.e. speech or noise. The matched model state has a mean value providing an estimate of its associated data vector. The estimate is compared with its associated data vector, and their difference is averaged with others associated with a like form of distortion in an infinite impulse response filter bank (48A or 48B) to provide compensation for that form of distortion. Averaged difference vectors provide compensation for multiple forms of distortion associated with respective speakers.
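The averaging step can be illustrated with a small sketch. It assumes a first-order recursive (exponential) average per frequency channel, takes the difference as data vector minus model mean, and applies compensation by subtraction. The abstract fixes none of these details, and the names `DistortionTracker`, `alpha`, `update`, and `compensate` are hypothetical. One tracker would be kept per form of distortion (for example, one per speaker).

```python
class DistortionTracker:
    """Running estimate of one form of distortion (hypothetical helper)."""

    def __init__(self, size, alpha=0.9):
        # alpha is an assumed smoothing factor; larger values average over more frames.
        self.alpha = alpha
        self.offset = [0.0] * size

    def update(self, data_vector, model_mean):
        # Blend the new difference (data minus matched state mean) into the average.
        self.offset = [
            self.alpha * o + (1 - self.alpha) * (d - m)
            for o, d, m in zip(self.offset, data_vector, model_mean)
        ]

    def compensate(self, data_vector):
        # Assumed compensation: subtract the averaged difference from the data vector.
        return [d - o for d, o in zip(data_vector, self.offset)]


tracker = DistortionTracker(size=2, alpha=0.5)
tracker.update([3.0, 1.0], [1.0, 1.0])
tracker.update([3.0, 1.0], [1.0, 1.0])
print(tracker.offset)
print(tracker.compensate([3.0, 1.0]))
```

With a constant difference, the averaged offset moves toward that difference frame by frame, so the compensated vector moves toward the model mean.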
Abstract:
A sound recognizer uses a feature value normalization process to substantially increase the accuracy of recognizing acoustic signals in noise. The sound recognizer includes a feature vector device (110) which determines a number of feature values for a number of analysis frames, a min/max device (120) which determines a minimum and maximum feature value for each of a number of frequency bands, a normalizer (130) which normalizes each of the feature values using the minimum and maximum feature values, yielding normalized feature vectors, and a comparator (140) which compares the normalized feature vectors with template feature vectors to identify the template feature vector that most resembles the normalized feature vectors.
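A minimal sketch of the normalize-then-compare pipeline follows. It assumes each band is scaled to the range 0 to 1 across all analysis frames, and that resemblance is measured by summed Euclidean distance between corresponding frames. The abstract specifies neither choice, and the function names are hypothetical.

```python
from math import dist


def normalize_features(frames):
    """Scale each frequency band (column) to [0, 1] across all analysis frames."""
    bands = list(zip(*frames))
    lows = [min(band) for band in bands]
    highs = [max(band) for band in bands]
    normalized = []
    for frame in frames:
        row = []
        for value, lo, hi in zip(frame, lows, highs):
            span = hi - lo
            # A band with no variation carries no information; map it to 0.
            row.append((value - lo) / span if span else 0.0)
        normalized.append(row)
    return normalized


def closest_template(normalized, templates):
    """Return the name of the template with the smallest summed frame distance."""
    def total_distance(name):
        return sum(dist(a, b) for a, b in zip(normalized, templates[name]))
    return min(templates, key=total_distance)


templates = {
    "rise": [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]],
    "flat": [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],
}
quiet = normalize_features([[1.0, 10.0], [3.0, 30.0], [2.0, 20.0]])
loud = normalize_features([[10.0, 100.0], [30.0, 300.0], [20.0, 200.0]])
print(closest_template(quiet, templates), closest_template(loud, templates))
```

Because the minimum and maximum are taken per band, a uniformly louder copy of the same signal normalizes to the same vectors, which is what makes the comparison less sensitive to level changes.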
Abstract:
In a speech processor such as a speech recogniser, the problem of distortion of extracted features caused by adaptation of the input automatic gain control (AGC) during feature extraction is solved by storing the AGC's gain coefficient along with the energy level of each extracted feature. At the end of the sampling period, the stored gain coefficients are set equal to the minimum stored coefficient and the associated energy levels are adjusted accordingly. The AGC circuit may comprise a digitally switched attenuator under the control of a microprocessor performing the speech recognition.
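The end-of-period adjustment might look like the sketch below. It assumes both the energy levels and the gain coefficients are stored in decibels, so that lowering a frame's gain to the minimum lowers its energy by the same number of decibels. The abstract does not state the units, and `equalize_gain` is a hypothetical name.

```python
def equalize_gain(frames):
    """Re-express every stored frame as if it had been captured at the minimum gain.

    frames: list of (energy_db, gain_db) pairs stored during feature extraction.
    Returns a new list of (adjusted_energy_db, min_gain_db) pairs.
    """
    min_gain = min(gain for _, gain in frames)
    # Assumption: energies and gains are in dB, so the correction is a subtraction.
    return [(energy - (gain - min_gain), min_gain) for energy, gain in frames]


stored = [(60.0, 12.0), (58.0, 6.0), (55.0, 0.0)]
print(equalize_gain(stored))
```

After the adjustment all frames share one gain value, so differences in their energy levels reflect the input signal rather than changes in the AGC setting.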
Abstract:
A speech recognition method for recognizing input speech in a noisy environment by using a plurality of clean speech models is provided. Each of the clean speech models has a clean speech feature parameter S representing a cepstrum parameter of its clean speech. The speech recognition method comprises the processes of: detecting a noise feature parameter N representing a cepstrum parameter of the noise in the noisy environment, immediately before the input speech is input; detecting an input speech feature parameter X representing a cepstrum parameter of the input speech in the noisy environment; calculating a modified clean speech feature parameter Y according to the following equation: Y = k · S + (1 - k) · N (0 …), where k is a predetermined value corresponding to the signal-to-noise ratio in the noisy environment; comparing the input speech feature parameter X with the modified clean speech feature parameter Y; and recognizing the input speech by repeatedly carrying out the calculating process and the comparing process with respect to the plurality of clean speech models.
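The calculate-and-compare loop can be sketched as follows. The sketch assumes cepstrum parameters are plain lists of floats and that the comparison is a Euclidean distance, with the closest model taken as the recognition result. The abstract does not name the comparison measure, and the admissible range of k is cut off in the source, so no range check is applied here. The function names are hypothetical.

```python
from math import dist


def noise_adapted(clean, noise, k):
    """Y = k * S + (1 - k) * N, applied element by element."""
    return [k * s + (1 - k) * n for s, n in zip(clean, noise)]


def recognize(x, clean_models, noise, k):
    """Return the name of the clean model whose adapted parameter Y is closest to X."""
    def distance(name):
        # Assumed comparison: Euclidean distance between X and Y.
        return dist(x, noise_adapted(clean_models[name], noise, k))
    return min(clean_models, key=distance)


clean_models = {"yes": [4.0, 0.0], "no": [0.0, 4.0]}
noise = [2.0, 2.0]  # N, measured just before the utterance
print(noise_adapted(clean_models["yes"], noise, 0.5))
print(recognize([2.9, 1.2], clean_models, noise, 0.5))
```

A smaller k pulls every adapted model toward the noise parameter N, which mirrors the idea that at a low signal-to-noise ratio the observed speech looks more like the noise.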