Abstract:
A reliable transcription error-checking algorithm uses a word confidence score and a word duration probability to automatically detect transcription errors in a corpus. The algorithm is combined with model training: the current model is used to detect transcription errors, utterances that contain incorrect transcriptions are removed (or the detected errors are fixed manually), and the model is retrained. This process can be repeated for several iterations to obtain an improved speech recognition model. The speech model is used to align the speech with its transcription and obtain word boundaries. A speech recognizer is then used to generate a word lattice. From the word boundaries and the word lattice, a word confidence score and a word duration probability are computed and used to detect errors.
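The detection step can be sketched as follows. This is a minimal illustration, not the method of the abstract: the `AlignedWord` type, the Gaussian form of the duration model, the independent thresholds `min_conf` and `min_dur_logp`, and the rule of flagging a word when either test fails are all assumptions, since the abstract does not say how the two scores are modeled or combined.

```python
import math
from dataclasses import dataclass


@dataclass
class AlignedWord:
    """Hypothetical record for one transcribed word."""
    word: str
    start: float       # seconds, from speech-transcription alignment
    end: float         # seconds, from speech-transcription alignment
    confidence: float  # word confidence derived from the lattice, in [0, 1]


def duration_log_prob(duration, mean, std):
    """Log density of a Gaussian duration model (an assumed model form)."""
    z = (duration - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def flag_errors(words, duration_stats, min_conf=0.5, min_dur_logp=-5.0):
    """Return the words whose confidence or duration looks implausible."""
    flagged = []
    for w in words:
        mean, std = duration_stats[w.word]
        logp = duration_log_prob(w.end - w.start, mean, std)
        if w.confidence < min_conf or logp < min_dur_logp:
            flagged.append(w.word)
    return flagged


def filter_corpus(utterances, duration_stats):
    """One cleaning pass: keep only utterances with no flagged word."""
    return [u for u in utterances if not flag_errors(u, duration_stats)]
```

In an iterative setup, `filter_corpus` would run once per iteration, with confidences and boundaries recomputed from the retrained model before each pass.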
Abstract:
In a speech recognition system, tied-mixture hidden Markov models (HMMs) are used to match, in the maximum-likelihood sense, the phonemes of spoken words to their acoustic input. As is well known, such speech recognition requires computation of state observation likelihoods (SOLs). Because tied-mixture HMMs are used, each SOL computation involves a substantial number of Gaussian kernels and mixture component weights. In accordance with the invention, the number of Gaussian kernels is cut down to reduce the computational complexity and increase the efficiency of memory access to the kernels. For example, only the non-zero mixture component weights and the Gaussian kernels associated with them are considered in the SOL computation. In accordance with an aspect of the invention, only the subset of Gaussian kernels with significant values, regardless of the values of the associated mixture component weights, is considered in the SOL computation. In accordance with another aspect of the invention, at least some of the mixture component weights are quantized to reduce the memory needed to store them. Computational complexity and memory access efficiency are thereby further improved.
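The pruned SOL computation can be illustrated with a small sketch. It assumes diagonal-covariance kernels and treats "significant values" as the `top_k` largest kernel values for the current frame; the function names, the `(mean, var)` kernel layout, and the `top_k` parameter are hypothetical and not taken from the abstract. Weight quantization is not shown.

```python
import math


def gaussian_kernel(obs, mean, var):
    """Diagonal-covariance Gaussian density evaluated at obs."""
    log_p = 0.0
    for x, m, v in zip(obs, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return math.exp(log_p)


def sparse_weights(weights):
    """Keep only non-zero mixture weights as (kernel_index, weight) pairs."""
    return [(k, c) for k, c in enumerate(weights) if c != 0.0]


def sol(obs, state_weights, kernels, top_k=None):
    """State observation likelihood: sum over k of c_k * N(obs; mean_k, var_k).

    state_weights is the sparse list produced by sparse_weights, so kernels
    with a zero weight are never evaluated.
    """
    needed = {k for k, _ in state_weights}
    values = {k: gaussian_kernel(obs, *kernels[k]) for k in needed}
    if top_k is not None:
        # Keep the kernels with the largest values, whatever their weights.
        keep = set(sorted(values, key=values.get, reverse=True)[:top_k])
        return sum(c * values[k] for k, c in state_weights if k in keep)
    return sum(c * values[k] for k, c in state_weights)
```

In a full tied-mixture system the kernel values would be computed once per frame and shared by every state; this sketch recomputes them per call to stay short.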
Abstract:
In a speech recognition system for performing voice dialing, an inventive connected digit recognizer is employed to recognize a sequence of spoken digits. The inventive recognizer generates the maximum-likelihood digit sequence corresponding to the spoken sequence in accordance with the Viterbi algorithm. However, unlike a prior art connected digit recognizer, the inventive recognizer does not assume that a digit model in a sequence can be followed by any digit model with equal probability. Rather, for each digit model being decided on, the inventive recognizer takes into account the conditional probability that the digit model follows the digit model preceding it.
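The effect of the conditional digit probabilities can be shown with a simplified Viterbi search. This sketch assumes the utterance has already been split into one segment per digit and that per-segment acoustic log-likelihoods are available; a real connected digit recognizer also searches over segment boundaries. The names `segment_loglik`, `log_trans`, and `log_init` are hypothetical.

```python
import math


def viterbi_digits(segment_loglik, log_trans, log_init):
    """Most likely digit sequence under a digit-bigram model.

    segment_loglik[t][d]: acoustic log-likelihood of digit d for segment t
    log_trans[p][d]:      log P(d follows p), the conditional digit probability
    log_init[d]:          log P(d starts the sequence)
    """
    n = len(log_init)
    score = [log_init[d] + segment_loglik[0][d] for d in range(n)]
    back = []
    for t in range(1, len(segment_loglik)):
        new_score, pointers = [], []
        for d in range(n):
            # Best predecessor for digit d, including the transition term.
            best = max(range(n), key=lambda p: score[p] + log_trans[p][d])
            new_score.append(score[best] + log_trans[best][d] + segment_loglik[t][d])
            pointers.append(best)
        score = new_score
        back.append(pointers)
    # Trace back from the best final digit.
    d = max(range(n), key=lambda i: score[i])
    path = [d]
    for pointers in reversed(back):
        d = pointers[d]
        path.append(d)
    return path[::-1]
```

Setting every entry of `log_trans` to the same value reproduces the equal-probability assumption attributed to the prior art, which makes the two behaviours easy to compare on the same input.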