Abstract:
A system and method are presented for selectively biased linear discriminant analysis in automatic speech recognition systems. Linear Discriminant Analysis (LDA) may be used to improve the discrimination between hidden Markov model (HMM) tied-states in the acoustic feature space. The between-class and within-class covariance matrices may be biased based on the observed recognition errors of the tied-states, such as the shared HMM states of a context-dependent tri-phone acoustic model. The recognition errors may be obtained from a trained maximum-likelihood acoustic model that uses the tied-states, and the tied-states may then be used as classes in the analysis.
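The abstract does not state how the covariance matrices are biased, so the following is only a minimal sketch under an assumed scheme: each tied-state's contribution to the within-class and between-class scatter is scaled by `1 + error_rate`, so that frequently misrecognized states influence the projection more. The function name `biased_lda`, the weighting formula, and the small ridge term are illustrative assumptions, not the patented method.

```python
import numpy as np

def biased_lda(features, labels, error_rates, n_components, ridge=1e-6):
    """Return a (D, n_components) LDA projection with error-biased scatter.

    features:    (N, D) acoustic feature vectors
    labels:      (N,) integer tied-state ids in 0..K-1
    error_rates: (K,) observed recognition error rate per tied-state
    """
    dim = features.shape[1]
    global_mean = features.mean(axis=0)
    within = np.zeros((dim, dim))
    between = np.zeros((dim, dim))

    for state in np.unique(labels):
        frames = features[labels == state]
        # Assumed bias: states with more recognition errors get more weight.
        weight = 1.0 + error_rates[state]
        state_mean = frames.mean(axis=0)
        centered = frames - state_mean
        within += weight * (centered.T @ centered)
        diff = (state_mean - global_mean)[:, None]
        between += weight * len(frames) * (diff @ diff.T)

    # Ridge keeps the within-class matrix invertible on small data sets.
    within += ridge * np.eye(dim)

    # Directions maximizing between-class over within-class scatter are the
    # leading eigenvectors of within^{-1} @ between.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(within, between))
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvecs[:, order].real
```

Projected features would then be obtained as `features @ projection`. With all error rates set to zero the weights reduce to 1 and the sketch falls back to ordinary LDA.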
Abstract:
A system and method are presented for selecting acoustic data of a particular quality for training the parameters of an acoustic model, such as a Hidden Markov Model and Gaussian Mixture Model, in automatic speech recognition systems in the speech analytics field. A raw acoustic model may be trained using a given speech corpus and maximum likelihood criteria. A series of operations, such as forced Viterbi alignment, calculation of likelihood scores, and phoneme recognition, is performed to form a subset corpus of training data. During the process, audio files whose quality does not meet a criterion, such as poor-quality audio files, may be automatically rejected from the corpus. The subset may then be used to train a new acoustic model.
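The rejection step can be sketched as a simple filter over per-file scores. The abstract does not name the criterion, so this example assumes two hypothetical measures produced by the raw model: an average per-frame log-likelihood from forced alignment and a phoneme recognition accuracy. The names `FileScore` and `select_subset` and both thresholds are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class FileScore:
    path: str
    avg_log_likelihood: float  # assumed: per-frame score from forced alignment
    phoneme_accuracy: float    # assumed: agreement between recognition and alignment

def select_subset(scores, min_log_likelihood, min_phoneme_accuracy):
    """Split files into those kept for retraining and those rejected."""
    kept, rejected = [], []
    for score in scores:
        meets_criterion = (
            score.avg_log_likelihood >= min_log_likelihood
            and score.phoneme_accuracy >= min_phoneme_accuracy
        )
        (kept if meets_criterion else rejected).append(score.path)
    return kept, rejected
```

The `kept` list would form the subset corpus used to train the new acoustic model; in practice the thresholds would need to be tuned on the corpus at hand.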