摘要:
In a speech recognition system, the combination of a log-linear model with a multitude of speech features is provided to recognize unknown speech utterances. The speech recognition system models the posterior probability of linguistic units relevant to speech recognition using a log-linear model. The posterior model captures the probability of the linguistic unit given the observed speech features and the parameters of the posterior model. The posterior model may be determined using the probability of the word sequence hypotheses given a multitude of speech features. Log-linear models are used with features derived from sparse or incomplete data. The speech features that are utilized may include asynchronous, overlapping, and statistically non-independent speech features. Not all features used in training need to appear in testing/recognition.
摘要:
In a speech recognition system, the combination of a log-linear model with a multitude of speech features is provided to recognize unknown speech utterances. The speech recognition system models the posterior probability of linguistic units relevant to speech recognition using a log-linear model. The posterior model captures the probability of the linguistic unit given the observed speech features and the parameters of the posterior model. The posterior model may be determined using the probability of the word sequence hypotheses given a multitude of speech features. Log-linear models are used with features derived from sparse or incomplete data. The speech features that are utilized may include asynchronous, overlapping, and statistically non-independent speech features. Not all features used in training need to appear in testing/recognition.
摘要:
In a speech recognition system, the combination of a log-linear model with a multitude of speech features is provided to recognize unknown speech utterances. The speech recognition system models the posterior probability of linguistic units relevant to speech recognition using a log-linear model. The posterior model captures the probability of the linguistic unit given the observed speech features and the parameters of the posterior model. The posterior model may be determined using the probability of the word sequence hypotheses given a multitude of speech features. Log-linear models are used with features derived from sparse or incomplete data. The speech features that are utilized may include asynchronous, overlapping, and statistically non-independent speech features. Not all features used in training need to appear in testing/recognition.
摘要:
A phrase-based translation system and method includes a statistically integrated phrase lattice (SIPL) (H) which represents an entire translational model. An input (I) is translated by determining a best path through an entire lattice (S) by performing an efficient composition operation between the input and the SIPL. The efficient composition operation is performed by a multiple level search where each operand in the efficient composition operation represents a different search level.
摘要:
A system and method for speech translation includes a bridge module connected between a first component and a second component. The bridge module includes a transformation model configured to receive an original hypothesis output from a first component. The transformation model has one or more transformation features configured to transform the original hypothesis into a new hypothesis that is more easily translated by the second component.
摘要:
N sets of feature vectors are generated from a set of observation vectors which are indicative of a pattern which it is desired to recognize. At least one of the sets of feature vectors is different than at least one other of the sets of feature vectors, and is preselected for purposes of containing at least some complimentary information with regard to the at least one other set of feature vectors. The N sets of feature vectors are combined in a manner to obtain an optimized set of feature vectors which best represents the pattern. The combination is performed via one of a weighted likelihood combination scheme and a rank-based state-selection scheme; preferably, it is done in accordance with an equation set forth herein. In one aspect, a weighted likelihood combination can be employed, while in another aspect, rank-based state selection can be employed. An apparatus suitable for performing the method is described, and implementation in a computer program product is also contemplated. The invention is applicable to any type of pattern recognition problem where robustness is important, such as, for example, recognition of speech, handwriting or optical characters under challenging conditions.