摘要:
The present invention relates to a speech interactive system and method. The system comprises a target information receiving module, an interactive mode setting and speech processing module, an interactive information update module, a decision module, and an output response module. It receives target information and sets corresponding target text sentence information. It also receives a user's speech signal, sets an interactive mode, decides the speech's target text sentence information, and generates an assessment for the target text sentence. Under the set interactive mode, the system updates the information in an interactive information recording table according to the assessment and a timing count. According to the interactive mode and the recorded information, an output mode for the target text sentence information is generated. According to the output mode and the recorded information, the response information is generated.
摘要:
The present invention relates to a speech interactive system and method. The system comprises a target information receiving module, an interactive mode setting and speech processing module, an interactive information update module, a decision module, and an output response module. It receives target information and sets corresponding target text sentence information. It also receives a user's speech signal, sets an interactive mode, decides the speech's target text sentence information, and generates an assessment for the target text sentence. Under the set interactive mode, the system updates the information in an interactive information recording table according to the assessment and a timing count. According to the interactive mode and the recorded information, an output mode for the target text sentence information is generated. According to the output mode and the recorded information, the response information is generated.
摘要:
A speech recognition system is used to receive a speech signal and output an output language word with respect to the speech signal. The speech recognition system has preset quantities for a first threshold, a second threshold, and a third threshold. The speech recognition system includes a first speech recognition device that is used to receive the speech signal and generate a first candidate language word and a first confidence measurement of the first candidate language word, according to the speech signal. A second speech recognition device is used to receive the speech signal and generate a second candidate language word and a second confidence measurement of the second candidate language word, according to the speech signal. A confidence measurement judging unit is used to output the language word, by comparing the first confidence measurement and the second confidence measurement to the above thresholds.
摘要:
A method and system for utterance verification is disclosed. It first extracts a sequence of feature vectors from speech signal. At least one candidate string is obtained after speech recognition. Then, speech signal is segmented into speech segments according to the verification-unit-specified structure of candidate string for making each speech segment corresponding to a verification unit. After calculating the verification feature vectors of speech segments, these verification feature vectors are sequentially used to generate verification scores of speech segments in verification process. This invention uses neural networks for calculating verification scores, where each neural network is a Multi-Layer Perceptron (MLP) developed for each verification unit. Verification score is obtained through using feed-forward process of MLP. Finally, utterance verification score is obtained by combining all verification scores of speech segments and is used to compare with a pre-defined threshold for the decision of acceptance or rejection of the candidate string.
摘要:
In speaker-independent speech recognition, between-speaker variability is one of the major resources of recognition errors. A speaker cluster model is used to manage recognition problems caused by between-speaker variability. In the training phase, the score function is used as a discriminative function. The parameters of at least two cluster-dependent models are adjusted through a discriminative training method to improve performance of the speech recognition.
摘要:
A method for generating candidate word strings in speech recognition is provided, which is based on the nodes in the word lattice to search candidate word strings. The associated maximum string score for each node is first determined. Next, all nodes are sorted based on the associated maximum string score to group the nodes with the same string score into the same node set. Then, the node sets with relative high string scores are selected to connect the nodes by their starting time frame and ending time frame, thereby generating the candidate word strings.
摘要:
Device and method of channel effect compensation for a telephone speech recognition system is disclosed. The telephone speech recognition system comprises a compensatory neutral network and a recognize. The compensatory neural network receives an input signal and compensates the input signal with a bias to generate an output signal. The compensatory neural network provides a plurality of first parameters to determine the bias. The recognizer is coupled to the compensatory neural network for classifying the output signal according to a plurality of second parameters in acoustic models to generate a recognition result and determine a recognition loss. The first parameters and second parameters are adjusted according to the recognition loss and an adjustment means during a training process.
摘要:
A language learning system including a storage module, a feature extraction module, and an assessment and diagnosis module is provided. The storage module stores training data and an assessment decision tree generated according to the training data. The feature extraction module extracts pronunciation features of a pronunciation given by a language learner. The assessment and diagnosis module identifies a diagnosis path corresponding to the pronunciation of the language learner in the assessment decision tree and outputs feedback information corresponding to the diagnosis path. Thereby, the language learning system can assess and provide feedback information regarding words, phrases or sentences pronounced by the language learner.
摘要:
Apparatus, method and system for generating a threshold for utterance verification are introduced herein. When a processing object is determined, a recommendation threshold is generated according to an expected utterance verification result. In addition, extra collection of corpuses or training models is not necessary for the utterance verification introduced here. The processing unit can be a recognition object or an utterance verification object. In the apparatus, method and system for generating a threshold for utterance verification, at least one of the processing objects is received and then a speech unit sequence is generated therefrom. One or more values corresponding to each of the speech unit of the speech unit sequence are obtained accordingly, and then a recommendation threshold is generated based on an expected utterance verification result.
摘要:
A system and method for detecting the recognizability of input speech signal is provided. It is designed in the pre-stage of speech recognition or a dialog system. The invention detects the user's environmental condition and verifies if the input speech signal can be recognized. It mainly comprises an environment parameter generator, a signal recognition verifier, and a strategy response processor. Through the use of the invention in the pre-stage of speech recognition or a dialog system, it can precisely verify the recognizability of the input speech signal and receives the input speech signals of high recognition probability in a noisy environment. This reduces the impact caused by the receiving the input speech signals of low recognition probability. This invention thus increases the recognition probability for a recognizer.