Abstract:
Voice processing methods and systems are provided. An utterance is received and compared with teaching materials according to at least one matching algorithm to obtain a plurality of matching values corresponding to a plurality of voice units of the utterance. Each voice unit is scored in at least one first scoring item according to the matching values and a personified voice scoring algorithm. The personified voice scoring algorithm is generated from training utterances, produced by a plurality of learners and at least one real teacher, that correspond to at least one training sentence in a phonetically balanced sentence set, and from the scores that the real teacher provides in the first scoring item for the respective voice units of the learners' training utterances.
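The scoring step described above can be illustrated with a minimal sketch. The abstract does not say what form the personified voice scoring algorithm takes, so this example assumes a one-variable least-squares fit from matching value to teacher score, with scores clamped to a 0 to 100 range. The function names `fit_scorer` and `score_units`, the score range, and all data values are hypothetical.

```python
from statistics import mean


def fit_scorer(matching_values, teacher_scores):
    """Fit score = a * matching_value + b by ordinary least squares.

    Assumption: a linear model stands in for the personified voice
    scoring algorithm; the abstract does not specify the model.
    """
    mx, my = mean(matching_values), mean(teacher_scores)
    sxx = sum((x - mx) ** 2 for x in matching_values)
    if sxx == 0:
        # All matching values identical: fall back to the mean teacher score.
        return 0.0, my
    sxy = sum((x - mx) * (y - my)
              for x, y in zip(matching_values, teacher_scores))
    a = sxy / sxx
    return a, my - a * mx


def score_units(matching_values, model, lo=0.0, hi=100.0):
    """Score each voice unit from its matching value, clamped to [lo, hi]."""
    a, b = model
    return [min(hi, max(lo, a * x + b)) for x in matching_values]


# Hypothetical training data: one matching value per voice unit of the
# learners' training utterances, and the teacher's score for that unit.
train_matching = [0.2, 0.4, 0.6, 0.8]
train_scores = [20.0, 40.0, 60.0, 80.0]

model = fit_scorer(train_matching, train_scores)

# Matching values for the voice units of a newly received utterance.
unit_scores = score_units([0.5, 0.9, 1.5], model)
```

A real system would likely use several matching values per voice unit and a richer model. The point of the sketch is only the data flow: teacher-provided scores train the scorer, and the scorer then maps matching values to per-unit scores.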
Abstract:
The disclosure provides a method for displaying words. In the method, a speech signal is received. A pitch contour and an energy contour of the speech signal are extracted. Speech recognition is performed on the speech signal to recognize a plurality of words corresponding to the speech signal and to determine time alignment information of each of the plurality of words. At least one display parameter of each of the plurality of words is determined according to the pitch contour, the energy contour and the time alignment information of each of the plurality of words. The plurality of words is then integrated into a sentence according to the at least one display parameter of each of the plurality of words, and the sentence is displayed on at least one display device.
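The mapping from contours to display parameters can be sketched as follows. The abstract does not name the display parameters or the mapping, so this example assumes two: a font size driven by mean energy and a vertical offset driven by mean pitch, both normalized across the words of one utterance. It also assumes the time alignment is given as frame indices. `WordDisplay`, `display_params`, and all values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WordDisplay:
    text: str
    font_size: float  # assumed display parameter derived from energy
    y_offset: float   # assumed display parameter derived from pitch


def _avg(frames):
    """Mean of a list of frame values, or 0.0 for an empty span."""
    return sum(frames) / len(frames) if frames else 0.0


def _norm(values):
    """Min-max normalize to [0, 1]; a flat input maps to 0.5."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def display_params(alignment, pitch, energy,
                   base_size=12.0, size_range=12.0, max_shift=10.0):
    """Compute per-word display parameters.

    alignment: list of (word, start_frame, end_frame), end exclusive.
    pitch, energy: per-frame contours of the same length.
    """
    if not alignment:
        return []
    mean_pitch = [_avg(pitch[s:e]) for _, s, e in alignment]
    mean_energy = [_avg(energy[s:e]) for _, s, e in alignment]
    return [
        WordDisplay(text=word,
                    font_size=base_size + size_range * en,
                    y_offset=max_shift * (2.0 * pn - 1.0))
        for (word, _, _), pn, en in zip(alignment,
                                        _norm(mean_pitch),
                                        _norm(mean_energy))
    ]


# Hypothetical recognizer output and contours (two frames per word).
alignment = [("hello", 0, 2), ("big", 2, 4), ("world", 4, 6)]
pitch = [100.0, 100.0, 200.0, 200.0, 150.0, 150.0]
energy = [0.2, 0.2, 1.0, 1.0, 0.6, 0.6]

words = display_params(alignment, pitch, energy)
# Integrate the words into a sentence in time order.
sentence = " ".join(w.text for w in words)
```

Here the loudest word gets the largest font and the highest-pitched word is shifted furthest, which is one plausible reading of display parameters determined by pitch, energy and time alignment. A renderer would then draw each word of `sentence` using its `WordDisplay` entry.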