Abstract:
Conventional voice dialogue systems sometimes have difficulty carrying on a natural dialogue with the user. The disclosed system therefore performs speech recognition on the user's utterance, controls the dialogue with the user according to a previously given scenario, generates, as the occasion demands and based on the speech recognition result, an answering sentence corresponding to the content of the user's utterance, and performs voice synthesis processing on either a sentence in the reproduced scenario or the generated answering sentence.
Abstract:
An apparatus, method and program for performing a speech recognition process that utilizes contextual information and comprises an estimation of the intention of a user's utterance. The recognition process calculates a pre-score from observed contextual information according to intention models, which correspond to a plurality of types of intention information, and combines the pre-scoring results with acoustic and linguistic scores to improve recognition or comprehension of the intent of the user's utterance.
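The score combination described in this abstract can be illustrated with a minimal sketch. The abstract does not state how the scores are combined, so the weighted sum of log-domain scores used here is an assumption, and the names `Hypothesis`, `pre_scores`, `best_hypothesis` and the `weight` parameter are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hypothesis:
    text: str
    intention: str     # intention type this hypothesis is associated with
    acoustic: float    # log-domain acoustic score
    linguistic: float  # log-domain linguistic score


def pre_scores(context_likelihoods: Dict[str, float]) -> Dict[str, float]:
    """Turn per-intention likelihoods of the observed context into log pre-scores.

    Assumption: each intention model yields a positive likelihood for the
    observed context, and the likelihoods are normalized across intentions.
    """
    total = sum(context_likelihoods.values())
    return {name: math.log(p / total) for name, p in context_likelihoods.items()}


def best_hypothesis(hyps: List[Hypothesis], pre: Dict[str, float],
                    weight: float = 1.0) -> Hypothesis:
    """Pick the hypothesis with the highest combined score.

    Assumption: the combination is a weighted sum in the log domain.
    """
    return max(hyps, key=lambda h: h.acoustic + h.linguistic + weight * pre[h.intention])


hyps = [
    Hypothesis("play the rain", "music", acoustic=-10.0, linguistic=-5.0),
    Hypothesis("will it rain", "weather", acoustic=-10.5, linguistic=-5.0),
]
pre = pre_scores({"weather": 0.8, "music": 0.2})
```

With `weight=0.0` the pre-score is ignored and the acoustically stronger hypothesis wins; with the default weight, the context favoring the "weather" intention tips the decision toward the other hypothesis.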
Abstract:
A pronunciation information generating unit (150) generates a plurality of letters or characters inferred from the results of letter/character recognition of an image photographed by a CCD camera (20), a plurality of kana readings inferred from those letters or characters, and the pronunciations corresponding to the kana readings. The generated readings are then matched against the user's pronunciation, acquired by a microphone (23), to specify one kana reading and its pronunciation from among the generated candidates.
Abstract:
A preliminary word-selecting section selects one or more words to follow the words already obtained in a word string serving as a candidate for a result of speech recognition. A matching section calculates acoustic or linguistic scores for the selected words and, according to the scores, forms a word string serving as a candidate for a result of speech recognition. A control section generates word-connection relationships between the words in the candidate word string and sends them to a word-connection-information storage section, where they are stored. A re-evaluation section corrects the word-connection relationships stored in the word-connection-information storage section, and the control section determines a word string serving as the result of speech recognition according to the corrected word-connection relationships.
Abstract:
In order to prevent degradation of speech recognition accuracy due to unknown words, a dictionary database stores a word dictionary that contains, in addition to the words subject to speech recognition, suffixes, that is, sound elements or sound element sequences that form part of an unknown word and serve to classify the unknown word by its part of speech. Based on this word dictionary, a matching section connects acoustic models from a sound model database and calculates a score from the series of features output by a feature extraction section, using the connected acoustic models. The matching section then selects, on the basis of the score, the series of words that represents the speech recognition result.
Abstract:
An extended-word selecting section calculates a score for a phoneme string formed of one or more phonemes, corresponding to a user's speech, and searches a large-vocabulary dictionary for a word having one or more phonemes equal to or similar to those of a phoneme string whose score is equal to or higher than a predetermined value. A matching section calculates scores for the words found by the extended-word selecting section in addition to the words selected by a preliminary word-selecting section. A control section determines a word string as the result of recognition of the speech uttered by the user.
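The dictionary search performed by the extended-word selecting section can be sketched as follows. The abstract does not define "similar", so this sketch assumes similarity means a small edit distance between phoneme sequences; `edit_distance`, `extended_words`, `max_distance` and the toy dictionary are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, 1):
        cur = [i]
        for j, pb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (pa != pb)))  # substitution or match
        prev = cur
    return prev[-1]


def extended_words(phoneme_hyps: List[Tuple[List[str], float]],
                   dictionary: Dict[str, List[str]],
                   min_score: float,
                   max_distance: int = 1) -> List[str]:
    """Return dictionary words close to any sufficiently well-scored phoneme string."""
    selected: List[str] = []
    for phonemes, score in phoneme_hyps:
        if score < min_score:
            continue  # only phoneme strings at or above the threshold are used
        for word, pronunciation in dictionary.items():
            if word not in selected and edit_distance(phonemes, pronunciation) <= max_distance:
                selected.append(word)
    return selected


dictionary = {
    "cat": ["k", "ae", "t"],
    "cap": ["k", "ae", "p"],
    "dog": ["d", "ao", "g"],
}
phoneme_hyps = [(["k", "ae", "t"], 0.9), (["d", "ao", "g"], 0.2)]
words = extended_words(phoneme_hyps, dictionary, min_score=0.5)
```

The words returned here would then be passed to the matching section together with the output of the preliminary word selection.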
Abstract:
A speech recognizing device that processes speech efficiently while maintaining high speech recognition performance. A matching unit (14) computes the score of each word preliminarily selected by a word preliminary selection unit (13) and determines candidates for the speech recognition result on the basis of the scores. A control unit (11) creates word connection relations between the words of a word sequence that is a candidate for the speech recognition result and stores them in a word connection information storage unit (16). A reevaluation unit (15) corrects the word connection relations serially, and the control unit (11) determines the speech recognition result on the basis of the corrected word connection relations. A word connection relation managing unit (21) limits the times that can correspond to a word boundary expressed by the word connection relations, and a word connection relation managing unit (22) limits the starting times of the words preliminarily selected by the word preliminary selection unit (13). The speech recognizing device can be applied to an interactive system that responds to the speech recognition result.
Abstract:
A speech recognition apparatus that improves the accuracy of speech recognition while suppressing an increase in required resources. Words that are probable as the result of speech recognition are selected on the basis of an acoustic score and a linguistic score, and further words are selected on the basis of a measure other than the acoustic score, such as having a small number of phonemes, having a pre-set part of speech, being included in past speech recognition results, or having a linguistic score not less than a pre-set value. The words so selected are subjected to matching processing.
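The two-track word selection described in this abstract can be sketched as a union of a score-based shortlist and words admitted by the alternative measures. The data layout, the function name `preselect` and all thresholds below are hypothetical, and taking the top `top_n` words by summed score is an assumption about how "probable" words are chosen.

```python
from typing import Dict, List, Set


def preselect(candidates: List[Dict], top_n: int, max_phonemes: int,
              preset_pos: Set[str], history: Set[str],
              min_linguistic: float) -> Set[str]:
    """Select words for matching by score and by measures other than the acoustic score."""
    # Track 1: the most probable words by acoustic + linguistic score.
    by_score = sorted(candidates,
                      key=lambda w: w["acoustic"] + w["linguistic"],
                      reverse=True)
    selected = {w["word"] for w in by_score[:top_n]}
    # Track 2: words admitted regardless of their acoustic score.
    for w in candidates:
        if (w["n_phonemes"] <= max_phonemes
                or w["pos"] in preset_pos
                or w["word"] in history
                or w["linguistic"] >= min_linguistic):
            selected.add(w["word"])
    return selected


candidates = [
    {"word": "station",  "n_phonemes": 6, "pos": "noun",     "acoustic": -2.0, "linguistic": -1.0},
    {"word": "a",        "n_phonemes": 1, "pos": "article",  "acoustic": -9.0, "linguistic": -3.0},
    {"word": "to",       "n_phonemes": 2, "pos": "particle", "acoustic": -8.0, "linguistic": -4.0},
    {"word": "stallion", "n_phonemes": 6, "pos": "noun",     "acoustic": -5.0, "linguistic": -6.0},
    {"word": "tokyo",    "n_phonemes": 4, "pos": "noun",     "acoustic": -7.0, "linguistic": -5.0},
]
chosen = preselect(candidates, top_n=1, max_phonemes=1,
                   preset_pos={"particle"}, history={"tokyo"},
                   min_linguistic=-0.5)
```

Short function words such as "a" and "to" tend to have unstable acoustic scores, which is one plausible reason for admitting them through the second track rather than enlarging the score-based shortlist.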
Abstract:
A system and method for an automatic set-up of speech recognition engines may include a speech recognizer configured to perform speech recognition procedures to identify input speech data according to one or more operating parameters. A merit manager may be utilized to automatically calculate merit values corresponding to the foregoing recognition procedures. These merit values may incorporate recognition accuracy information, recognition speed information, and a user-specified weighting factor that shifts the relative effect of the recognition accuracy information and the recognition speed information on the merit values. The merit manager may then automatically perform a merit value optimization procedure to select operating parameters that correspond to an optimal one of the merit values.
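The merit value optimization can be illustrated with a small sketch. The abstract does not give the formula, so the linear blend of accuracy and speed below is an assumption, as are the names `merit` and `select_parameters`, the `beam` parameter, and the trial figures, which are illustrative rather than measured.

```python
from typing import Dict, List


def merit(accuracy: float, speed: float, weight: float) -> float:
    """Blend accuracy and speed; weight in [0, 1] shifts emphasis toward accuracy.

    Assumption: both inputs are normalized to [0, 1] and combined linearly.
    """
    return weight * accuracy + (1.0 - weight) * speed


def select_parameters(trials: List[Dict], weight: float) -> Dict:
    """Return the operating parameters of the trial with the highest merit value."""
    best = max(trials, key=lambda t: merit(t["accuracy"], t["speed"], weight))
    return best["params"]


# Illustrative trial results for two hypothetical parameter settings.
trials = [
    {"params": {"beam": 100}, "accuracy": 0.80, "speed": 0.90},
    {"params": {"beam": 400}, "accuracy": 0.92, "speed": 0.40},
]
accuracy_first = select_parameters(trials, weight=0.9)
speed_first = select_parameters(trials, weight=0.2)
```

A user who weights accuracy heavily gets the wider setting, while a user who favors speed gets the narrower one, which is the shift in relative effect that the weighting factor is described as providing.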
Abstract:
The present invention provides a speech recognition apparatus having high speech recognition performance and capable of performing speech recognition in a highly efficient manner. A matching unit 14 calculates the scores of words selected by a preliminary word selector 13 and determines a candidate for a speech recognition result on the basis of the calculated scores. A control unit 11 produces word connection relationships among the words included in a word series employed as a candidate for the speech recognition result and stores them in a word connection information storage unit 16. A reevaluation unit 15 corrects the word connection relationships one by one. On the basis of the corrected word connection relationships, the control unit 11 determines the speech recognition result. A word connection managing unit 21 limits the times at which a boundary between words, as represented by the word connection relationships, is allowed to be located. A word connection managing unit 22 limits the start times of words preliminarily selected by the preliminary word selector 13. The present invention can be applied to an interactive system that recognizes input speech and responds to the speech recognition result.