Abstract:
A speech understanding apparatus includes: a speech recognition unit for recognizing input speech to produce a speech recognition result; a sentence analysis unit for performing morpheme analysis on a sentence corresponding to the speech recognition result, extracting additional information, and performing syntax analysis; a hierarchy describing unit for describing the hierarchy of the sentence; a class transformation unit for performing class transformation on the sentence; a semantic representation determination unit for marking optional expressions in the sentence, deleting meaningless expressions and the additional information, converting the sentence into its base form, and deleting morphemic tags or symbols to determine a semantic representation; a semantic representation retrieval unit for retrieving the determined semantic representation from an example-based semantic representation pattern database; and a retrieval result processing unit for selectively producing a retrieved semantic representation.
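The later stages of this pipeline (deleting meaningless expressions, class transformation, and pattern retrieval) can be illustrated with a minimal sketch. Everything named here is hypothetical: the filler list, the class table, the pattern database, and the output notation are toy assumptions, and morpheme analysis, syntax analysis, and base-form conversion are omitted entirely.

```python
# Toy resources; all entries are illustrative assumptions, not taken from the abstract.
FILLERS = {"um", "uh", "please"}                      # "meaningless expressions"
CLASSES = {"seoul": "<CITY>", "busan": "<CITY>", "tomorrow": "<DATE>"}
PATTERNS = {                                          # example-based pattern database
    "book ticket to <CITY> <DATE>": "BOOK_TICKET(dest=<CITY>, date=<DATE>)",
}


def to_semantic_key(sentence):
    """Reduce a recognized sentence to a normalized lookup key."""
    tokens = sentence.lower().split()
    tokens = [t for t in tokens if t not in FILLERS]   # delete meaningless expressions
    tokens = [CLASSES.get(t, t) for t in tokens]       # class transformation
    return " ".join(tokens)


def retrieve(sentence):
    """Look up the normalized key in the pattern database (None if absent)."""
    return PATTERNS.get(to_semantic_key(sentence))
```

An exact dictionary lookup stands in for retrieval here; a real example-based database would presumably need some form of approximate matching.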
Abstract:
Disclosed is a method of generating a search network for voice recognition, the method including: generating a pronunciation transduction weighted finite state transducer by implementing a pronunciation transduction rule representing a phenomenon of pronunciation transduction between recognition units as a weighted finite state transducer; and composing the pronunciation transduction weighted finite state transducer and one or more weighted finite state transducers.
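The composition step can be sketched for the simplest case: epsilon-free transducers over the tropical semiring, where weights along a path add. The dictionary layout and the toy pronunciation rule below are assumptions made for illustration; the abstract does not specify a representation, and production systems typically rely on a dedicated WFST library.

```python
from collections import defaultdict


def compose(a, b):
    """Compose two epsilon-free WFSTs; weights add (tropical semiring).

    Each transducer is a dict with 'start', 'finals' (state -> weight) and
    'arcs' (state -> list of (in_label, out_label, weight, next_state)).
    """
    start = (a["start"], b["start"])
    arcs, finals = defaultdict(list), {}
    stack, seen = [start], {start}
    while stack:
        state = stack.pop()
        qa, qb = state
        if qa in a["finals"] and qb in b["finals"]:
            finals[state] = a["finals"][qa] + b["finals"][qb]
        for in_label, mid, w1, next_a in a["arcs"].get(qa, []):
            for mid2, out_label, w2, next_b in b["arcs"].get(qb, []):
                if mid == mid2:  # output of a must match input of b
                    nxt = (next_a, next_b)
                    arcs[state].append((in_label, out_label, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return {"start": start, "finals": finals, "arcs": dict(arcs)}


# Hypothetical pronunciation rule: surface "ng" may stand for canonical "k" at cost 0.5.
pron_rule = {"start": 0, "finals": {1: 0.0},
             "arcs": {0: [("ng", "k", 0.5, 1), ("k", "k", 0.0, 1)]}}
# Hypothetical one-phoneme lexicon entry mapping canonical "k" to a word label.
lexicon = {"start": 0, "finals": {1: 0.0},
           "arcs": {0: [("k", "WORD", 1.0, 1)]}}
network = compose(pron_rule, lexicon)
```

In the composed network both surface forms lead to the same word, with the rule's cost added to the variant pronunciation.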
Abstract:
Disclosed herein are an apparatus and method for creating an acoustic model. The apparatus includes a binary tree creation unit, an information creation unit, and a binary tree reduction unit. The binary tree creation unit creates a binary tree by repeatedly merging a plurality of Gaussian components for each Hidden Markov Model (HMM) state of an acoustic model based on a distance measure reflecting a variation in likelihood score. The information creation unit creates information about the largest size of the acoustic model in accordance with a platform including a speech recognizer. The binary tree reduction unit reduces the binary tree in accordance with the information about the largest size of the acoustic model.
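A minimal sketch of the tree creation and reduction for one HMM state follows, using one-dimensional Gaussians. The distance used here, the drop in expected log-likelihood after a moment-matched merge, is an assumed stand-in for the abstract's "distance measure reflecting a variation in likelihood score"; the `Node` type and function names are likewise hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Node:
    weight: float
    mean: float
    var: float
    order: int = -1                 # merge step that created the node (-1 for leaves)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def merge(a, b, order):
    """Moment-matched merge of two weighted 1-D Gaussians."""
    w = a.weight + b.weight
    mean = (a.weight * a.mean + b.weight * b.mean) / w
    second = (a.weight * (a.var + a.mean ** 2) + b.weight * (b.var + b.mean ** 2)) / w
    return Node(w, mean, second - mean ** 2, order, a, b)


def likelihood_loss(a, b):
    """Assumed distance: expected log-likelihood lost by merging a and b."""
    m = merge(a, b, 0)
    return 0.5 * (m.weight * math.log(m.var)
                  - a.weight * math.log(a.var) - b.weight * math.log(b.var))


def build_tree(components):
    """Greedily merge the closest pair until a single root remains."""
    active = [Node(w, m, v) for w, m, v in components]
    step = 0
    while len(active) > 1:
        pairs = [(i, j) for i in range(len(active)) for j in range(i + 1, len(active))]
        i, j = min(pairs, key=lambda p: likelihood_loss(active[p[0]], active[p[1]]))
        merged = merge(active[i], active[j], step)
        active = [n for k, n in enumerate(active) if k not in (i, j)] + [merged]
        step += 1
    return active[0]


def reduce_tree(root, max_components):
    """Cut the tree so that at most max_components Gaussians remain."""
    frontier = [root]
    while len(frontier) < max_components:
        splittable = [n for n in frontier if n.left is not None]
        if not splittable:
            break
        node = max(splittable, key=lambda n: n.order)  # undo the latest merge first
        frontier.remove(node)
        frontier += [node.left, node.right]
    return frontier
```

Because the tree is built once, a different platform limit only requires calling `reduce_tree` again with another `max_components`, which is the practical appeal of separating creation from reduction.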
Abstract:
An apparatus for speech recognition based on source separation and identification includes: a sound source separator for separating mixed signals, which are input to two or more microphones, into sound source signals by using independent component analysis (ICA), and estimating direction information of the separated sound source signals; and a speech recognizer for calculating normalized log likelihood probabilities of the separated sound source signals. The apparatus further includes a speech signal identifier for identifying a sound source corresponding to a user's speech signal by using both the estimated direction information and the reliability information based on the normalized log likelihood probabilities.
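One way the identification step could combine the two cues is sketched below. The decision rule (filter by direction, then pick the highest normalized log-likelihood), the per-frame normalization, the tolerance value, and all names are assumptions for illustration; the abstract does not state how the cues are combined. The ICA separation itself is not shown.

```python
from dataclasses import dataclass


@dataclass
class SeparatedSource:
    name: str
    direction_deg: float     # estimated direction of arrival
    log_likelihood: float    # total recognizer log-likelihood
    num_frames: int


def normalized_log_likelihood(src):
    """Assumed normalization: log-likelihood per frame."""
    return src.log_likelihood / src.num_frames


def angular_distance(a, b):
    """Smallest angle between two directions, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def identify_user_source(sources, user_direction_deg, tolerance_deg=30.0):
    """Pick the most reliable source among those near the expected user direction."""
    candidates = [s for s in sources
                  if angular_distance(s.direction_deg, user_direction_deg) <= tolerance_deg]
    if not candidates:
        return None
    return max(candidates, key=normalized_log_likelihood)
```

Using direction as a hard filter keeps a loud, well-recognized interferer (a television, say) from winning on likelihood alone.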
Abstract:
Provided are an automatic speech translation system and a method for obtaining accurate translation performance with a simple structure. Because input and output sentences are written in different languages, automatic speech translation requires techniques for processing different languages. Repetition of text processing like morpheme analysis or sentence parsing in conventional automatic speech translation can complicate the overall translation process. Meanwhile, although input and output sentences are written in different languages, they have to have the same meaning and a corresponding sentence form and words. Accordingly, the corresponding words and sentence forms of the two languages can be expressed with a simple structure and utilized in the automatic speech translation process, thereby maintaining consistency during the process and avoiding unnecessary process repetition, which reduces errors and improves performance.
Abstract:
Provided are an apparatus and method for recognizing speech, in which the reliability of recognized phoneme sequences is calculated and the performance of speech recognition is enhanced using the calculated results. The method of recognizing speech includes the steps of: determining a boundary between phonemes included in character sequences that are phonetically input to detect each phoneme interval; calculating reliability according to a probability that a phoneme indicated by the detected phoneme interval corresponds to a phoneme included in a predefined phoneme model; calculating a phoneme alignment cost with respect to the character sequences based on the calculated reliability and a pre-trained and stored phoneme recognition probability distribution; and performing phoneme alignment based on the calculated phoneme alignment cost to perform speech recognition on the input character sequences. As a result, the reliability of the recognized phoneme sequences can be calculated, and the performance of speech recognition can be enhanced using the calculated results.
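The alignment-cost idea can be sketched as a dynamic-programming alignment in which each substitution cost is the negative log of a confusion probability, scaled by the reliability of the recognized phoneme. The scaling rule, the fixed insertion/deletion cost, the probability floor, and the toy confusion table are all assumptions; the abstract does not give the actual cost formula.

```python
import math


def alignment_cost(recognized, reference, confusion, ins_del_cost=3.0, floor=1e-3):
    """Align recognized (phoneme, reliability) pairs against a reference pronunciation.

    confusion[(rec, ref)] is an assumed P(recognized phoneme | reference phoneme).
    """
    n, m = len(recognized), len(reference)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + ins_del_cost
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + ins_del_cost
    for i in range(1, n + 1):
        phoneme, reliability = recognized[i - 1]
        for j in range(1, m + 1):
            p = confusion.get((phoneme, reference[j - 1]), floor)
            substitution = reliability * -math.log(p)   # unreliable phonemes count less
            dp[i][j] = min(dp[i - 1][j - 1] + substitution,
                           dp[i - 1][j] + ins_del_cost,
                           dp[i][j - 1] + ins_del_cost)
    return dp[n][m]


def recognize(recognized, lexicon, confusion):
    """Return the lexicon entry with the lowest alignment cost."""
    return min(lexicon, key=lambda w: alignment_cost(recognized, lexicon[w], confusion))
```

With a low-reliability first phoneme, a reference whose first phoneme differs from the recognized one is penalized only lightly, so a confident match on the remaining phonemes can decide the result.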
Abstract:
Provided are an apparatus and method for recognizing continuous speech using search space restriction based on phoneme recognition. In the apparatus and method, a search space can be primarily reduced by restricting connection words to be shifted at a boundary between words based on the phoneme recognition result. In addition, the search space can be secondarily reduced by rapidly calculating a degree of similarity between the connection word to be shifted and the phoneme recognition result using a phoneme code and shifting the corresponding phonemes to only connection words having degrees of similarity equal to or higher than a predetermined reference value. Therefore, the speed and performance of the speech recognition process can be improved in various speech recognition services.
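The second reduction step can be sketched as follows. The abstract does not define the "phoneme code", so this example assumes a bitmask with one bit per phoneme in the inventory and a Jaccard-style similarity computed with bit operations; the threshold and all names are hypothetical.

```python
def phoneme_code(phonemes, inventory):
    """Assumed code: a bitmask with one bit set per distinct phoneme."""
    code = 0
    for p in phonemes:
        code |= 1 << inventory[p]
    return code


def similarity(code_a, code_b):
    """Shared phonemes divided by all phonemes present in either code."""
    union = bin(code_a | code_b).count("1")
    return bin(code_a & code_b).count("1") / union if union else 0.0


def restrict_connection_words(recognized, lexicon, inventory, threshold=0.5):
    """Keep only words whose code is similar enough to the phoneme recognition result."""
    rec_code = phoneme_code(recognized, inventory)
    return [word for word, pron in lexicon.items()
            if similarity(rec_code, phoneme_code(pron, inventory)) >= threshold]
```

A bitmask ignores phoneme order and repetition, which is what makes the comparison cheap; the full decoder would still score the surviving candidates properly.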