Abstract:
A speech recognition method and apparatus, and a navigation system having the speech recognition apparatus, are provided. The speech recognition method includes capturing speech as a speech signal and extracting features from the speech signal; selecting candidates for a subword among the subwords of a word based on the extracted features and displaying the candidate subwords; selecting candidates for a next subword following the subword based on the selected candidates of the subword and displaying the candidates for the next subword; and determining whether the user has selected one of the candidates for the next subword and, if not, selecting candidates for the subwords following the next subword based on the series of subwords previously selected by the user and displaying the selected candidates.
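The candidate-narrowing step can be illustrated with a minimal sketch. It assumes each word in the recognition lexicon is stored as a tuple of subwords and covers only the part where the subwords already selected by the user restrict the next candidates; the acoustic scoring of the first subword is not modelled. The function name `next_subword_candidates` and the sample lexicon are hypothetical.

```python
def next_subword_candidates(lexicon, selected):
    """Return the distinct subwords that can follow the selected prefix.

    lexicon  -- iterable of words, each a tuple of subwords (assumed layout)
    selected -- subwords chosen so far, in order
    """
    prefix = tuple(selected)
    n = len(prefix)
    candidates = []
    for word in lexicon:
        # Keep only words that start with the selected subwords and continue.
        if len(word) > n and word[:n] == prefix:
            if word[n] not in candidates:
                candidates.append(word[n])
    return candidates


# Hypothetical place-name lexicon split into subwords.
LEXICON = [("seo", "ul"), ("seo", "cho"), ("bu", "san")]
```

With an empty selection the sketch offers every possible first subword; after the user picks `"seo"`, only `"ul"` and `"cho"` remain.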
Abstract:
An audio apparatus including a decorrelator for generating decorrelated signals by applying, to audio signals included in a multi-channel signal, a phase shifting value adjusted based on a correlation difference between the audio signals; and a speaker set including at least two speakers for outputting acoustic signals corresponding to the decorrelated signals.
Abstract:
Provided are a method and an apparatus for transforming a speech feature vector. The method includes extracting a feature vector required for speech recognition from a speech signal and transforming the extracted feature vector using an auto-associative neural network (AANN).
Abstract:
Provided are a multi-stage speech recognition apparatus and method. The multi-stage speech recognition apparatus includes a first speech recognition unit performing initial speech recognition on a feature vector, which is extracted from an input speech signal, and generating a plurality of candidate words; and a second speech recognition unit rescoring the candidate words, which are provided by the first speech recognition unit, using a temporal posterior feature vector extracted from the speech signal.
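The two-stage flow can be sketched as follows. This is only an illustration under stated assumptions: both stages are reduced to log-score lookups, the second-stage scorer stands in for a model over the temporal posterior feature vector, and the linear interpolation with `weight` is an assumed combination rule that the abstract does not specify. All names are hypothetical.

```python
def first_pass(scores, n_best):
    """Stage 1: keep the n_best words with the highest first-pass scores."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n_best]


def rescore(candidates, second_score, weight=0.5):
    """Stage 2: combine each candidate's score with a second-stage score.

    second_score -- callable mapping a word to its second-stage log score
    weight       -- assumed interpolation weight for the second stage
    """
    rescored = [
        (word, (1 - weight) * score + weight * second_score(word))
        for word, score in candidates
    ]
    # Return the (word, combined score) pair with the highest combined score.
    return max(rescored, key=lambda ws: ws[1])
```

Because only the short candidate list is rescored, the second stage can afford a more expensive model than the first.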
Abstract:
An apparatus and method for detecting a named-entity. The apparatus includes a candidate-named-entity extraction module that detects a candidate-named-entity based on an initial learning example and feature information regarding morphemes constituting an inputted sentence, the candidate-named-entity extraction module providing a tagged sentence including the detected candidate-named-entity; a storage module that stores information regarding a named-entity dictionary and a rule; and a learning-example-regeneration module for finally determining whether the candidate-named-entity included in the provided sentence is a valid named-entity, based on the named-entity dictionary and the rule, the learning-example-regeneration module providing the sentence as a learning example, based on a determination result, so that a probability of candidate-named-entity detection is gradually updated.
Abstract:
Disclosed are a user-adaptive speech recognition method and apparatus that control user confirmation of a recognition candidate using a new threshold value adapted to the user. The user-adaptive speech recognition method includes calculating a confidence score of a recognition candidate according to the result of speech recognition, setting a new threshold value adapted to the user based on a result of user confirmation of the recognition candidate and the confidence score of the recognition candidate, and outputting the corresponding recognition candidate as the result of the speech recognition if the calculated confidence score is higher than the new threshold value. Thus, the need for user confirmation of the result of speech recognition is reduced and the probability of speech recognition success is increased.
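A minimal sketch of the threshold adaptation follows. The abstract does not state how the new threshold is computed, so the rule used here (the midpoint between the mean confidence of candidates the user confirmed and the mean confidence of candidates the user rejected) is purely an assumption, as are the names `adapt_threshold` and `decide`.

```python
def adapt_threshold(history, default=0.5):
    """Derive a user-specific threshold from past confirmations.

    history -- list of (confidence, confirmed) pairs, where confirmed is True
               when the user accepted the recognition candidate
    """
    accepted = [c for c, ok in history if ok]
    rejected = [c for c, ok in history if not ok]
    if not accepted or not rejected:
        # Not enough evidence yet: fall back to the default threshold.
        return default
    mean_accepted = sum(accepted) / len(accepted)
    mean_rejected = sum(rejected) / len(rejected)
    return (mean_accepted + mean_rejected) / 2  # assumed update rule


def decide(confidence, threshold):
    """Output the candidate directly, or ask the user to confirm it."""
    return "output" if confidence > threshold else "confirm"
```

A candidate whose confidence clears the adapted threshold is output without a confirmation prompt, which is how the number of confirmations drops for a given user.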
Abstract:
A method and apparatus for improving the performance of voice recognition in a mobile device are provided. The method of recognizing a voice includes: monitoring the usage pattern of a user of a device for inputting a voice; selecting predetermined words from among words stored in the device based on the result of the monitoring, and storing the selected words; and recognizing a voice based on an acoustic model and the predetermined words. In this way, a voice can be recognized by predicting whom the user mainly calls. Also, by automatically modeling the device usage pattern of the user and applying the pattern to the vocabulary for voice recognition based on probabilities, the performance of voice recognition, as actually perceived by the user, can be enhanced.
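The idea can be sketched in a few lines, assuming the monitored usage pattern is a plain call log and the acoustic model is reduced to a table of log-likelihoods per name. Turning call counts into prior probabilities and adding their logarithm to the acoustic score is an assumed way of combining the two; the function names and data are hypothetical.

```python
import math
from collections import Counter


def select_vocabulary(call_log, size):
    """Pick the most frequently called names and their relative frequencies."""
    counts = Counter(call_log).most_common(size)
    total = sum(n for _, n in counts)
    return {name: n / total for name, n in counts}


def recognize(acoustic_loglik, vocabulary):
    """Choose the vocabulary word maximising acoustic score plus log prior."""
    return max(
        vocabulary,
        key=lambda w: acoustic_loglik.get(w, -math.inf) + math.log(vocabulary[w]),
    )
```

Names outside the selected vocabulary are never returned, so a rarely called contact cannot win on acoustic score alone in this sketch.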
Abstract:
A speech synthesis system for controlling, using a smoothing technique, a discontinuous distortion that occurs at the transition portion between concatenated phonemes, which are the speech units of a synthesized speech, comprising: a discontinuous distortion processing means adapted to predict, through a predetermined learning process, a discontinuity at the transition portion between concatenated samples of phonemes used for speech synthesis, and to control the discontinuity at the transition portion between the concatenated phonemes of the synthesized speech so that it is smoothed adaptively according to the degree of the predicted discontinuity. A smoothing filter smooths the synthesized speech so that the discontinuity degree of the synthesized speech follows the predicted discontinuity degree, according to a filter coefficient (a) that changes adaptively with the ratio of the predicted discontinuity degree to the real discontinuity degree. That is, since a discontinuity at a transition portion between concatenated phonemes of the synthesized speech (IN) is adaptively smoothed to follow that which occurs in the actually spoken sound, the synthesized speech (IN) can be approximated more closely to a real human voice.
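As a rough illustration of a coefficient driven by the ratio of predicted to real discontinuity, the sketch below adjusts a single pair of boundary values. It assumes the discontinuity is the absolute difference between the last value of one unit and the first value of the next, that the coefficient is the ratio capped at 1, and that both values are pulled toward their midpoint. None of these choices is stated in the abstract, and the name `smooth_boundary` is hypothetical.

```python
def smooth_boundary(left, right, predicted_gap):
    """Shrink the jump between two boundary values to the predicted size.

    left, right   -- boundary values of the two concatenated units
    predicted_gap -- discontinuity expected in natural speech (assumed given)
    """
    real_gap = abs(right - left)
    if real_gap == 0.0:
        return left, right  # nothing to smooth
    # Coefficient follows the ratio predicted / real; never amplify the jump.
    alpha = min(1.0, predicted_gap / real_gap)
    mid = (left + right) / 2
    return mid + alpha * (left - mid), mid + alpha * (right - mid)
```

When the predicted gap is at least as large as the real one, the coefficient is 1 and the boundary is left unchanged, so natural discontinuities are not flattened.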
Abstract:
An apparatus and method for detecting a voice activity period. The apparatus for detecting a voice activity period includes a domain conversion module that converts an input signal into a frequency domain signal frame by frame, each frame being obtained by dividing the input signal at predetermined intervals; a subtracted-spectrum-generation module that generates a spectral subtraction signal by subtracting a predetermined noise spectrum from the converted frequency domain signal; a modeling module that applies the spectral subtraction signal to a predetermined probability distribution model; and a speech-detection module that determines whether a speech signal is present in a current frame through a probability distribution calculated by the modeling module.
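The four modules can be lined up in a short sketch for a single frame. Several parts are assumptions rather than details from the abstract: a naive DFT stands in for the domain conversion, negative values after subtraction are floored at zero, and the "predetermined probability distribution model" is replaced by an exponential model of the residual energy in noise-only frames, with speech declared when the tail probability falls below `significance`. All function and parameter names are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Domain conversion: naive DFT magnitudes for bins 0..N/2."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_subtraction(spectrum, noise_spectrum):
    """Subtract the noise spectrum bin by bin, flooring at zero."""
    return [max(s - v, 0.0) for s, v in zip(spectrum, noise_spectrum)]


def noise_tail_probability(residual, noise_mean_energy):
    """P(energy >= observed) under an assumed exponential noise-only model."""
    energy = sum(r * r for r in residual) / len(residual)
    return math.exp(-energy / noise_mean_energy)


def is_speech(frame, noise_spectrum, noise_mean_energy=1.0, significance=0.05):
    """Speech detection: flag the frame if noise alone is an unlikely cause."""
    residual = spectral_subtraction(magnitude_spectrum(frame), noise_spectrum)
    return noise_tail_probability(residual, noise_mean_energy) < significance
```

A 16-sample sinusoid with a flat noise spectrum of 0.5 per bin is flagged as speech, while an all-zero frame is not. A practical system would use an FFT and a noise estimate updated over time.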
Abstract:
A method and an apparatus for selecting the vocabulary closest to an input speech from among lexicons stored in memory, wherein a centroid lexicon representing the lexicons belonging to a predetermined lexicon group is generated. The two lexicons having the longest distance between them in the lexicon group are selected using the centroid lexicon, and the node indicating the lexicon group branches based on the two selected lexicons. A node having low group similarity is selected from among the current terminal nodes, including the branch nodes, and the above procedure is repeated on the lexicon group indicated by the selected node.
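One branching step can be sketched as follows, assuming each lexicon is represented by a numeric feature vector and compared with Euclidean distance. The abstract does not say how the centroid is used to find the two most distant lexicons, so the sketch uses a common heuristic: take the lexicon farthest from the centroid, then the lexicon farthest from that one. This heuristic does not guarantee the true farthest pair. The function names are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of the group, standing in for the centroid lexicon."""
    dim = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / len(vectors) for i in range(dim))


def split_group(vectors):
    """Branch one node: pick two distant pivots and assign each member."""
    c = centroid(vectors)
    pivot_a = max(vectors, key=lambda v: math.dist(v, c))        # farthest from centroid
    pivot_b = max(vectors, key=lambda v: math.dist(v, pivot_a))  # farthest from pivot_a
    group_a = [v for v in vectors if math.dist(v, pivot_a) <= math.dist(v, pivot_b)]
    group_b = [v for v in vectors if math.dist(v, pivot_a) > math.dist(v, pivot_b)]
    return group_a, group_b
```

Repeating `split_group` on whichever terminal group is least cohesive would grow the tree node by node, in the spirit of the procedure described above.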