Abstract:
A feature compensation apparatus includes a feature extractor configured to extract corrupt speech features from a corrupt speech signal that contains additive noise and consists of two or more frames; a noise estimator configured to estimate noise features based on the extracted corrupt speech features and compensated speech features; a probability calculator configured to calculate a correlation between adjacent frames of the corrupt speech signal; and a speech feature compensator configured to generate compensated speech features by eliminating the noise features from the extracted corrupt speech features, taking into consideration both the correlation between adjacent frames of the corrupt speech signal and the estimated noise features, and to transmit the generated compensated speech features to the noise estimator.
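The feedback structure described above (compensated features flow back into the noise estimator, and adjacent-frame correlation smooths the result) can be sketched as a simple per-frame loop. This is an illustrative toy, not the patented algorithm; all function names, the smoothing weights, and the update rule are assumptions.

```python
# Illustrative sketch (not the patented method): frame-wise feature
# compensation where an estimated noise feature is subtracted from each
# corrupt feature, the result is smoothed with the adjacent (previous)
# frame, and the compensated output is fed back to update the noise estimate.

def compensate_features(corrupt_frames, alpha=0.9):
    """corrupt_frames: list of per-frame feature values (floats)."""
    noise_est = corrupt_frames[0]            # initialize noise from first frame
    compensated = []
    prev = None
    for frame in corrupt_frames:
        clean = max(frame - noise_est, 0.0)  # remove estimated noise (floored)
        if prev is not None:
            # exploit correlation with the adjacent frame via smoothing
            clean = 0.5 * clean + 0.5 * prev
        compensated.append(clean)
        prev = clean
        # feedback path: the residual updates the running noise estimate
        noise_est = alpha * noise_est + (1 - alpha) * (frame - clean)
    return compensated
```

The feedback coefficient `alpha` and the 0.5/0.5 smoothing are placeholders for whatever the actual apparatus learns or computes probabilistically.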
Abstract:
Provided are a signal processing algorithm-integrated deep neural network (DNN)-based speech recognition apparatus and a learning method thereof. A model parameter learning method in a DNN-based speech recognition apparatus implementable by a computer includes converting a signal processing algorithm for extracting a feature parameter from a time-domain speech input signal into a signal processing DNN, fusing the signal processing DNN with a classification DNN, and learning a model parameter in a deep learning model in which the signal processing DNN and the classification DNN are fused.
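The fusion idea reduces to expressing both stages as composable parameterized functions so that one loss can train them jointly. A minimal sketch, with hypothetical names and stand-in transforms in place of real network layers:

```python
# Hypothetical sketch of the fusion concept: the signal-processing front end
# and the classifier are both parameterized, composable stages, so gradients
# from a single loss could flow through both. The transforms below are
# stand-ins, not the patent's actual layers.

def signal_processing_stage(waveform, w):
    # stand-in for a feature-extraction algorithm converted into DNN layers
    return [w * x for x in waveform]

def classifier_stage(features, v):
    # stand-in for the classification DNN: a single scalar score
    return v * sum(features)

def fused_forward(waveform, params):
    w, v = params
    return classifier_stage(signal_processing_stage(waveform, w), v)
```

In a real implementation both stages would be differentiable network layers whose parameters `w` and `v` are updated together by backpropagation.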
Abstract:
Disclosed are an apparatus and system for a user interface. The apparatus comprises a body unit including a groove corresponding to the structure of the oral cavity and operable to be mounted on the upper part of the oral cavity; a user input unit in a part of the body unit that receives a signal from the user's tongue; a communication unit that transmits the signal received by the user input unit; and a charging unit that supplies electrical energy generated from vibration or pressure caused by movement of the user's tongue.
Abstract:
Provided are a method and apparatus for estimating a user's requirement through a neural network capable of reading and writing a working memory, and for providing fashion coordination knowledge appropriate to that requirement through the neural network using a long-term memory; the neural network employs an explicit memory in order to provide the fashion coordination knowledge accurately. The apparatus includes a language embedding unit for embedding a user's question and a previously created answer to acquire a digitized embedding vector; a fashion coordination knowledge creation unit for creating fashion coordination through the neural network having the explicit memory by using the embedding vector as an input; and a dialog creation unit for creating dialog content for configuring the fashion coordination through the neural network having the explicit memory by using the fashion coordination knowledge and the embedding vector as inputs.
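The core explicit-memory operation is an attention-weighted read: the embedding vector queries memory slots and retrieves a blend of their contents. A minimal sketch under that assumption (the abstract does not specify the read mechanism; all names here are illustrative):

```python
import math

# Illustrative explicit-memory read (dot-product attention), one plausible
# mechanism for retrieving long-term fashion knowledge with an embedding
# vector as the query. Not taken from the patent itself.

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [x / s for x in e]

def memory_read(query, memory_keys, memory_values):
    # similarity of the query embedding to each memory slot's key
    scores = [sum(q * k for q, k in zip(query, key)) for key in memory_keys]
    weights = softmax(scores)
    dim = len(memory_values[0])
    # weighted blend of the stored values
    return [sum(w * v[d] for w, v in zip(weights, memory_values))
            for d in range(dim)]
```

A working memory would additionally support writes (updating `memory_values` during the dialog), which is omitted here for brevity.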
Abstract:
Provided are a sentence embedding method and apparatus based on subword embedding and skip-thoughts. To integrate skip-thought sentence embedding learning with a subword embedding technique, a skip-thought sentence embedding learning method based on subword embedding is provided, together with a multitask learning method that learns subword embedding and skip-thought sentence embedding simultaneously; both apply intra-sentence contextual information to subword embedding during subword embedding learning. This makes it possible to apply a bag-of-words-style sentence embedding approach to agglutinative languages such as Korean. Also, because skip-thought sentence embedding learning is integrated with the subword embedding technique, intra-sentence contextual information can be used during subword embedding learning. The proposed model minimizes the training parameters added for sentence embedding, so that most training results are accumulated in the subword embedding parameters.
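The bag-of-words-style composition over subwords can be sketched as follows: each word is split into character n-grams (FastText-style), and the sentence vector is the average of the subword vectors. This is a generic illustration of subword-based sentence embedding, not the patent's exact model; the n-gram length and dimensionality are arbitrary.

```python
# Illustrative bag-of-subwords sentence embedding: decompose each word into
# character n-grams with boundary markers, look up each n-gram's vector,
# and average. Useful for agglutinative languages where word-level lookup
# fails on unseen surface forms.

def char_ngrams(word, n=3):
    padded = f"<{word}>"                     # mark word boundaries
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def sentence_embedding(sentence, subword_vecs, dim=4):
    vecs = []
    for word in sentence.split():
        for gram in char_ngrams(word):
            # unseen n-grams fall back to a zero vector
            vecs.append(subword_vecs.get(gram, [0.0] * dim))
    if not vecs:
        return [0.0] * dim
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]
```

Skip-thought training would then ask this sentence vector to predict the neighboring sentences, pushing contextual information down into the shared subword table.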
Abstract:
The present invention relates to a device for simultaneous interpretation based on real-time extraction of interpretation units, the device including a voice recognition module configured to recognize voice units, such as sentence units or translation units, from vocalized speech that is input in real time, a real-time interpretation unit extraction module configured to form one or more of the voice units into an interpretation unit, and a real-time interpretation module configured to perform an interpretation task for each interpretation unit formed by the real-time interpretation unit extraction module.
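The extraction step amounts to buffering recognized units until a boundary cue closes an interpretation unit. A minimal sketch, assuming punctuation as the boundary cue (the actual device's segmentation criteria are not given in the abstract):

```python
# Hypothetical real-time interpretation-unit extraction: recognized voice
# units are buffered until an end-of-sentence cue (here, punctuation) closes
# an interpretation unit, which would then be handed to the interpreter.

def extract_units(voice_units):
    units, buffer = [], []
    for token in voice_units:
        buffer.append(token)
        if token.endswith((".", "?", "!")):  # boundary cue closes the unit
            units.append(" ".join(buffer))
            buffer = []
    if buffer:                               # flush a trailing partial unit
        units.append(" ".join(buffer))
    return units
```

In a real simultaneous system the cue would more likely come from prosody, pauses, or a trained boundary classifier rather than punctuation.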
Abstract:
An incremental self-learning based dialogue apparatus for dialogue knowledge includes a dialogue processing unit configured to determine an intention of a user utterance by using a knowledge base and to perform processing or generate a response suitable for that intention, a dialogue establishment unit configured to automatically learn user intentions stored in an intention-annotated learning corpus, store information about the learned intentions in the knowledge base, and edit and manage the knowledge base and the intention-annotated learning corpus, and a self-knowledge augmentation unit configured to store a log of dialogues performed by the dialogue processing unit, detect and classify errors in the stored dialogue log, automatically tag a user intention for each detected and classified error, and store the tagged user intention in the intention-annotated learning corpus.
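The self-knowledge augmentation loop can be sketched as: scan the dialogue log, treat low-confidence turns as error cases, re-tag them, and append everything to the intention-annotated corpus. The confidence threshold and the `retag` callback are assumptions for illustration; the abstract does not state how errors are detected.

```python
# Illustrative self-learning loop (names hypothetical): log entries whose
# recognition confidence falls below a threshold are treated as detected
# errors, re-tagged with a corrected intention, and added to the
# intention-annotated learning corpus alongside the confident entries.

def augment_corpus(dialogue_log, corpus, retag, threshold=0.5):
    for utterance, intention, confidence in dialogue_log:
        if confidence < threshold:           # detected error case
            corpus.append((utterance, retag(utterance)))
        else:
            corpus.append((utterance, intention))
    return corpus
```

The enlarged corpus then feeds the dialogue establishment unit's next training pass, closing the incremental loop.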
Abstract:
Provided is an apparatus for large vocabulary continuous speech recognition (LVCSR) based on a context-dependent deep neural network hidden Markov model (CD-DNN-HMM) algorithm. The apparatus may include an extractor configured to extract acoustic model-state level information corresponding to an input speech signal from a training data model set using at least one of a first feature vector based on a gammatone filterbank signal analysis algorithm and a second feature vector based on a bottleneck algorithm, and a speech recognizer configured to provide a result of recognizing the input speech signal based on the extracted acoustic model-state level information.
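The gammatone filterbank underlying the first feature vector places its filters at center frequencies spaced on the ERB (equivalent rectangular bandwidth) scale. The standard Glasberg and Moore formula makes this concrete; the filter count and frequency range below are illustrative, not the patent's settings.

```python
import math

# ERB-scale center frequencies for a gammatone filterbank, using the
# standard Glasberg & Moore constants. A sketch of the analysis front end
# behind the abstract's first feature vector; parameters are illustrative.

def erb_center_freqs(n_filters, f_low=100.0, f_high=8000.0):
    ear_q, min_bw = 9.26449, 24.7            # Glasberg & Moore constants

    def hz_to_erb(f):
        # frequency in Hz -> ERB-rate (number of ERBs below f)
        return ear_q * math.log(1.0 + f / (ear_q * min_bw))

    def erb_to_hz(e):
        # inverse mapping: ERB-rate -> frequency in Hz
        return (math.exp(e / ear_q) - 1.0) * ear_q * min_bw

    lo, hi = hz_to_erb(f_low), hz_to_erb(f_high)
    step = (hi - lo) / (n_filters - 1)
    # linear spacing on the ERB scale -> perceptually spaced Hz values
    return [erb_to_hz(lo + i * step) for i in range(n_filters)]
```

Each center frequency would parameterize one gammatone filter; the filterbank outputs are then pooled per frame to produce the acoustic feature vector.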
Abstract:
Provided are an apparatus and method for providing a personal assistant service based on automatic translation. The apparatus for providing a personal assistant service based on automatic translation includes an input section configured to receive a command of a user, a memory in which a program for providing a personal assistant service according to the command of the user is stored, and a processor configured to execute the program. The processor updates at least one of a speech recognition model, an automatic interpretation model, and an automatic translation model on the basis of an intention of the command of the user using a recognition result of the command of the user and provides the personal assistant service on the basis of an automatic translation call.
Abstract:
Provided are an end-to-end method and system for grading foreign language fluency, in which the multi-step intermediate process of grading foreign language fluency in the related art is omitted. The method grades the foreign language fluency of a non-native speaker from a raw non-native speech signal, and includes inputting the raw speech signal to a convolutional neural network (CNN), training the filter coefficients of the CNN on fluency grading scores assigned by human raters to the raw signals so as to generate a foreign language fluency grading model, and grading a newly input non-native speech signal with the trained CNN by using the foreign language fluency grading model to output a grading result.
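The front end this implies is a learnable 1-D convolution applied directly to the raw waveform, pooled into a score. A bare-bones sketch of that pipeline (the filter values, pooling choice, and linear read-out are placeholders, not the patented architecture):

```python
# Minimal sketch of an end-to-end raw-waveform grader: a 1-D convolution
# over the raw samples, global max pooling, and a linear read-out producing
# a single fluency score. Illustrative only; a real CNN would stack many
# learned filters and layers.

def conv1d(signal, kernel):
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def grade_fluency(waveform, kernel, weight=1.0):
    feature_map = conv1d(waveform, kernel)
    pooled = max(feature_map)                # global max pooling
    return weight * pooled                   # linear read-out as the grade
```

Training would adjust `kernel` and `weight` to minimize the gap between this output and the human raters' scores, which is exactly what removes the hand-crafted intermediate feature steps.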