Abstract:
A knowledge-increasing method includes calculating the uncertainty of knowledge obtained from a neural network using an explicit memory, determining whether the knowledge is insufficient on the basis of the calculated uncertainty, obtaining additional data (learning data) to supplement the insufficient knowledge, and training the neural network with the additional data so that knowledge is increased autonomously.
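The abstract does not specify how uncertainty is calculated; a common stand-in is the predictive entropy of the network's softmax output. The sketch below (illustrative function names, hypothetical threshold) flags inputs whose uncertainty exceeds a threshold as places where knowledge is insufficient and more learning data should be acquired.

```python
import numpy as np

def predictive_entropy(probs):
    """Entropy of a softmax output; higher means more uncertain knowledge."""
    probs = np.clip(probs, 1e-12, 1.0)
    return -np.sum(probs * np.log(probs))

def select_for_acquisition(batch_probs, threshold):
    """Indices of inputs whose uncertainty exceeds the threshold,
    i.e. where knowledge is judged insufficient and additional
    learning data is needed."""
    return [i for i, p in enumerate(batch_probs)
            if predictive_entropy(p) > threshold]

confident = np.array([0.97, 0.01, 0.02])   # low entropy: knowledge sufficient
uncertain = np.array([0.40, 0.35, 0.25])   # high entropy: acquire more data
need_data = select_for_acquisition([confident, uncertain], threshold=0.5)
```

The selected indices would drive the data-acquisition and retraining loop the abstract describes.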
Abstract:
Provided are an apparatus and method for verifying an utterance based on multi-event detection information in a natural language speech recognition system. The apparatus includes a noise processor configured to suppress noise in an input speech signal, a feature extractor configured to extract speech features from the noise-processed speech data, an event detector configured to detect a plurality of events occurring in the speech data using the noise-processed data and the extracted feature data, a decoder configured to perform speech recognition on the extracted feature data using a plurality of preset speech recognition models, and an utterance verifier configured to calculate confidence measures in units of words and sentences using the information on the plurality of events detected by the event detector and a preset utterance verification model, and to perform utterance verification according to the calculated confidence measures.
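The abstract describes combining per-event detection scores into word- and sentence-level confidence measures. A minimal sketch of that combination, using a plain mean in place of the trained utterance-verification model (all names and the threshold are illustrative assumptions):

```python
import numpy as np

def word_confidence(event_scores):
    """Combine per-event detector scores for one word into a single
    word-level confidence; a simple mean stands in for the preset
    utterance-verification model."""
    return float(np.mean(event_scores))

def verify_sentence(word_event_scores, threshold=0.5):
    """Sentence-level verification: average the word confidences and
    accept the utterance if the result clears the threshold."""
    word_conf = [word_confidence(s) for s in word_event_scores]
    sentence_conf = float(np.mean(word_conf))
    return sentence_conf, sentence_conf >= threshold

# Two words, each with two event-detector scores.
conf, accepted = verify_sentence([[0.9, 0.8], [0.7, 0.6]])
```

In the actual system the combination weights would come from the trained verification model rather than an unweighted mean.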
Abstract:
A feature compensation apparatus includes a feature extractor configured to extract corrupt speech features from a corrupt speech signal, containing additive noise, that consists of two or more frames; a noise estimator configured to estimate noise features based on the extracted corrupt speech features and the compensated speech features; a probability calculator configured to calculate the correlation between adjacent frames of the corrupt speech signal; and a speech feature compensator configured to generate compensated speech features by removing the noise features from the extracted corrupt speech features while taking into consideration the correlation between adjacent frames and the estimated noise features, and to transmit the generated compensated speech features back to the noise estimator.
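The compensator/estimator feedback loop can be sketched with a much simpler scheme than the patented one: subtract the current noise estimate, smooth with the previous compensated frame to model adjacent-frame correlation, and refine the noise estimate from the residual. Smoothing constants are assumed values.

```python
import numpy as np

def compensate(corrupt, noise_init, alpha=0.8):
    """Frame-by-frame feature compensation sketch: subtract the estimated
    noise feature, smooth with the previous compensated frame (adjacent-frame
    correlation), and feed the compensated features back to refine the noise
    estimate. Simplified, not the patented estimator."""
    noise = noise_init.copy()
    prev = None
    out = []
    for frame in corrupt:
        clean = frame - noise
        if prev is not None:
            clean = alpha * clean + (1 - alpha) * prev   # adjacent-frame correlation
        noise = 0.9 * noise + 0.1 * (frame - clean)      # feedback to noise estimator
        out.append(clean)
        prev = clean
    return np.array(out)

# Clean speech is silence; constant additive noise of 1.0 per feature.
corrupt = np.ones((5, 1))
clean = compensate(corrupt, noise_init=np.array([1.0]))
```

With a correct initial noise estimate the compensated features recover the silent signal exactly.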
Abstract:
Provided are a signal-processing-algorithm-integrated deep neural network (DNN)-based speech recognition apparatus and a learning method thereof. A computer-implementable model parameter learning method in the DNN-based speech recognition apparatus includes converting a signal processing algorithm, which extracts a feature parameter from a time-domain speech input signal, into a signal-processing DNN; fusing the signal-processing DNN with a classification DNN; and learning model parameters in the deep learning model in which the signal-processing DNN and the classification DNN are fused.
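The key idea is that a fixed feature-extraction step (e.g. a filterbank) can be rewritten as network layers, so it and the classifier become one trainable model. A toy forward pass under that assumption (all sizes and initializations hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# "Signal-processing DNN": a linear layer whose weights are initialized
# from a (hypothetical) mel-style filterbank matrix, turning feature
# extraction into ordinary trainable parameters.
n_fft_bins, n_filters, n_classes = 64, 8, 4
filterbank = np.maximum(0.0, rng.standard_normal((n_filters, n_fft_bins)))

# Classification DNN: one hidden layer plus a softmax output.
W1 = rng.standard_normal((16, n_filters)) * 0.1
W2 = rng.standard_normal((n_classes, 16)) * 0.1

def fused_forward(power_spectrum):
    feat = np.log1p(filterbank @ power_spectrum)   # signal-processing DNN
    h = np.tanh(W1 @ feat)                         # classification DNN
    logits = W2 @ h
    e = np.exp(logits - logits.max())
    return e / e.sum()

p = fused_forward(np.abs(rng.standard_normal(n_fft_bins)))
# Because every stage is a network layer, gradients can flow from the
# classifier back into the filterbank weights during joint training.
```

Joint training of the fused model is what distinguishes this from a fixed front-end feeding a separate classifier.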
Abstract:
Provided are a neural network memory computing system and method. The system includes a first processor configured to learn a sense-making process on the basis of sense-making multimodal training data stored in a database, receive multiple modalities, and output a sense-making result on the basis of what has been learned, and a second processor configured to generate a sense-making training set that increases the knowledge available to the first processor for learning the sense-making process, and to provide the generated training set to the first processor.
Abstract:
Provided are an end-to-end method and system for grading foreign language fluency, in which the multi-step intermediate processing of related-art fluency grading is omitted. The method grades the foreign language fluency of a non-native speaker directly from a raw non-native speech signal: the raw speech is input to a convolutional neural network (CNN), the filter coefficients of the CNN are trained against fluency scores assigned to the raw signals by human raters so as to generate a foreign language fluency grading model, and the grading model is then used to grade any non-native speech signal newly input to the trained CNN and to output the grading result.
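An end-to-end grader of this shape can be sketched as a 1-D convolution over the raw waveform followed by pooling and a linear output. Filter counts, kernel size, and the pooling choice below are assumptions, and the random weights stand in for coefficients that would be trained against human-rater scores (e.g. with an MSE loss).

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy end-to-end grader: learnable 1-D filters over the raw waveform,
# ReLU + global average pooling, then a linear layer producing one score.
kernels = rng.standard_normal((4, 32)) * 0.1   # 4 hypothetical filters
w_out = rng.standard_normal(4) * 0.1
b_out = 0.0

def grade(waveform):
    feats = []
    for k in kernels:
        resp = np.convolve(waveform, k, mode="valid")
        feats.append(np.mean(np.maximum(resp, 0.0)))  # ReLU + average pool
    return float(w_out @ np.array(feats) + b_out)

score = grade(rng.standard_normal(16000))  # e.g. one second at 16 kHz
```

Training would adjust `kernels`, `w_out`, and `b_out` so that `score` matches the human fluency grade for each utterance.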
Abstract:
The present invention relates to a method and apparatus for improving spontaneous speech recognition performance. The method extracts a phase feature as well as a magnitude feature from the voice signal transformed to the frequency domain, detects syllabic nuclei on the basis of a deep neural network using a multi-frame output, determines the speaking rate by dividing the number of syllabic nuclei by the voice-section interval detected by a voice detector, calculates a length variation or overlap factor according to the speaking rate, and performs cepstrum length normalization or time-scale modification so that the voice length is appropriate for the acoustic model.
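The rate and length-variation arithmetic is simple enough to state directly. The reference rate below is an assumed typical value, not one given in the abstract:

```python
def speaking_rate(num_nuclei, voiced_interval_s):
    """Syllables per second: syllabic-nucleus count divided by the
    voice-section interval reported by the voice detector."""
    return num_nuclei / voiced_interval_s

def length_variation(rate, reference_rate=4.0):
    """Stretch factor for time-scale modification so the utterance
    matches the rate the acoustic model expects (reference_rate is
    an assumed typical value)."""
    return rate / reference_rate

rate = speaking_rate(12, 2.0)     # 12 nuclei in 2 s of voiced speech
factor = length_variation(rate)   # fast speech: stretch the signal
```

A factor above 1 lengthens the fast utterance until its effective rate matches the reference.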
Abstract:
Provided are a method of automatically classifying speaking rate and a speech recognition system using the method. The system includes a speech recognizer configured to extract word lattice information by performing speech recognition on an input speech signal, a speaking rate estimator configured to estimate word-specific speaking rates from the word lattice information, a speaking rate normalizer configured to normalize a word-specific speaking rate to a normal speaking rate when it deviates from a preset range, and a rescoring section configured to rescore the speech signal whose speaking rate has been normalized.
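The normalization rule can be sketched as a range check over the word-specific rates; the thresholds and the normal rate below are illustrative, not values from the abstract:

```python
def normalize_rates(word_rates, low=3.0, high=7.0, normal=5.0):
    """Replace any word-specific speaking rate outside the preset
    [low, high] range with the normal rate; in the full system the
    corresponding audio would be time-scaled to match before the
    rescoring pass. Thresholds are illustrative."""
    return [r if low <= r <= high else normal for r in word_rates]

# Second word too fast, third too slow; both get normalized.
normalized = normalize_rates([4.5, 9.2, 2.1, 5.0])
```

Only out-of-range words are modified, so normally paced words pass through to rescoring unchanged.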
Abstract:
Provided are a method and apparatus for online Bayesian few-shot learning, in which multi-domain online learning and few-shot learning are integrated when the domains of tasks having data are given sequentially.
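The abstract gives no model details; one standard way to combine online updating with few-shot data is a conjugate Gaussian posterior over a class prototype, where each task's posterior becomes the next task's prior. A minimal sketch under that assumption:

```python
def bayes_update(mu, tau2, x, sigma2=1.0):
    """One online conjugate-Gaussian update of a class-prototype
    posterior N(mu, tau2) given a new support example x with assumed
    observation variance sigma2. Reusing the posterior as the next
    prior combines online learning with few-shot data."""
    tau2_new = 1.0 / (1.0 / tau2 + 1.0 / sigma2)
    mu_new = tau2_new * (mu / tau2 + x / sigma2)
    return mu_new, tau2_new

mu, tau2 = 0.0, 10.0                 # broad prior before any data
for x in [1.0, 1.2, 0.8]:            # a 3-shot support set arriving online
    mu, tau2 = bayes_update(mu, tau2, x)
```

After the three shots the posterior mean has moved toward the sample mean and the variance has shrunk, reflecting accumulated knowledge.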
Abstract:
Provided are an apparatus and method for reducing the number of deep neural network (DNN) model parameters, the apparatus including a memory in which a program for DNN model parameter reduction is stored, and a processor configured to execute the program, wherein the processor represents each hidden layer of the DNN model using a full-rank decomposed matrix, applies a sparsity constraint during training to drive diagonal matrix values to zero, and determines the rank of each hidden layer of the DNN model according to the degree of the sparsity constraint.
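The decomposition step can be illustrated with an SVD, W = U·diag(d)·V: the sparsity constraint on the diagonal values corresponds to zeroing small singular values, and the count of survivors fixes the layer's rank. Here a hard threshold (assumed value) stands in for the L1-style constraint applied during training:

```python
import numpy as np

def reduce_layer(W, rel_threshold=0.05):
    """Full-rank decomposition W = U diag(d) V of one hidden-layer
    weight matrix. Training with a sparsity constraint drives small
    diagonal values to zero; here we simply threshold them, and the
    number of surviving values determines the layer's rank."""
    U, d, V = np.linalg.svd(W, full_matrices=False)
    d = np.where(d < rel_threshold * d.max(), 0.0, d)
    rank = int(np.count_nonzero(d))
    return U, d, V, rank

rng = np.random.default_rng(2)
# A layer whose weights happen to be (numerically) rank 3.
W = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 20))
U, d, V, rank = reduce_layer(W)
```

Storing only the surviving columns of U and rows of V replaces the 20x20 matrix with two thin factors, which is the parameter reduction the abstract describes.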