Abstract:
A smart watch in accordance with an embodiment of the present invention comprises: a first smart member configured to receive a voice signal sent from a mobile terminal, transform the input voice of a user to a voice signal, and send the voice signal to the mobile terminal while in talk mode; and a second smart member configured to input a control command about the talk mode into the first smart member, and transform the voice signal to voice and output the voice.
Abstract:
An apparatus for extracting features for speech recognition in accordance with the present invention includes: a frame forming portion configured to separate input speech signals in frame units having a prescribed size; a static feature extracting portion configured to extract a static feature vector for each frame of the speech signals; a dynamic feature extracting portion configured to extract a dynamic feature vector representing a temporal variance of the extracted static feature vector by use of a basis function or a basis vector; and a feature vector combining portion configured to combine the extracted static feature vector with the extracted dynamic feature vector to configure a feature vector stream.
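The four portions described above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the abstract does not say which static features or which basis are used, so the log-energy and zero-crossing features, the five-tap regression basis, and all function names below are hypothetical choices.

```python
import math

def make_frames(signal, frame_size, hop):
    """Frame forming: split the samples into fixed-size frames."""
    return [signal[i:i + frame_size]
            for i in range(0, len(signal) - frame_size + 1, hop)]

def static_features(frame):
    """Toy static feature vector (assumed): [log energy, zero-crossing rate]."""
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return [math.log(energy + 1e-10), crossings / (len(frame) - 1)]

def dynamic_features(statics, basis):
    """Project each static feature's trajectory over time onto a basis vector."""
    half = len(basis) // 2
    last = len(statics) - 1
    out = []
    for t in range(len(statics)):
        vec = []
        for d in range(len(statics[0])):
            acc = 0.0
            for k, w in enumerate(basis):
                idx = min(max(t + k - half, 0), last)  # replicate edge frames
                acc += w * statics[idx][d]
            vec.append(acc)
        out.append(vec)
    return out

def extract(signal, frame_size=4, hop=2, basis=(-0.2, -0.1, 0.0, 0.1, 0.2)):
    """Combine static and dynamic vectors into one feature vector per frame."""
    statics = [static_features(f) for f in make_frames(signal, frame_size, hop)]
    dynamics = dynamic_features(statics, basis)
    return [s + d for s, d in zip(statics, dynamics)]
```

The assumed basis is the common least-squares slope estimator over five frames, so a static feature that rises by one unit per frame yields a dynamic value of about 1 away from the edges. Any other basis function or basis vector could be substituted through the `basis` argument.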
Abstract:
Provided are an apparatus and method for verifying an utterance based on multi-event detection information in a natural language speech recognition system. The apparatus includes a noise processor configured to process noise of an input speech signal, a feature extractor configured to extract features of speech data obtained through the noise processing, an event detector configured to detect events of a plurality of speech features occurring in the speech data using the noise-processed data and data of the extracted features, a decoder configured to perform speech recognition using a plurality of preset speech recognition models for the extracted feature data, and an utterance verifier configured to calculate confidence measurement values in units of words and sentences using information on the plurality of events detected by the event detector and a preset utterance verification model, and to perform utterance verification according to the calculated confidence measurement values.
Abstract:
Provided is an apparatus for large vocabulary continuous speech recognition (LVCSR) based on a context-dependent deep neural network hidden Markov model (CD-DNN-HMM) algorithm. The apparatus may include an extractor configured to extract acoustic model-state level information corresponding to an input speech signal from a training data model set using at least one of a first feature vector based on a gammatone filterbank signal analysis algorithm and a second feature vector based on a bottleneck algorithm, and a speech recognizer configured to provide a result of recognizing the input speech signal based on the extracted acoustic model-state level information.
Abstract:
Provided are an apparatus and method for linearly approximating a deep neural network (DNN) model which is a non-linear function. In general, a DNN model shows good performance in generation or classification tasks. However, the DNN fundamentally has non-linear characteristics, and therefore it is difficult to interpret how a result from inputs given to a black box model has been derived. To solve this problem, linear approximation of a DNN is proposed. The method for linearly approximating a DNN model includes 1) converting a neuron constituting a DNN into a polynomial, and 2) classifying the obtained polynomial as a polynomial of input signals and a polynomial of weights.
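One possible reading of the two steps, sketched for a single one-input neuron: the activation is replaced by a truncated Taylor polynomial, and the result is regrouped so that each coefficient is a polynomial of the weights and each term is a monomial of the input. The abstract does not state the activation or the expansion used, so the `tanh` activation, the third-order expansion around zero, and the function names are assumptions.

```python
import math

def neuron(x, w, b):
    """Original non-linear neuron (tanh activation is an assumption)."""
    return math.tanh(w * x + b)

def weight_polynomials(w, b):
    """Coefficients of 1, x, x^2, x^3 after expanding z - z^3/3 with z = w*x + b.

    Each coefficient depends only on the weights, not on the input.
    """
    return [b - b**3 / 3, w - w * b**2, -(w**2) * b, -(w**3) / 3]

def input_monomials(x):
    """Monomials of the input signal, independent of the weights."""
    return [1.0, x, x**2, x**3]

def approx_neuron(x, w, b):
    """Polynomial approximation: dot product of weight terms and input terms."""
    return sum(c * m for c, m in zip(weight_polynomials(w, b), input_monomials(x)))
```

With `w = 0.5`, `b = 0.1`, and `x = 0.2`, the pre-activation is 0.2 and the polynomial agrees with `tanh` to within about 1e-4. The approximation degrades as the pre-activation moves away from zero, which is a property of the truncated expansion chosen here.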
Abstract:
Provided are a signal processing algorithm-integrated deep neural network (DNN)-based speech recognition apparatus and a learning method thereof. A computer-implementable model parameter learning method for the DNN-based speech recognition apparatus includes converting a signal processing algorithm, which extracts a feature parameter from a time-domain speech input signal, into a signal processing DNN; fusing the signal processing DNN with a classification DNN; and learning the model parameters of the deep learning model in which the signal processing DNN and the classification DNN are fused.
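To illustrate what converting a signal processing algorithm into a network layer could look like, the sketch below expresses a power spectrum as two fixed-weight linear layers followed by a squaring non-linearity. The abstract does not name the algorithm that is converted, so the choice of the discrete Fourier transform and every function name here are assumptions. Once written this way, the weights could in principle be treated as trainable parameters alongside a classification network.

```python
import cmath
import math

def dft_layer_weights(n):
    """Weight matrices that reproduce the real and imaginary parts of an n-point DFT."""
    cos_w = [[math.cos(2 * math.pi * k * t / n) for t in range(n)] for k in range(n)]
    sin_w = [[-math.sin(2 * math.pi * k * t / n) for t in range(n)] for k in range(n)]
    return cos_w, sin_w

def linear(weights, x):
    """Bias-free linear layer: one dot product per output unit."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def power_spectrum_layer(x, cos_w, sin_w):
    """Two linear layers followed by a squaring activation and a sum."""
    re = linear(cos_w, x)
    im = linear(sin_w, x)
    return [r * r + i * i for r, i in zip(re, im)]

def reference_power_spectrum(x):
    """Direct DFT power spectrum, used only to check the layer version."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) ** 2
            for k in range(n)]
```

For the input `[1.0, 0.0, -1.0, 0.0]`, both versions give a power spectrum of approximately `[0, 4, 0, 4]`, so the layer form starts out equivalent to the hand-designed algorithm before any training changes its weights.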
Abstract:
An apparatus for detecting an end point using decoding information includes: an end point detector configured to extract a speech signal from an acoustic signal received from outside and detect end points of the speech signal; a decoder configured to decode the speech signal; and an end point discriminator configured to extract reference information serving as a standard of actual end point discrimination from decoding information generated during the decoding process of the decoder, and discriminate an actual end point among the end points detected by the end point detector based on the extracted reference information.
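The two-stage idea can be sketched as follows: an energy-based detector proposes candidate end points, and information from the decoder selects the actual one. The abstract does not specify the detection method or the form of the reference information, so the energy threshold, the silence-run rule, and the per-frame `hypothesis_complete` flag are all hypothetical.

```python
def candidate_end_points(energies, threshold, min_silence):
    """Propose an end point each time a silence run after speech reaches min_silence frames."""
    candidates = []
    seen_speech = False
    run = 0
    for t, energy in enumerate(energies):
        if energy >= threshold:
            seen_speech = True
            run = 0
        elif seen_speech:
            run += 1
            if run == min_silence:
                candidates.append(t)
    return candidates

def actual_end_point(candidates, hypothesis_complete):
    """Pick the first candidate at which the decoder reports a complete hypothesis.

    hypothesis_complete[t] is an assumed per-frame flag derived from decoding
    information; returns None if no candidate qualifies.
    """
    for t in candidates:
        if hypothesis_complete[t]:
            return t
    return None
```

With frame energies `[0, 5, 5, 0, 0, 5, 5, 0, 0, 0]`, a threshold of 1, and a minimum silence of 2 frames, the candidates are frames 4 and 8. If the decoder flag is true only from frame 8 onward, the mid-utterance pause at frame 4 is rejected and frame 8 is chosen.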