Abstract:
Disclosed are a speech recognition method and apparatus. The method includes two recognition processes: a first recognition process performed using an acoustic model and a language model, and a second recognition process performed without distinguishing between the acoustic model and the language model when an accuracy of a result of the first recognition process does not meet a threshold. The apparatus includes a processor configured to acquire a first text from a speech sequence using an acoustic model and a language model, determine whether an accuracy of the first text meets a threshold, and, in response to the accuracy of the first text being below the threshold, acquire a second text from the first text based on a parameter generated in acquiring the first text.
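The threshold-gated control flow described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function names, the callable signatures, and the toy passes are all hypothetical, and real first and second passes would be trained models.

```python
from typing import Callable, Dict, Tuple

# Hypothetical signatures (not from the source):
# the first pass returns (text, accuracy score, parameters produced along the way);
# the second pass refines the first text using those parameters.
FirstPass = Callable[[bytes], Tuple[str, float, Dict[str, float]]]
SecondPass = Callable[[str, Dict[str, float]], str]


def recognize(speech: bytes, first_pass: FirstPass,
              second_pass: SecondPass, threshold: float) -> str:
    """Return the first text if it meets the threshold, else the second text."""
    first_text, accuracy, params = first_pass(speech)
    if accuracy >= threshold:
        return first_text
    # Accuracy is below the threshold: run the second recognition process
    # on the first text, reusing the parameters from the first pass.
    return second_pass(first_text, params)


# Toy stand-ins so the sketch runs end to end.
def toy_first_pass(speech: bytes) -> Tuple[str, float, Dict[str, float]]:
    return "recognize speech", 0.4, {"score": 0.4}


def toy_second_pass(text: str, params: Dict[str, float]) -> str:
    return text.upper()
```

With these toy passes, a threshold of 0.5 triggers the second pass, while a threshold of 0.3 keeps the first text.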
Abstract:
A method and apparatus for speech recognition are disclosed. The speech recognition apparatus includes a processor configured to process a received speech signal, generate a word sequence based on a phoneme sequence generated from the speech signal, generate a syllable sequence corresponding to a word element among words included in the word sequence based on the phoneme sequence, and determine a text corresponding to a recognition result of the speech signal based on the word sequence and the syllable sequence.
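One way to picture the final step, combining the word sequence with the syllable sequence, is sketched below. It assumes that the "word element" is a placeholder token for a word the word-level decoder could not resolve. The abstract does not state this, and the `UNK` token and `merge_sequences` function are hypothetical.

```python
from typing import List

UNK = "<unk>"  # hypothetical placeholder for an unresolved word element


def merge_sequences(word_seq: List[str], syllable_seqs: List[List[str]]) -> str:
    """Replace each placeholder with the next syllable sequence, joined into one word."""
    remaining = iter(syllable_seqs)
    out = []
    for word in word_seq:
        if word == UNK:
            syllables = next(remaining, None)
            # Keep the placeholder if no syllable sequence is available for it.
            out.append("".join(syllables) if syllables is not None else word)
        else:
            out.append(word)
    return " ".join(out)
```

For example, `merge_sequences(["call", UNK, "now"], [["a", "li", "ce"]])` yields `"call alice now"`.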
Abstract:
A machine translation method and a machine translation apparatus using a neural network model are provided. The machine translation apparatus extracts information associated with a keyword from a source sentence, obtains a supplement sentence associated with the source sentence based on the extracted information associated with the keyword, acquires a first vector value from the source sentence and a second vector value from the supplement sentence using neural network model-based encoders, and outputs a target sentence corresponding to a translation of the source sentence based on any one or any combination of the first vector value and the second vector value using a neural network model-based decoder.
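The data flow in this abstract (keyword extraction, supplement lookup, two encoders, one decoder) can be sketched with injected callables. Every name below is hypothetical, and the toy components only stand in for the neural network model-based encoders and decoder so that the sketch runs.

```python
from typing import Callable, List

Vector = List[float]


def translate(
    source: str,
    extract_keywords: Callable[[str], List[str]],
    find_supplement: Callable[[List[str]], str],
    encode_source: Callable[[str], Vector],
    encode_supplement: Callable[[str], Vector],
    decode: Callable[[Vector, Vector], str],
) -> str:
    """Wire the stages together in the order the abstract lists them."""
    keywords = extract_keywords(source)
    supplement = find_supplement(keywords)
    first_vector = encode_source(source)
    second_vector = encode_supplement(supplement)
    return decode(first_vector, second_vector)


# Toy stand-ins (assumptions, not real models).
def toy_keywords(sentence: str) -> List[str]:
    return [w for w in sentence.split() if w.istitle()]


def toy_supplement(keywords: List[str]) -> str:
    return f"{keywords[0]} is a city" if keywords else ""


def toy_encoder(sentence: str) -> Vector:
    return [float(len(sentence.split()))]  # one-dimensional "embedding": word count


def toy_decoder(v1: Vector, v2: Vector) -> str:
    return f"target(src={int(v1[0])}, sup={int(v2[0])})"
```

Passing the components in as arguments keeps the sketch neutral about how each stage is implemented. A decoder that uses only one of the two vectors fits the same signature.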
Abstract:
A method and apparatus for training a language model are provided. The method includes generating a first training feature vector sequence and a second training feature vector sequence from training data, performing forward estimation of a neural network based on the first training feature vector sequence, and performing backward estimation of the neural network based on the second training feature vector sequence. The language model is then trained based on a result of the forward estimation and a result of the backward estimation.
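A minimal sketch of the forward and backward estimation steps follows. It assumes the second sequence is the first in reverse order, as in a bidirectional recurrent network. The abstract does not say so, and the scalar features and fixed weights are simplifications chosen only to keep the example short. The training update itself is not shown.

```python
import math
from typing import List, Tuple


def run_rnn(seq: List[float], w_in: float, w_rec: float) -> List[float]:
    """Scalar tanh recurrence: h_t = tanh(w_in * x_t + w_rec * h_{t-1}), with h_0 = 0."""
    h = 0.0
    states = []
    for x in seq:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states


def bidirectional_states(features: List[float], w_in: float = 0.5,
                         w_rec: float = 0.1) -> List[Tuple[float, float]]:
    """Pair the forward and backward hidden states for each time step."""
    first_seq = list(features)             # assumed: original order
    second_seq = list(reversed(features))  # assumed: reversed order
    forward = run_rnn(first_seq, w_in, w_rec)
    # Reverse the backward states again so index t refers to the same input step.
    backward = list(reversed(run_rnn(second_seq, w_in, w_rec)))
    return list(zip(forward, backward))
```

A training step would then compute a loss from the paired states and update the weights, which corresponds to training on both estimation results.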
Abstract:
A multilevel speech recognition method and an apparatus performing the method are disclosed. The method includes receiving a first speech command from a user through a speech interface, and extracting a keyword from the first speech command. The method also includes providing a candidate application group of a category providing a service associated with the keyword, and processing a second speech command from the user associated with an application selected from the candidate application group.
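The two-level interaction can be sketched as below. The category table, the application names, and the function names are all hypothetical, and keyword extraction is reduced to a dictionary lookup on already-transcribed text.

```python
from typing import Dict, List, Optional

# Hypothetical mapping from keyword to the applications in that category.
CATEGORY_APPS: Dict[str, List[str]] = {
    "message": ["SMS", "Email", "Chat"],
    "music": ["Player", "Radio"],
}


def extract_keyword(command: str) -> Optional[str]:
    """Return the first word of the command that names a known category."""
    for word in command.lower().split():
        if word in CATEGORY_APPS:
            return word
    return None


def candidate_apps(first_command: str) -> List[str]:
    """First level: map the keyword to a candidate application group."""
    keyword = extract_keyword(first_command)
    return CATEGORY_APPS.get(keyword, []) if keyword else []


def handle_second_command(selected_app: str, second_command: str) -> str:
    """Second level: route the follow-up command to the selected application."""
    return f"{selected_app}: {second_command}"
```

For example, `candidate_apps("send a message")` returns the messaging group. After the user selects `"SMS"`, `handle_second_command("SMS", "tell Bob I am late")` routes the follow-up to that application.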