Abstract:
A speech recognition apparatus and method. The speech recognition apparatus includes one or more processors configured to reflect a final recognition result for a previous audio signal in a language model, generate a first recognition result of an audio signal, in a first linguistic recognition unit, by using an acoustic model, generate a second recognition result of the audio signal, in a second linguistic recognition unit, by using the language model reflecting the final recognition result for the previous audio signal, and generate a final recognition result for the audio signal in the second linguistic recognition unit based on the first recognition result and the second recognition result. The first linguistic recognition unit may be the same linguistic unit type as the second linguistic recognition unit or a different one.
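The flow described in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that both recognition results are word-level probability tables, that the language model is a small bigram table keyed by the previous final result, and that the two results are combined by a weighted sum of log-probabilities. The names `combine`, `recognize_sequence`, and `lm_weight` are hypothetical and do not come from the abstract.

```python
import math

def combine(acoustic_probs, lm_probs, lm_weight=0.5):
    """Pick the word with the best weighted sum of log-probabilities.

    Assumption: a log-linear combination; the abstract does not say how
    the first and second recognition results are combined.
    """
    scores = {}
    for word, p_acoustic in acoustic_probs.items():
        p_lm = lm_probs.get(word, 1e-6)  # small floor for words the LM has not seen
        scores[word] = (1 - lm_weight) * math.log(p_acoustic) + lm_weight * math.log(p_lm)
    return max(scores, key=scores.get)

def recognize_sequence(acoustic_steps, bigram_lm, start="<s>"):
    """Recognize a series of audio signals, feeding each final result back
    into the language model as context for the next signal."""
    previous = start
    results = []
    for acoustic_probs in acoustic_steps:
        lm_probs = bigram_lm.get(previous, {})  # LM reflects the previous final result
        final = combine(acoustic_probs, lm_probs)
        results.append(final)
        previous = final
    return results

# Toy data (hypothetical): the acoustic model slightly prefers "sea",
# but the language model, conditioned on "I", prefers "see".
acoustic_steps = [
    {"I": 0.9, "eye": 0.1},
    {"sea": 0.55, "see": 0.45},
]
bigram_lm = {
    "<s>": {"I": 0.6, "eye": 0.4},
    "I": {"see": 0.9, "sea": 0.1},
}
result = recognize_sequence(acoustic_steps, bigram_lm)
```

In this toy case the second signal is acoustically ambiguous, and the context carried over from the first final result decides it.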
Abstract:
Described are an apparatus and method for generating an acoustic model. The apparatus includes a processor configured to calculate a noise representation that represents noise data by using a noise model, and generate the acoustic model through training using training noisy speech data, which comprises speech data and the noise data, a string of phonemes corresponding to the speech data, and the noise representation.
Abstract:
A speech recognition apparatus includes a probability calculator configured to calculate phoneme probabilities of an audio signal using an acoustic model; a candidate set extractor configured to extract a candidate set from a recognition target list; and a result returner configured to return a recognition result of the audio signal based on the calculated phoneme probabilities and the extracted candidate set.
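One way to picture the three components is the sketch below. It makes several assumptions that the abstract does not state: phoneme probabilities are given per frame as dictionaries, the candidate set is chosen by edit distance to a greedy decoding, and the result is the candidate with the best monotonic alignment score. All function names and the toy data are hypothetical.

```python
import math

def collapse_argmax(frames):
    """Greedy decode: most probable phoneme per frame, repeats merged."""
    seq = []
    for probs in frames:
        best = max(probs, key=probs.get)
        if not seq or seq[-1] != best:
            seq.append(best)
    return seq

def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def extract_candidates(frames, target_list, max_distance=1):
    """Candidate set: targets close to the greedy decoding (assumed criterion)."""
    decoded = collapse_argmax(frames)
    return [w for w, ph in target_list.items()
            if edit_distance(decoded, ph) <= max_distance]

def alignment_score(frames, phonemes):
    """Best log-probability of assigning frames to the phonemes in order.

    Assumes a non-empty phoneme sequence; returns -inf if there are
    fewer frames than phonemes.
    """
    neg_inf = float("-inf")
    dp = [neg_inf] * len(phonemes)
    for t, probs in enumerate(frames):
        new = [neg_inf] * len(phonemes)
        for j, p in enumerate(phonemes):
            if t == 0:
                best_prev = 0.0 if j == 0 else neg_inf
            else:
                best_prev = max(dp[j], dp[j - 1] if j > 0 else neg_inf)
            if best_prev > neg_inf:
                new[j] = best_prev + math.log(probs.get(p, 1e-10))
        dp = new
    return dp[-1]

def recognize(frames, target_list):
    """Return the best-scoring candidate, or None if no candidate qualifies."""
    candidates = extract_candidates(frames, target_list)
    if not candidates:
        return None
    return max(candidates, key=lambda w: alignment_score(frames, target_list[w]))

# Toy data (hypothetical phoneme probabilities and recognition target list).
frames = [
    {"k": 0.9, "b": 0.1},
    {"ae": 0.8, "k": 0.2},
    {"ae": 0.7, "t": 0.3},
    {"t": 0.6, "p": 0.4},
]
target_list = {"cat": ["k", "ae", "t"], "cap": ["k", "ae", "p"], "dog": ["d", "ao", "g"]}
best = recognize(frames, target_list)
```

Restricting scoring to a small candidate set keeps the more expensive alignment step away from most of the target list.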
Abstract:
Provided are a method and an apparatus for speech recognition, and a method and an apparatus for training a transformation parameter. A speech recognition apparatus includes an acoustic score calculator configured to use an acoustic model to calculate an acoustic score of a speech input, an acoustic score transformer configured to transform the calculated acoustic score into an acoustic score corresponding to standard pronunciation by using a transformation parameter, and a decoder configured to decode the transformed acoustic score to output a recognition result of the speech input.
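The score transformation step might look like the following sketch. The abstract does not specify the form of the transformation parameter; an affine transform (a weight matrix plus a bias vector) applied to each frame's score vector is assumed here only for illustration, and the decoder is reduced to a per-frame argmax. The names `transform_scores`, `decode`, `weight`, and `bias` are hypothetical.

```python
def transform_scores(frame_scores, weight, bias):
    """Apply an assumed affine transform: weight @ frame_scores + bias."""
    return [
        sum(w * s for w, s in zip(row, frame_scores)) + b
        for row, b in zip(weight, bias)
    ]

def decode(frames, phonemes, weight, bias):
    """Simplified decoder: transform each frame, then take the top phoneme."""
    output = []
    for frame_scores in frames:
        transformed = transform_scores(frame_scores, weight, bias)
        best_index = max(range(len(transformed)), key=transformed.__getitem__)
        output.append(phonemes[best_index])
    return output

# Toy example (hypothetical values): a speaker whose "e" sounds close to "a".
phonemes = ["a", "e"]
weight = [[1.0, -0.5],
          [0.0, 1.5]]
bias = [0.0, 0.0]
frames = [[0.55, 0.45]]  # raw scores slightly favor "a"

transformed = transform_scores(frames[0], weight, bias)
decoded = decode(frames, phonemes, weight, bias)
```

With these toy values the raw scores favor "a", while the transformed scores favor "e", which is the kind of correction toward standard pronunciation the abstract describes.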
Abstract:
A neural network training apparatus includes a primary trainer configured to perform a primary training of a neural network model based on clean training data and target data corresponding to the clean training data; and a secondary trainer configured to perform a secondary training of the neural network model on which the primary training has been performed based on noisy training data and an output probability distribution of an output class for the clean training data calculated during the primary training of the neural network model.
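The objective of the secondary training can be sketched as a loss computation. The sketch assumes that the output probability distribution from the primary training is used as a soft target and compared, via cross-entropy, with the model's output on the noisy version of the same input. The abstract does not name the loss function, so cross-entropy and the function names below are assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy between a target distribution and a predicted one."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, predicted_probs))

def secondary_loss(clean_logits, noisy_logits):
    """Loss for the secondary training step (assumed form).

    clean_logits: outputs for the clean sample, from the primary training.
    noisy_logits: current outputs for the noisy version of the same sample.
    """
    soft_targets = softmax(clean_logits)
    predicted = softmax(noisy_logits)
    return cross_entropy(soft_targets, predicted)

# Toy logits (hypothetical): the loss is smaller when the noisy-input output
# matches the distribution obtained on clean data.
matching = secondary_loss([2.0, 0.0, -1.0], [2.0, 0.0, -1.0])
mismatched = secondary_loss([2.0, 0.0, -1.0], [-1.0, 0.0, 2.0])
```

Minimizing this quantity pushes the model's behavior on noisy input toward the distribution it produced on clean input.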
Abstract:
A speech recognition apparatus and method. The speech recognition apparatus includes a first recognizer configured to generate a first recognition result of an audio signal, in a first linguistic recognition unit, by using an acoustic model, a second recognizer configured to generate a second recognition result of the audio signal, in a second linguistic recognition unit, by using a language model, and a combiner configured to combine the first recognition result and the second recognition result to generate a final recognition result in the second linguistic recognition unit and to reflect the final recognition result in the language model. The first linguistic recognition unit may be a same linguistic unit type as the second linguistic recognition unit. The first recognizer and the second recognizer are configured in a same neural network and simultaneously/collectively trained in the neural network using audio training data provided to the first recognizer.
Abstract:
A method and apparatus for speech recognition and for generating a speech recognition engine, and a speech recognition engine, are provided. The method of speech recognition involves receiving a speech input, transmitting the speech input to a speech recognition engine, and receiving a speech recognition result from the speech recognition engine, in which the speech recognition engine obtains a phoneme sequence from the speech input and provides the speech recognition result based on a phonetic distance of the phoneme sequence.
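A phonetic distance can be sketched as a weighted edit distance between phoneme sequences. This is one possible reading; the abstract does not define the distance. The table of similar phoneme pairs, the 0.5 substitution cost, and the function names are hypothetical values chosen for illustration.

```python
# Hypothetical table of phoneme pairs treated as acoustically similar.
SIMILAR = {frozenset(("b", "p")), frozenset(("d", "t")), frozenset(("g", "k"))}

def substitution_cost(x, y):
    """0 for identical phonemes, 0.5 for similar ones, 1 otherwise (assumed costs)."""
    if x == y:
        return 0.0
    return 0.5 if frozenset((x, y)) in SIMILAR else 1.0

def phonetic_distance(a, b):
    """Weighted edit distance between two phoneme sequences."""
    prev = [float(j) for j in range(len(b) + 1)]
    for i, x in enumerate(a, 1):
        cur = [float(i)]
        for j, y in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1.0,                         # deletion
                cur[j - 1] + 1.0,                      # insertion
                prev[j - 1] + substitution_cost(x, y)  # substitution or match
            ))
        prev = cur
    return prev[-1]

def closest_entry(phonemes, lexicon):
    """Return the lexicon word whose phonemes are nearest to the input."""
    return min(lexicon, key=lambda w: phonetic_distance(phonemes, lexicon[w]))

# Toy lexicon (hypothetical): the recognized sequence "b ae t" is closer
# to "pat" than to "cat" because b and p are treated as similar.
lexicon = {"pat": ["p", "ae", "t"], "cat": ["k", "ae", "t"]}
word = closest_entry(["b", "ae", "t"], lexicon)
```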