Abstract:
Provided is a method for interpretation and translation performed by a user's interpretation and translation apparatus through interfacing with an interpretation and translation apparatus of the other party. The method includes: automatically setting a translation target language that enables communication with the other party, based on a message received from the other party's apparatus over a network communication connection; receiving input information in the use language of the user; calling a translator corresponding to the translation target language and transmitting the result of translating the input information into the translation target language to the other party's apparatus; and outputting data received from the other party's apparatus, or outputting the result of translating the received data into the use language of the user by using the translator.
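The automatic target-language setting described above can be sketched as a simple handshake: each apparatus advertises its user's language, and the translation target is set from the other party's message. The message fields and function name below are illustrative assumptions, not taken from the patent.

```python
# Hypothetical sketch of the target-language negotiation step. The
# "use_language" field and message shape are assumptions for illustration.

def negotiate_target_language(own_language: str, peer_message: dict) -> str:
    """Set the translation target from the other party's hello message."""
    peer_language = peer_message.get("use_language", "en")
    # If both users share a language, no translation is needed.
    if peer_language == own_language:
        return own_language
    return peer_language

hello = {"device_id": "peer-01", "use_language": "fr"}
print(negotiate_target_language("ko", hello))  # fr
```

A deployed system would exchange such messages when the network connection is established, before any utterance is translated.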
Abstract:
A speech recognition apparatus and method are provided, the method including converting an input signal to acoustic model data, dividing the acoustic model data into a speech model group and a non-speech model group and calculating a first maximum likelihood corresponding to the speech model group and a second maximum likelihood corresponding to the non-speech model group, detecting speech based on a likelihood ratio (LR) between the first maximum likelihood and the second maximum likelihood, obtaining utterance stop information based on output data of a decoder and dividing the input signal into a plurality of speech intervals based on the utterance stop information, calculating a confidence score for each of the plurality of speech intervals based on information on a prior probability distribution of the acoustic model data, and removing any speech interval whose confidence score is lower than a threshold.
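The two decision steps above, the likelihood-ratio speech test and the confidence-based interval filtering, can be sketched as follows. Real acoustic models are HMM/GMM or neural networks; here each model is reduced to a single log-likelihood score per frame, and all thresholds are illustrative assumptions.

```python
# Minimal sketch of likelihood-ratio voice activity detection and
# confidence filtering. Scores and thresholds are illustrative.

def frame_is_speech(speech_scores, nonspeech_scores, lr_threshold=0.0):
    """Likelihood-ratio test on one frame: speech wins when the best
    speech-model log-likelihood beats the best non-speech-model
    log-likelihood by more than the threshold."""
    return max(speech_scores) - max(nonspeech_scores) > lr_threshold

def filter_intervals(intervals, confidences, min_confidence=0.5):
    """Drop speech intervals whose confidence score falls below threshold."""
    return [iv for iv, c in zip(intervals, confidences) if c >= min_confidence]

print(frame_is_speech([-2.0, -1.0], [-4.0, -3.0]))       # True
print(filter_intervals([(0, 10), (10, 20)], [0.9, 0.2]))  # [(0, 10)]
```

Working in log-likelihoods turns the ratio test into a simple subtraction, which is how such tests are usually implemented in practice.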
Abstract:
Provided are an automatic interpretation system and method for generating a synthetic sound having characteristics similar to those of an original speaker's voice. The automatic interpretation system includes a speech recognition module configured to generate text data by performing speech recognition on an original speech signal of an original speaker and to extract at least one piece of characteristic information from among pitch information, vocal intensity information, speech speed information, and vocal tract characteristic information of the original speech; an automatic translation module configured to generate a synthesis-target translation by translating the text data; and a speech synthesis module configured to generate a synthetic sound of the synthesis-target translation.
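The three-module pipeline can be sketched with stubs standing in for the recognizer, translator, and synthesizer; the point is that the extracted speaker characteristics bypass the translation step and condition the synthesis. All function bodies and the trait fields below are illustrative assumptions.

```python
# Hedged sketch of the recognize -> translate -> synthesize pipeline,
# with speaker characteristics carried around the translation step.
# All stubs are illustrative; a real system wraps actual models.

def recognize_and_extract(signal):
    # Stub recognition: return text plus extracted speaker characteristics.
    return signal["text"], {"pitch_hz": signal["pitch_hz"],
                            "speed_wpm": signal["speed_wpm"]}

def translate(text):
    # Stub dictionary translation standing in for the translation module.
    return {"hello": "bonjour"}.get(text, text)

def synthesize(text, traits):
    # Stub synthesis: a real synthesizer would condition on the traits.
    return {"text": text, **traits}

def interpret(signal):
    text, traits = recognize_and_extract(signal)
    return synthesize(translate(text), traits)

result = interpret({"text": "hello", "pitch_hz": 180.0, "speed_wpm": 140})
print(result["text"])  # bonjour
```

Note that the traits flow directly from recognition to synthesis, so the synthetic translation keeps the original speaker's pitch and speed even though the text itself has changed language.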
Abstract:
Disclosed is an apparatus for speech recognition and automatic translation operated on a PC or a mobile device. The apparatus for speech recognition according to the present invention includes a display unit that displays, to a user, a screen for selecting a domain, that is, a unit of speech recognition coverage sorted in advance for speech recognition; a user input unit that receives a domain selection from the user; and a communication unit that transmits the user's selection information for the domain. According to the present invention, an apparatus for speech recognition with an intuitive and simple user interface is provided, enabling the user to easily select or correct the designated domain of a speech recognition system and improving the accuracy and performance of speech recognition and automatic translation performed by the system for the designated domain.
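The client-side flow, showing pre-sorted domains, taking the user's choice, and building the message the communication unit transmits, can be sketched as below. The domain names and message format are illustrative assumptions.

```python
# Hypothetical sketch of the domain-selection step. Domain list and
# message fields are assumptions for illustration only.

DOMAINS = ["travel", "medical", "business", "daily"]

def domain_selection_message(choice_index: int) -> dict:
    """Build the message transmitted after the user picks a domain
    on the selection screen."""
    if not 0 <= choice_index < len(DOMAINS):
        raise ValueError("unknown domain index")
    return {"type": "set_domain", "domain": DOMAINS[choice_index]}

print(domain_selection_message(1))
# {'type': 'set_domain', 'domain': 'medical'}
```

Constraining recognition to one domain shrinks the effective vocabulary and language model, which is what yields the accuracy improvement the abstract claims.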
Abstract:
Provided is a method performed by an automatic interpretation server based on a zero user interface (UI), which communicates with a plurality of terminal devices having a microphone function, a speaker function, a communication function, and a wearable function. The method includes connecting terminal devices disposed within a designated automatic interpretation zone, receiving a voice signal of a first user from a first terminal device among the terminal devices within the automatic interpretation zone, matching a plurality of users located within a speech-receivable distance of the first terminal device, and performing automatic interpretation on the voice signal and transmitting the results of the automatic interpretation to a second terminal device of at least one second user corresponding to a result of the matching.
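The matching step can be sketched as a proximity query over terminal positions inside the interpretation zone. The coordinates and the distance threshold are hypothetical; a deployed system might instead infer proximity from radio signal strength or audio correlation.

```python
import math

# Illustrative sketch of matching terminals within the first terminal's
# speech-receivable distance. Positions and threshold are assumptions.

def match_terminals(first_pos, others, receivable_m=3.0):
    """Return ids of terminals within `receivable_m` meters of first_pos."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    return [tid for tid, pos in others.items()
            if dist(first_pos, pos) <= receivable_m]

peers = {"t2": (1.0, 1.0), "t3": (10.0, 0.0)}
print(match_terminals((0.0, 0.0), peers))  # ['t2']
```

Only the matched second terminals receive the interpretation result, so users outside speech range are never sent translations of conversations they could not hear.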
Abstract:
Provided are an apparatus and method for providing a personal assistant service based on automatic translation. The apparatus includes an input section configured to receive a command of a user, a memory in which a program for providing a personal assistant service according to the command is stored, and a processor configured to execute the program. The processor updates at least one of a speech recognition model, an automatic interpretation model, and an automatic translation model on the basis of the intention of the user's command, determined from a recognition result of the command, and provides the personal assistant service on the basis of an automatic translation call.
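The selective update rule, bumping only the models relevant to the recognized intent, can be sketched as a dispatch table. The intent labels, model names, and version counters below are illustrative assumptions standing in for actual model retraining or adaptation.

```python
# Sketch of intent-driven model updating. The counter increment is a
# stand-in for a real update step; names are illustrative.

def update_models(models: dict, intent: str) -> dict:
    """Return a copy of `models` with version counters bumped for the
    models relevant to the recognized intent."""
    targets = {
        "dictation": ["speech_recognition"],
        "interpret": ["speech_recognition", "interpretation"],
        "translate": ["translation"],
    }
    updated = dict(models)
    for name in targets.get(intent, []):
        updated[name] += 1  # stand-in for an actual adaptation step
    return updated

versions = {"speech_recognition": 1, "interpretation": 1, "translation": 1}
print(update_models(versions, "interpret"))
# {'speech_recognition': 2, 'interpretation': 2, 'translation': 1}
```

Updating only the models the intent actually exercises keeps adaptation cheap and avoids perturbing models unrelated to the user's command.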
Abstract:
An automatic translation device includes a communications module transmitting and receiving data to and from an ear-set device including a speaker, a first microphone, and a second microphone; a memory storing a program that generates a translation result using a dual-channel audio signal; and a processor executing the program stored in the memory. When the program is executed, the processor compares a first audio signal including a voice signal of a user, received using the first microphone, with a second audio signal including a noise signal and the voice signal of the user, received using the second microphone, and entirely or selectively extracts the voice signal of the user from the first and second audio signals, based on a result of the comparison, to perform automatic translation.
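One plausible reading of the dual-channel comparison, sketched here over per-frame energies rather than raw waveforms, is that frames where the voice-dominated channel agrees with the noisy channel are kept entirely, while noise-dominated frames are suppressed. The energy representation and the ratio threshold are assumptions for illustration.

```python
# Sketch of dual-channel voice extraction over frame energies.
# Channel 1 (in-ear mic) carries mostly the user's voice; channel 2
# (outer mic) adds ambient noise. Threshold is illustrative.

def extract_user_voice(ch1_frames, ch2_frames, ratio_threshold=0.5):
    """Keep a frame when channel 1 energy is a large enough fraction of
    channel 2 energy, i.e., the user's voice dominates the noise."""
    out = []
    for e1, e2 in zip(ch1_frames, ch2_frames):
        if e2 == 0 or e1 / e2 >= ratio_threshold:
            out.append(e1)   # voice-dominated frame: keep as-is
        else:
            out.append(0.0)  # noise-dominated frame: suppress
    return out

print(extract_user_voice([0.9, 0.1, 0.8], [1.0, 1.0, 1.0]))
# [0.9, 0.0, 0.8]
```

Feeding only voice-dominated frames into recognition is what lets the device perform automatic translation reliably in noisy surroundings.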
Abstract:
A voice signal processing apparatus includes: an input unit which receives a voice signal of a user; a detecting unit which detects an auxiliary signal; and a signal processing unit which transmits the voice signal to an external terminal in a first operation mode and transmits the voice signal and the auxiliary signal to the external terminal using the same or different protocols in a second operation mode.
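The two operation modes can be sketched as a small packet builder: mode one sends only the voice signal, while mode two sends the voice and auxiliary signals, possibly over different protocols. The mode numbers and protocol names are illustrative assumptions.

```python
# Hypothetical sketch of the mode-dependent transmission. Protocol
# names are placeholders, not from the patent.

def build_packets(mode: int, voice, auxiliary=None):
    """Return (protocol, payload) pairs to transmit for the given mode."""
    if mode == 1:
        return [("voice_proto", voice)]
    if mode == 2:
        return [("voice_proto", voice), ("aux_proto", auxiliary)]
    raise ValueError("unknown operation mode")

print(build_packets(2, "voice-frame", "aux-frame"))
# [('voice_proto', 'voice-frame'), ('aux_proto', 'aux-frame')]
```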
Abstract:
Provided is a method of performing automatic interpretation based on speaker separation by a user terminal, the method including: receiving, from an automatic interpretation service providing terminal, a first speech signal including at least one of a speech of the user and a surrounding speech around the user; separating the first speech signal into speaker-specific speech signals; performing interpretation on the speaker-specific speech signals into a language selected by the user on the basis of an interpretation mode; and providing a second speech signal generated as a result of the interpretation to at least one of a counterpart terminal and the automatic interpretation service providing terminal according to the interpretation mode.
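The mode-dependent delivery step can be sketched as a routing table that maps an interpretation mode to the set of destination terminals. The mode names and destination labels are illustrative assumptions.

```python
# Hypothetical sketch of routing the interpreted signal by mode.
# Mode and destination names are placeholders for illustration.

def route_result(mode: str, result):
    """Map an interpretation mode to destination terminals for `result`."""
    destinations = {
        "conversation": ["counterpart"],        # speak to the other party
        "listening": ["service_terminal"],      # hear surroundings translated
        "both": ["counterpart", "service_terminal"],
    }
    return {dest: result for dest in destinations.get(mode, [])}

print(route_result("conversation", "translated-speech"))
# {'counterpart': 'translated-speech'}
```

Because the first signal is separated per speaker before interpretation, each routed result corresponds to one speaker's utterance rather than a mixture.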
Abstract:
Disclosed are a Zero User Interface (UI)-based automatic speech translation system and method. The system and method can solve problems such as the procedural inconvenience of inputting speech signals and the malfunction of speech recognition due to crosstalk when users who speak different languages have a face-to-face conversation. The system includes an automatic speech translation server configured to select a speech signal of a speaker from among multiple speech signals received from user terminals connected to an automatic speech translation service and to transmit a result of translating the speech signal of the speaker into a target language, a speaker terminal configured to receive the speech signal of the speaker and transmit it to the automatic speech translation server, and a counterpart terminal configured to output the result of the translation in the form of text or voice in the target language.
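The server-side speaker selection under crosstalk can be sketched as attributing the utterance to the terminal with the strongest capture, on the assumption that the speaker's own terminal records the loudest signal. A real system would likely use more robust cues (signal-to-noise ratio, cross-correlation, or speaker diarization); the energy heuristic and names below are illustrative.

```python
# Minimal sketch of crosstalk resolution by signal energy. All user
# terminals upload what they captured; the server picks the strongest.

def select_speaker(signals: dict) -> str:
    """signals maps terminal id -> list of sample amplitudes; return the
    id whose capture has the highest energy."""
    def energy(samples):
        return sum(x * x for x in samples)
    return max(signals, key=lambda tid: energy(signals[tid]))

uploads = {"alice": [0.9, -0.8, 0.7], "bob": [0.2, -0.1, 0.1]}
print(select_speaker(uploads))  # alice
```

Selecting a single source signal before recognition is what prevents the crosstalk malfunctions the abstract mentions: the other terminals' weaker captures of the same utterance are simply discarded.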