Abstract:
The Zero User Interface (UI)-based automatic speech translation system and method can solve problems such as the procedural inconvenience of inputting speech signals and the malfunction of speech recognition due to crosstalk when users who speak difference languages have a face-to-face conversation. The system includes an automatic speech translation server, speaker terminals and a counterpart terminal. The automatic speech translation server selects a speech signal of a speaker among multiple speech signals received from speaker terminals connected to an automatic speech translation service and transmits a result of translating the speech signal of the speaker into a target language to a counterpart terminal.
Abstract:
Provided are an apparatus and method for selecting a speaker by using smart glasses. The apparatus includes a camera configured to capture a front angle video of a user and track guest interpretation interlocutors in the captured video, smart glasses configured to display a virtual space map image including the guest interpretation interlocutors tracked through the camera, a gaze-tracking camera configured to select a target person for interpretation by tracking a gaze of the user so that a guest interpretation interlocutor displayed in the video may be selected, and an interpretation target processor configured to provide an interpretation service in connection with the target person selected through the gaze-tracking camera.
Abstract:
A voice signal processing apparatus includes: an input unit which receives a voice signal of a user; a detecting unit which detects an auxiliary signal; and a signal processing unit which transmits the voice signal to an external terminal in a first operation mode and transmits the voice signal and the auxiliary signal to the external terminal using the same or different protocols in a second operation mode.
Abstract:
A speech recognition method capable of automatic generation of phones according to the present invention includes: unsupervisedly learning a feature vector of speech data; generating a phone set by clustering acoustic features selected based on an unsupervised learning result; allocating a sequence of phones to the speech data on the basis of the generated phone set; and generating an acoustic model on the basis of the sequence of phones and the speech data to which the sequence of phones is allocated.
Abstract:
Disclosed is an apparatus for speech recognition and automatic translation operated in a PC or a mobile device. The apparatus for speech recognition according to the present invention includes a display unit that displays a screen for selecting a domain as a unit for a speech recognition region previously sorted for speech recognition to a user; a user input unit that receives a selection of a domain from the user; and a communication unit that transmits the user selection information for the domain. According to the present invention, the apparatus for speech recognition using an intuitive and simple user interface is provided to a user to enable the user to easily select/correct a designation domain of a speech recognition system and improve accuracy and performance of speech recognition and automatic translation by the designated system for speech recognition.
Abstract:
The present invention relates to a voice recognition system for replacing a specific domain, a mobile device, and a method thereof, and more particularly, to a technology that divides a search space for voice recognition into a general domain search space and a specific domain search space.A voice recognition system according to an exemplary embodiment of the present invention includes: a mobile terminal receiving a voice recognition target word from a user; and a voice recognition server dividing a search space for voice recognition into a general domain search space and a specific domain search space and storing the spaces, and performing voice recognition for the voice recognition target word through linkage of the general domain search space and the specific domain search space.