-
Publication No.: US20230377558A1
Publication date: 2023-11-23
Application No.: US18319946
Filing date: 2023-05-18
Inventor: Min Kyu LEE , Sanghun KIM , Seung YUN , Jeonguk BANG , Namhyeong KIM
IPC: G10L13/027 , G06V40/16 , G06F40/58 , G10L15/00
CPC classification number: G10L13/027 , G06V40/161 , G06F40/58 , G10L15/005
Abstract: The present invention relates to an automatic interpretation method and system for converting only the voice of a speaker into a target language. By jointly using the voice and image information input to a smart device, the present invention may significantly improve the performance of automatic interpretation with a foreign conversation partner, even in a high-noise environment in which multiple speakers talk at the same time. In addition, the present invention may determine a situation based on text information and image information around a user, and supply that situation information, together with multimodal information, to an interpretation engine in real time. The present invention may also significantly improve the user convenience of an automatic interpretation system by augmenting and displaying an interpreted sentence directly next to the speaker's image, or by generating a synthesized sound that distinguishes the interpreted sentence from other speech.
-
Publication No.: US20240212681A1
Publication date: 2024-06-27
Application No.: US18498241
Filing date: 2023-10-31
Inventor: Min Kyu LEE , Seung Hi KIM , Sanghun KIM , Jeonguk BANG , Seung YUN
IPC: G10L15/22 , G06V40/16 , G10L13/02 , G10L17/00 , H04N23/611
CPC classification number: G10L15/22 , G06V40/172 , G10L13/02 , G10L17/00 , H04N23/611
Abstract: A voice recognition device having a barge-in function and a method thereof are proposed. In an exemplary embodiment, there are disclosed an intelligent robot and a method for operating the intelligent robot, including an input unit for receiving a user's voice data, one or more processors, and an output unit for outputting a response generated on the basis of the user's voice data, wherein the processors generate the response corresponding to the user's voice data while maintaining a listening mode for identifying a dialogue partner by using the user's face image data and the user's voice data, and perform a speaking mode that controls the robot to perform an operation corresponding to the response.
-
Publication No.: US20240221742A1
Publication date: 2024-07-04
Application No.: US18488333
Filing date: 2023-10-17
Inventor: Seung YUN , Seung Hi KIM , Sanghun KIM , Jeonguk BANG , Min Kyu LEE
Abstract: A method of generating a sympathetic back-channel signal is provided. The method includes receiving a voice signal from a user; determining, at a predetermined timing while the voice signal is being input, whether that timing is one at which a back-channel signal should be output; if so, storing the voice signal that has been input so far; determining back-channel signal information based on the stored voice signal; and outputting the determined back-channel signal information.
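The steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class `BackChannelGenerator`, the toy pause detector `is_pause`, the toy selector `pick_response`, the `check_every` interval, and all thresholds are hypothetical names and values chosen only to show the control flow (receive, check timing, store, determine, output).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


def is_pause(samples: List[float]) -> bool:
    """Toy timing detector (assumption): a quiet tail suggests a pause."""
    tail = samples[-4:]
    if not tail:
        return False
    return sum(abs(s) for s in tail) / len(tail) < 0.1


def pick_response(stored: List[float]) -> str:
    """Toy selector (assumption): longer input gets a fuller response."""
    return "mm-hm" if len(stored) < 16 else "I see"


@dataclass
class BackChannelGenerator:
    is_backchannel_timing: Callable[[List[float]], bool]
    choose_backchannel: Callable[[List[float]], str]
    check_every: int = 2  # hypothetical: check timing every N frames
    _received: List[float] = field(default_factory=list, init=False)
    _frames: int = field(default=0, init=False)

    def push(self, frame: List[float]) -> Optional[str]:
        """Receive one frame; return a back-channel response or None."""
        self._received.extend(frame)
        self._frames += 1
        # Only evaluate at the predetermined timing.
        if self._frames % self.check_every != 0:
            return None
        # Is this a moment at which a back-channel should be output?
        if not self.is_backchannel_timing(self._received):
            return None
        # Store the voice signal received so far, then decide the response.
        stored = list(self._received)
        return self.choose_backchannel(stored)
```

In this sketch, a caller would feed audio frames to `push` as they arrive and play whatever string it returns; a real system would presumably replace both toy functions with trained models.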
-
Publication No.: US20230290360A1
Publication date: 2023-09-14
Application No.: US18085889
Filing date: 2022-12-21
Inventor: Seung YUN , Jeonguk BANG , Min Kyu LEE , Sanghun KIM
Abstract: An apparatus for improving context-based automatic interpretation performance includes: an uttered voice input unit configured to receive a voice signal from a user; a previous sentence input unit configured to determine, when the voice signal is input through the uttered voice input unit, whether a previous utterance by the user exists; a voice encoding processing unit configured to decode only the voice signal received through the uttered voice input unit when it is determined that there is no previous utterance, and to extract a vector of the voice signal when it is determined that there is a previous utterance; a context encoding processing unit configured to extract a context vector from the previous utterance, when one exists, and transmit the extracted context vector; and an interpretation decoding processing unit configured to output an interpretation result text.
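The branching between the units in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the apparatus itself: the function `interpret` and the toy stand-ins `toy_encode_voice`, `toy_encode_context`, and `toy_decode` are hypothetical, and the sketch encodes the voice signal in both branches for simplicity, whereas the abstract describes the no-context path as decoding the voice signal alone.

```python
from typing import Callable, List, Optional, Sequence

Vector = List[float]


def interpret(
    voice: Sequence[float],
    previous_utterance: Optional[str],
    encode_voice: Callable[[Sequence[float]], Vector],
    encode_context: Callable[[str], Vector],
    decode: Callable[[Vector, Optional[Vector]], str],
) -> str:
    """Route one utterance through the voice-only or context-aware path."""
    voice_vec = encode_voice(voice)
    if previous_utterance is None:
        # No previous utterance: interpret from the voice signal alone.
        return decode(voice_vec, None)
    # Previous utterance exists: add a context vector for the decoder.
    context_vec = encode_context(previous_utterance)
    return decode(voice_vec, context_vec)


# Toy stand-ins (assumptions) so the sketch runs without any models.
def toy_encode_voice(voice: Sequence[float]) -> Vector:
    return [float(sum(voice))]


def toy_encode_context(text: str) -> Vector:
    return [float(len(text))]


def toy_decode(voice_vec: Vector, context_vec: Optional[Vector]) -> str:
    if context_vec is None:
        return "voice-only"
    return "with-context"
```

Passing the encoders and decoder as parameters keeps the routing logic separate from whatever models a real system would use.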
-