Publication No.: US20230377558A1
Publication date: 2023-11-23
Application No.: US18319946
Filing date: 2023-05-18
Inventor: Min kyu LEE, Sanghun KIM, Seung YUN, Jeonguk BANG, Namhyeong KIM
IPC: G10L13/027, G06V40/16, G06F40/58, G10L15/00
CPC classification number: G10L13/027, G06V40/161, G06F40/58, G10L15/005
Abstract: The present invention relates to an automatic interpretation method and system that converts only a speaker's voice into a target language. By jointly using the voice and image information input to a smart device, the invention may significantly improve the performance of automatic interpretation with a foreign conversation partner, even in a high-noise environment in which multiple speakers talk at the same time. The invention may also determine the situation from text and image information around the user, and feed that situation information, together with the multimodal information, to the interpretation engine in real time. In addition, the invention may significantly improve the user convenience of an automatic interpretation system by augmenting and displaying the interpreted sentence directly next to the speaker's image, or by generating a synthesized sound in which the interpreted sentence is distinguished from other speech.