Patent search ap:("Electronics AND Telecommunications Research Institute") AND inv:"Dong Hyun Kim" Page 1

1.

Granted invention patent
System and method for automatic speech translation based on zero user interface (Active)

Publication number: US11977855B2

Publication date: 2024-05-07

Application number: US17522218

Application date: 2021-11-09

Applicant: Electronics and Telecommunications Research Institute

Inventor: Sang Hun Kim , Seung Yun , Min Kyu Lee , Joon Gyu Maeng , Dong Hyun Kim

IPC: G06F40/58 , G10L17/02 , G10L17/06 , G10L17/18 , G10L21/0232 , G10L21/0316 , G10L21/04 , G10L25/06 , G10L25/18 , G10L25/21 , G10L25/78

CPC classification number: G06F40/58 , G10L17/02 , G10L17/06 , G10L17/18 , G10L21/0232 , G10L21/0316 , G10L21/04 , G10L25/06 , G10L25/18 , G10L25/21 , G10L25/78

Abstract: The Zero User Interface (UI)-based automatic speech translation system and method can solve problems such as the procedural inconvenience of inputting speech signals and the malfunction of speech recognition due to crosstalk when users who speak different languages have a face-to-face conversation. The system includes an automatic speech translation server, speaker terminals and a counterpart terminal. The automatic speech translation server selects a speech signal of a speaker among multiple speech signals received from speaker terminals connected to an automatic speech translation service and transmits a result of translating the speech signal of the speaker into a target language to a counterpart terminal.
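
Illustrative sketch (not taken from the patent): the selection step in this abstract could look roughly like the Python below, which picks, from the signals received from several terminals, the one with the highest RMS energy. The abstract does not state the selection criterion, so the energy comparison is an assumption, and the names `rms` and `select_speaker_signal` are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square energy of one signal (a list of floats)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def select_speaker_signal(signals):
    """Return the terminal id whose signal has the highest RMS energy.

    `signals` maps a terminal id to its samples. Using energy as the
    criterion is an assumption; the abstract does not name one.
    """
    if not signals:
        raise ValueError("no signals received")
    return max(signals, key=lambda terminal_id: rms(signals[terminal_id]))
```

A real system would likely need something more robust against crosstalk than a raw energy comparison; the sketch only shows where such a decision sits in the flow.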

2.

Granted invention patent
Apparatus and method for selecting speaker by using smart glasses (Active)

Publication number: US10796106B2

Publication date: 2020-10-06

Application number: US16114388

Application date: 2018-08-28

Applicant: Electronics and Telecommunications Research Institute

Inventor: Dong Hyun Kim , Young Jik Lee , Sang Hun Kim

IPC: G06F40/58 , G10L13/033 , G02B27/01 , G10L13/04

Abstract: Provided are an apparatus and method for selecting a speaker by using smart glasses. The apparatus includes a camera configured to capture a front angle video of a user and track guest interpretation interlocutors in the captured video, smart glasses configured to display a virtual space map image including the guest interpretation interlocutors tracked through the camera, a gaze-tracking camera configured to select a target person for interpretation by tracking a gaze of the user so that a guest interpretation interlocutor displayed in the video may be selected, and an interpretation target processor configured to provide an interpretation service in connection with the target person selected through the gaze-tracking camera.
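
Illustrative sketch (not taken from the patent): one simple way to read the gaze-based selection described above is a hit test of the gaze point against the bounding boxes of the tracked interlocutors. The box representation, the `TrackedPerson` type and the `select_target` function are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TrackedPerson:
    """A tracked interlocutor with a bounding box in image coordinates (hypothetical)."""
    person_id: str
    x: float
    y: float
    width: float
    height: float


def select_target(people, gaze_x, gaze_y):
    """Return the id of the first person whose box contains the gaze point, else None."""
    for p in people:
        inside_x = p.x <= gaze_x <= p.x + p.width
        inside_y = p.y <= gaze_y <= p.y + p.height
        if inside_x and inside_y:
            return p.person_id
    return None
```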

3.

Granted invention patent
Apparatus and method for processing voice signal and terminal (Active)

Publication number: US10298736B2

Publication date: 2019-05-21

Application number: US15202912

Application date: 2016-07-06

Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE

Inventor: Min Kyu Lee , Sang Hun Kim , Young Ik Kim , Dong Hyun Kim , Mu Yeol Choi

IPC: H04M1/60 , G10L21/02 , H04M1/725 , G10L15/20 , G10L25/78 , G10L15/30

Abstract: A voice signal processing apparatus includes: an input unit which receives a voice signal of a user; a detecting unit which detects an auxiliary signal; and a signal processing unit which transmits the voice signal to an external terminal in a first operation mode and transmits the voice signal and the auxiliary signal to the external terminal using the same or different protocols in a second operation mode.

4.

Granted invention patent
Speech recognition system and method (Active)

Publication number: US10249294B2

Publication date: 2019-04-02

Application number: US15646302

Application date: 2017-07-11

Applicant: Electronics and Telecommunications Research Institute

Inventor: Dong Hyun Kim , Young Jik Lee , Sang Hun Kim , Seung Hi Kim , Min Kyu Lee , Mu Yeol Choi

IPC: G10L15/14 , G10L17/04 , G10L15/065 , G10L15/06 , G10L15/08

Abstract: A speech recognition method capable of automatic generation of phones according to the present invention includes: unsupervisedly learning a feature vector of speech data; generating a phone set by clustering acoustic features selected based on an unsupervised learning result; allocating a sequence of phones to the speech data on the basis of the generated phone set; and generating an acoustic model on the basis of the sequence of phones and the speech data to which the sequence of phones is allocated.
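
Illustrative sketch (not taken from the patent): the "cluster features into a phone set, then allocate a phone sequence" steps can be pictured with plain k-means in Python. The abstract does not name the clustering algorithm or the feature representation, so k-means, the deterministic seeding, and the names `kmeans` and `allocate_phones` are assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, vec):
    """Index of the centroid closest to `vec`."""
    return min(range(len(centroids)), key=lambda i: squared_distance(centroids[i], vec))


def kmeans(vectors, k, iterations=10):
    """Plain k-means; the first k vectors seed the centroids (deterministic)."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(centroids, v)].append(v)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def allocate_phones(frames, centroids):
    """Label each frame with its nearest cluster, then merge repeated labels."""
    labels = [f"p{nearest(centroids, f)}" for f in frames]
    return [lab for i, lab in enumerate(labels) if i == 0 or lab != labels[i - 1]]
```

Here each cluster index stands in for one automatically generated phone; training an acoustic model on the resulting phone sequences is outside the scope of the sketch.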

5.

Granted invention patent
Apparatus for speech recognition using multiple acoustic model and method thereof (Active)

Publication number: US09378742B2

Publication date: 2016-06-28

Application number: US13845941

Application date: 2013-03-18

Applicant: Electronics and Telecommunications Research Institute

Inventor: Dong Hyun Kim

IPC: G10L15/32 , G10L15/065

CPC classification number: G10L15/32 , G10L15/065

Abstract: Disclosed are an apparatus for recognizing voice using multiple acoustic models according to the present invention and a method thereof. An apparatus for recognizing voice using multiple acoustic models includes a voice data database (DB) configured to store voice data collected in various noise environments; a model generating means configured to perform classification for each speaker and environment based on the collected voice data, and to generate an acoustic model of a binary tree structure as the classification result; and a voice recognizing means configured to extract feature data of voice data when the voice data is received from a user, to select multiple models from the generated acoustic model based on the extracted feature data, to recognize the voice data in parallel based on the selected multiple models, and to output a word string corresponding to the voice data as the recognition result.
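
Illustrative sketch (not taken from the patent): the Python below shows one possible reading of "select multiple models from a binary tree, then recognize in parallel". It assumes each tree node carries a model with a representative centroid, that the selected models are those on the path from the root to the closest leaf, and that a caller-supplied `decode(model, features)` returns a `(score, word_string)` pair. None of these details come from the abstract.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelNode:
    """One acoustic model in the binary tree (hypothetical structure)."""
    name: str
    centroid: list
    left: Optional["ModelNode"] = None
    right: Optional["ModelNode"] = None


def distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_models(root, features):
    """Collect the models on the path from the root to the closest leaf."""
    path, node = [], root
    while node is not None:
        path.append(node)
        children = [c for c in (node.left, node.right) if c is not None]
        if children:
            node = min(children, key=lambda c: distance(c.centroid, features))
        else:
            node = None
    return path


def recognize_parallel(models, features, decode):
    """Run `decode` once per model concurrently and keep the best-scoring result."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda m: decode(m, features), models))
    return max(results, key=lambda r: r[0])
```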

6.

Invention application
APPARATUS AND METHOD FOR SELECTING SPEAKER BY USING SMART GLASSES (Pending, published)

Publication number: US20190188265A1

Publication date: 2019-06-20

Application number: US16114388

Application date: 2018-08-28

Applicant: Electronics and Telecommunications Research Institute

Inventor: Dong Hyun Kim , Young Jik Lee , Sang Hun Kim

IPC: G06F17/28 , G10L13/04 , G10L13/033

Abstract: Provided are an apparatus and method for selecting a speaker by using smart glasses. The apparatus includes a camera configured to capture a front angle video of a user and track guest interpretation interlocutors in the captured video, smart glasses configured to display a virtual space map image including the guest interpretation interlocutors tracked through the camera, a gaze-tracking camera configured to select a target person for interpretation by tracking a gaze of the user so that a guest interpretation interlocutor displayed in the video may be selected, and an interpretation target processor configured to provide an interpretation service in connection with the target person selected through the gaze-tracking camera.

7.

Granted invention patent
Terminal and server of speaker-adaptation speech-recognition system and method for operating the system (Active)

Publication number: US09530403B2

Publication date: 2016-12-27

Application number: US14709359

Application date: 2015-05-11

Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE

Inventor: Dong Hyun Kim

IPC: G10L15/02 , G10L15/07 , G10L15/30 , G10L15/22

CPC classification number: G10L15/07 , G10L15/30 , G10L2015/221

Abstract: Provided are a terminal and server of a speaker-adaptation speech-recognition system and a method for operating the system. The terminal in the speaker-adaptation speech-recognition system includes a speech recorder which transmits speech data of a speaker to a speech-recognition server, a statistical variable accumulator which receives a statistical variable including acoustic statistical information about speech of the speaker from the speech-recognition server which recognizes the transmitted speech data, and accumulates the received statistical variable, a conversion parameter generator which generates a conversion parameter about the speech of the speaker using the accumulated statistical variable and transmits the generated conversion parameter to the speech-recognition server, and a result displaying user interface which receives and displays result data when the speech-recognition server recognizes the speech data of the speaker using the transmitted conversion parameter and transmits the recognized result data.
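
Illustrative sketch (not taken from the patent): the terminal-side "accumulate statistics, then generate a conversion parameter" loop might be pictured as below. The abstract does not say what the statistical variable or the conversion parameter contain, so this sketch assumes the server sends a frame count plus summed feature offsets per utterance, and that the conversion parameter is a simple bias vector. The class and function names are hypothetical.

```python
class StatisticAccumulator:
    """Accumulates per-utterance statistics received from the server."""

    def __init__(self, dim):
        self.count = 0
        self.offset_sum = [0.0] * dim

    def add(self, frame_count, offset_sum):
        """Add one utterance's frame count and summed (observed - model mean) offsets."""
        self.count += frame_count
        self.offset_sum = [a + b for a, b in zip(self.offset_sum, offset_sum)]

    def conversion_parameter(self):
        """Average offset so far, used here as a bias-only conversion parameter."""
        if self.count == 0:
            return [0.0] * len(self.offset_sum)
        return [s / self.count for s in self.offset_sum]


def apply_conversion(features, bias):
    """Shift a feature vector toward the model space by subtracting the bias."""
    return [f - b for f, b in zip(features, bias)]
```

Keeping the accumulation on the terminal, as the abstract describes, means the server only needs the compact conversion parameter rather than a per-speaker history.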
