Apparatus and method for selecting speaker by using smart glasses

    Publication No.: US10796106B2

    Publication date: 2020-10-06

    Application No.: US16114388

    Application date: 2018-08-28

    Abstract: Provided are an apparatus and method for selecting a speaker by using smart glasses. The apparatus includes a camera configured to capture a front angle video of a user and track guest interpretation interlocutors in the captured video, smart glasses configured to display a virtual space map image including the guest interpretation interlocutors tracked through the camera, a gaze-tracking camera configured to select a target person for interpretation by tracking a gaze of the user so that a guest interpretation interlocutor displayed in the video may be selected, and an interpretation target processor configured to provide an interpretation service in connection with the target person selected through the gaze-tracking camera.

    5. Apparatus for speech recognition using multiple acoustic model and method thereof
    Granted invention patent (in force)

    Publication No.: US09378742B2

    Publication date: 2016-06-28

    Application No.: US13845941

    Application date: 2013-03-18

    Inventor: Dong Hyun Kim

    CPC classification number: G10L15/32 G10L15/065

    Abstract: Disclosed are an apparatus for recognizing voice using multiple acoustic models according to the present invention and a method thereof. An apparatus for recognizing voice using multiple acoustic models includes a voice data database (DB) configured to store voice data collected in various noise environments; a model generating means configured to perform classification for each speaker and environment based on the collected voice data, and to generate an acoustic model of a binary tree structure as the classification result; and a voice recognizing means configured to extract feature data of voice data when the voice data is received from a user, to select multiple models from the generated acoustic model based on the extracted feature data, to parallel recognize the voice data based on the selected multiple models, and to output a word string corresponding to the voice data as the recognition result.

    APPARATUS AND METHOD FOR SELECTING SPEAKER BY USING SMART GLASSES

    Publication No.: US20190188265A1

    Publication date: 2019-06-20

    Application No.: US16114388

    Application date: 2018-08-28

    Abstract: Provided are an apparatus and method for selecting a speaker by using smart glasses. The apparatus includes a camera configured to capture a front angle video of a user and track guest interpretation interlocutors in the captured video, smart glasses configured to display a virtual space map image including the guest interpretation interlocutors tracked through the camera, a gaze-tracking camera configured to select a target person for interpretation by tracking a gaze of the user so that a guest interpretation interlocutor displayed in the video may be selected, and an interpretation target processor configured to provide an interpretation service in connection with the target person selected through the gaze-tracking camera.

    7. Terminal and server of speaker-adaptation speech-recognition system and method for operating the system
    Granted invention patent (in force)

    Publication No.: US09530403B2

    Publication date: 2016-12-27

    Application No.: US14709359

    Application date: 2015-05-11

    Inventor: Dong Hyun Kim

    CPC classification number: G10L15/07 G10L15/30 G10L2015/221

    Abstract: Provided are a terminal and server of a speaker-adaptation speech-recognition system and a method for operating the system. The terminal in the speaker-adaptation speech-recognition system includes a speech recorder which transmits speech data of a speaker to a speech-recognition server, a statistical variable accumulator which receives a statistical variable including acoustic statistical information about speech of the speaker from the speech-recognition server which recognizes the transmitted speech data, and accumulates the received statistical variable, a conversion parameter generator which generates a conversion parameter about the speech of the speaker using the accumulated statistical variable and transmits the generated conversion parameter to the speech-recognition server, and a result displaying user interface which receives and displays result data when the speech-recognition server recognizes the speech data of the speaker using the transmitted conversion parameter and transmits the recognized result data.
