Patent search ap:("Electronics AND Telecommunications Research Institute") AND inv:"Jeong Uk BANG"

1.

Invention application
SYSTEM, USER TERMINAL, AND METHOD FOR PROVIDING AUTOMATIC INTERPRETATION SERVICE BASED ON SPEAKER SEPARATION (status: in force)

Publication No.: US20220215857A1

Publication date: 2022-07-07

Application No.: US17531316

Filing date: 2021-11-19

Applicant: Electronics and Telecommunications Research Institute

Inventor: Jeong Uk BANG, Seung YUN, Sang Hun KIM, Min Kyu LEE, Joon Gyu MAENG

IPC: G10L25/84, G10L15/02, G10L15/08

Abstract: Provided is a method of performing automatic interpretation based on speaker separation by a user terminal, the method including: receiving a first speech signal including at least one of a user speech of a user and a user surrounding speech around the user from an automatic interpretation service providing terminal, separating the first speech signal into speaker-specific speech signals, performing interpretation on the speaker-specific speech signals in a language selected by the user on the basis of an interpretation mode, and providing a second speech signal generated as a result of the interpretation to at least one of a counterpart terminal and the automatic interpretation service providing terminal according to the interpretation mode.
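
The control flow described in this abstract (separate the mixed signal by speaker, interpret each speaker's signal, route the result according to the interpretation mode) can be sketched roughly as follows. This is an illustration only, not the patented implementation: the mode names, the destination labels, and the stub `separate` and `interpret` callables are all hypothetical, and text chunks stand in for audio.

```python
from enum import Enum


class Mode(Enum):
    # Hypothetical mode names; the abstract does not name the modes.
    TO_COUNTERPART = "to_counterpart"
    TO_SERVICE_TERMINAL = "to_service_terminal"


def interpret_by_speaker(first_signal, separate, interpret, language, mode):
    """Separate, interpret per speaker, then pick a destination by mode."""
    per_speaker = separate(first_signal)  # speaker-specific signals
    results = {spk: interpret(sig, language) for spk, sig in per_speaker.items()}
    # Hypothetical routing rule: one destination per mode.
    if mode is Mode.TO_COUNTERPART:
        destination = "counterpart_terminal"
    else:
        destination = "service_terminal"
    return {destination: results}


def stub_separate(signal):
    """Stand-in for speaker separation: group tagged chunks by speaker."""
    grouped = {}
    for speaker, chunk in signal:
        grouped.setdefault(speaker, []).append(chunk)
    return grouped


def stub_interpret(chunks, language):
    """Stand-in for interpretation: tag the joined text with the language."""
    return f"[{language}] " + " ".join(chunks)


# Text chunks tagged with a speaker stand in for a mixed speech signal.
mixed = [("user", "hello"), ("other", "good"), ("user", "there"), ("other", "morning")]
routed = interpret_by_speaker(mixed, stub_separate, stub_interpret, "ko", Mode.TO_COUNTERPART)
print(routed)
```

In a real system the two stubs would be replaced by a speaker separation model and a speech interpretation engine; the sketch only shows how the stages connect.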

2.

Invention application
METHOD AND APPARATUS FOR IMPROVING PERFORMANCE OF ARTIFICIAL INTELLIGENCE MODEL USING SPEECH RECOGNITION RESULTS AS TEXT INPUT (status: in force)

Publication No.: US20240420682A1

Publication date: 2024-12-19

Application No.: US18585204

Filing date: 2024-02-23

Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE

Inventor: Seung Hi KIM, Jeong Uk BANG, Seung YUN

IPC: G10L15/06, G10L15/26

Abstract: The present disclosure relates to a method and device for improving the performance of an AI model that uses voice recognition results as text input. A method of training an AI model according to an embodiment of the present disclosure may include: generating first time information on a plurality of words included in a voice and transcription, using a first learning sample including the voice and the transcription; generating second time information by adding a pre-configured delay time to the first time information; generating a modified transcription based on an end time of a last word among the plurality of words and the second time information; and performing training of the AI model based on a second training sample including the voice and the modified transcription.
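
One possible reading of the steps in this abstract is sketched below. The abstract does not state the exact rule for building the modified transcription, so the rule used here is an assumption: a word is kept only if its delayed end time does not exceed the original end time of the last word. The `WordTiming` type and the function names are hypothetical, and the word timings are made-up sample values.

```python
from dataclasses import dataclass


@dataclass
class WordTiming:
    word: str
    start: float  # seconds
    end: float    # seconds


def delay_timings(timings, delay):
    """Second time information: first time information plus a fixed delay."""
    return [WordTiming(t.word, t.start + delay, t.end + delay) for t in timings]


def modified_transcription(timings, delay):
    """Assumed rule (not stated in the abstract): keep words whose delayed
    end time is no later than the original end time of the last word."""
    if not timings:
        return ""
    cutoff = timings[-1].end
    delayed = delay_timings(timings, delay)
    return " ".join(t.word for t in delayed if t.end <= cutoff)


# Made-up first time information for a four-word utterance.
first = [
    WordTiming("turn", 0.0, 0.3),
    WordTiming("on", 0.3, 0.5),
    WordTiming("the", 0.5, 0.6),
    WordTiming("light", 0.6, 1.0),
]
print(modified_transcription(first, 0.3))
```

Under this assumed rule, a larger delay drops more trailing words, so the modified transcription resembles what a delayed recognizer would have produced by the time the audio ends. The pair of voice and modified transcription would then form the second training sample.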
