-
Publication No.: US20250149020A1
Publication Date: 2025-05-08
Application No.: US18396025
Filing Date: 2023-12-26
Applicant: Korea Electronics Technology Institute
Inventor: Young Han LEE , Tae Woo KIM , Choong Sang CHO
IPC: G10L13/027 , G10L13/08
Abstract: There is provided a training dataset construction method for speech synthesis through fusion of language, speaker, and emotion within an utterance. A training dataset construction method of a speech synthesis model according to an embodiment collects speech data having different speech utterance information, augments the speech data by fusing the collected speech data within one utterance, and generates a training dataset by using the augmented speech data. Accordingly, a training dataset for speech synthesis is constructed through fusion of language, speaker, and emotion within one utterance, so that the quality of multi-speaker/multi-language/multi-emotion speech synthesis can be enhanced.
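As a rough illustration of the fusion idea described in the abstract, the following Python sketch concatenates speech segments that differ in speaker, language, or emotion into a single utterance to enlarge a training corpus. The class and function names, the pause insertion, and the pairing rule are illustrative assumptions, not the patented procedure.

```python
# Minimal sketch (assumed names and fusion rule, not the patented method):
# fuse segments with different speaker/language/emotion labels into one utterance.
from dataclasses import dataclass
from itertools import permutations
import numpy as np

@dataclass
class SpeechSegment:
    waveform: np.ndarray   # mono audio samples
    text: str
    speaker: str
    language: str
    emotion: str

def fuse_within_utterance(segments, sample_rate=22050, pause_ms=150):
    """Concatenate segments into one utterance, separated by short silent pauses."""
    pause = np.zeros(int(sample_rate * pause_ms / 1000), dtype=np.float32)
    pieces, labels = [], []
    for seg in segments:
        pieces.extend([seg.waveform.astype(np.float32), pause])
        labels.append((seg.text, seg.speaker, seg.language, seg.emotion))
    return np.concatenate(pieces[:-1]), labels  # drop the trailing pause

def build_augmented_dataset(corpus):
    """Enlarge the corpus by fusing every ordered pair of segments whose
    speaker/language/emotion attributes differ."""
    return [fuse_within_utterance([a, b])
            for a, b in permutations(corpus, 2)
            if (a.speaker, a.language, a.emotion) != (b.speaker, b.language, b.emotion)]
```

Each fused utterance keeps the per-segment labels, so a downstream dataset builder can pair the concatenated audio with its multi-attribute annotation.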
-
Publication No.: US20250149023A1
Publication Date: 2025-05-08
Application No.: US18390216
Filing Date: 2023-12-20
Applicant: Korea Electronics Technology Institute
Inventor: Tae Woo KIM , Choong Sang CHO , Young Han LEE
IPC: G10L13/10
Abstract: There is provided a speech synthesis system and method with an adjustable utterance length. The speech synthesis method according to an embodiment predicts the duration of each phoneme corresponding to a speech mask from the speech mask and the text to be synthesized with it, encodes the text to be synthesized and extracts a text sequence expressed as feature information of the text, generates a speech frame sequence by regulating the length of each phoneme of the text sequence according to the predicted duration of each phoneme corresponding to the speech mask, and synthesizes a speech from the generated speech frame sequence. Accordingly, the length of the speech to be synthesized can be freely regulated as a user desires by regulating the length of the speech mask.
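The duration-based length regulation described above can be illustrated with a generic, FastSpeech-style length regulator. The sketch below assumes phoneme-level features and predicted frame counts, and uses a scale factor to stand in for regulating the speech-mask length; it is a simplified illustration, not the claimed system.

```python
# Generic length-regulator sketch (assumed interface, not the patented architecture):
# expand each phoneme encoding into its predicted number of speech frames.
import numpy as np

def length_regulate(text_sequence: np.ndarray, durations: np.ndarray, scale: float = 1.0):
    """
    text_sequence: (num_phonemes, feat_dim) phoneme-level features from a text encoder
    durations:     (num_phonemes,) predicted frame counts per phoneme
    scale:         stretches/compresses the total length, analogous to changing
                   the speech-mask length
    """
    frame_counts = np.maximum(np.round(durations * scale).astype(int), 1)
    frames = [np.repeat(text_sequence[i][None, :], n, axis=0)
              for i, n in enumerate(frame_counts)]
    return np.concatenate(frames, axis=0)   # (total_frames, feat_dim)

# Example: 4 phonemes with 8-dim features, synthesized 1.5x longer
phonemes = np.random.randn(4, 8).astype(np.float32)
durations = np.array([3.0, 5.0, 2.0, 4.0])
speech_frame_sequence = length_regulate(phonemes, durations, scale=1.5)
print(speech_frame_sequence.shape)  # (21, 8) with these durations
```

The resulting frame sequence would then be fed to a decoder/vocoder to synthesize the waveform at the regulated length.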
-
Publication No.: US20250095341A1
Publication Date: 2025-03-20
Application No.: US18741942
Filing Date: 2024-06-13
Applicant: Korea Electronics Technology Institute
Inventor: Choong Sang CHO , Young Han LEE , Gui Sik KIM , Tae Woo KIM
IPC: G06V10/776 , G06V10/22 , G06V10/74 , G06V10/764 , G06V10/77
Abstract: There are provided a method and a system for acquiring visual explanation information independently of the purpose, type, and structure of a visual intelligence model. The visual explanation information acquisition system of the visual intelligence model according to an embodiment may input, to a deep learning-based visual intelligence model, N transformed images generated by diversifying an input image and may acquire the output results, may generate attributes of the visual intelligence model from the acquired results, may derive, from losses of the visual intelligence model calculated from the generated attributes, basic data for generating a visual explanation map that visually explains the result derivation rationale of the visual intelligence model, and may generate the visual explanation map from the derived basic data. Accordingly, visual explanation information may be acquired from various visual intelligence models through one system, independently of the purpose, type, and structure of the visual intelligence model.
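For intuition only, the sketch below uses a generic perturbation-based scheme in the spirit of RISE: it diversifies the input image with random masks, queries the model on each masked copy, and aggregates the masks into a saliency map. The masking and score weighting are assumptions and do not reproduce the patented attribute-and-loss formulation.

```python
# Model-agnostic explanation sketch (assumed masking/weighting, not the patented method).
import numpy as np

def explanation_map(model_fn, image, target_class, n_masks=500, grid=7, p_keep=0.5, seed=0):
    """
    model_fn: callable mapping an (H, W, C) float image to a class-probability vector.
    Generates n_masks randomly masked ("diversified") copies of the image, queries the
    model on each, and accumulates the masks weighted by the target-class score.
    """
    rng = np.random.default_rng(seed)
    h, w, _ = image.shape
    saliency = np.zeros((h, w), dtype=np.float64)
    total = 0.0
    for _ in range(n_masks):
        coarse = (rng.random((grid, grid)) < p_keep).astype(np.float64)
        # Upsample the coarse grid to image resolution (nearest neighbour).
        mask = np.kron(coarse, np.ones((h // grid + 1, w // grid + 1)))[:h, :w]
        score = float(model_fn(image * mask[..., None])[target_class])
        saliency += score * mask
        total += score
    return saliency / max(total, 1e-8)
```

Because the model is only queried through `model_fn`, such a scheme works without access to the model's internals, which is the same model-agnostic property the abstract emphasizes.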
-