-
Publication No.: US20250149020A1
Publication Date: 2025-05-08
Application No.: US18396025
Filing Date: 2023-12-26
Applicant: Korea Electronics Technology Institute
Inventor: Young Han LEE , Tae Woo KIM , Choong Sang CHO
IPC: G10L13/027 , G10L13/08
Abstract: There is provided a training dataset construction method for speech synthesis through fusion of language, speaker, and emotion within an utterance. A training dataset construction method of a speech synthesis model according to an embodiment collects speech data having different speech utterance information, augments the speech data by fusing the collected speech data within one utterance, and generates a training dataset by using the augmented speech data. Accordingly, a training dataset for speech synthesis is constructed through fusion of language, speaker, and emotion within one utterance, so that the quality of multi-speaker/multi-language/multi-emotion speech synthesis can be enhanced.
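As a rough illustration of the fusion idea described in the abstract, the following Python sketch concatenates speech segments that differ in speaker, language, or emotion into a single utterance to enlarge a training corpus. The class and function names, the pause insertion, and the pairing rule are illustrative assumptions, not the patented procedure.

```python
# Minimal sketch (assumed names and fusion rule, not the patented method):
# fuse segments with different speaker/language/emotion labels into one utterance.
from dataclasses import dataclass
from itertools import permutations
import numpy as np

@dataclass
class SpeechSegment:
    waveform: np.ndarray   # mono audio samples
    text: str
    speaker: str
    language: str
    emotion: str

def fuse_within_utterance(segments, sample_rate=22050, pause_ms=150):
    """Concatenate segments into one utterance, separated by short silent pauses."""
    pause = np.zeros(int(sample_rate * pause_ms / 1000), dtype=np.float32)
    pieces, labels = [], []
    for seg in segments:
        pieces.extend([seg.waveform.astype(np.float32), pause])
        labels.append((seg.text, seg.speaker, seg.language, seg.emotion))
    return np.concatenate(pieces[:-1]), labels  # drop the trailing pause

def build_augmented_dataset(corpus):
    """Enlarge the corpus by fusing every ordered pair of segments whose
    speaker/language/emotion attributes differ."""
    return [fuse_within_utterance([a, b])
            for a, b in permutations(corpus, 2)
            if (a.speaker, a.language, a.emotion) != (b.speaker, b.language, b.emotion)]
```

Each fused utterance keeps the per-segment labels, so a downstream dataset builder can pair the concatenated audio with its multi-attribute annotation.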
-
Publication No.: US20250149023A1
Publication Date: 2025-05-08
Application No.: US18390216
Filing Date: 2023-12-20
Applicant: Korea Electronics Technology Institute
Inventor: Tae Woo KIM , Choong Sang CHO , Young Han LEE
IPC: G10L13/10
Abstract: There is provided a speech synthesis system and method with an adjustable utterance length. The speech synthesis method according to an embodiment predicts the duration of each phoneme corresponding to a speech mask from the speech mask and the text to be synthesized with it, encodes the text to be synthesized and extracts a text sequence expressed as feature information of the text, generates a speech frame sequence by regulating the length of each phoneme of the text sequence according to the predicted duration of each phoneme corresponding to the speech mask, and synthesizes a speech from the generated speech frame sequence. Accordingly, the length of the speech to be synthesized can be freely regulated as a user desires by regulating the length of the speech mask.
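The duration-based length regulation described above can be illustrated with a generic, FastSpeech-style length regulator. The sketch below assumes phoneme-level features and predicted frame counts, and uses a scale factor to stand in for regulating the speech-mask length; it is a simplified illustration, not the claimed system.

```python
# Generic length-regulator sketch (assumed interface, not the patented architecture):
# expand each phoneme encoding into its predicted number of speech frames.
import numpy as np

def length_regulate(text_sequence: np.ndarray, durations: np.ndarray, scale: float = 1.0):
    """
    text_sequence: (num_phonemes, feat_dim) phoneme-level features from a text encoder
    durations:     (num_phonemes,) predicted frame counts per phoneme
    scale:         stretches/compresses the total length, analogous to changing
                   the speech-mask length
    """
    frame_counts = np.maximum(np.round(durations * scale).astype(int), 1)
    frames = [np.repeat(text_sequence[i][None, :], n, axis=0)
              for i, n in enumerate(frame_counts)]
    return np.concatenate(frames, axis=0)   # (total_frames, feat_dim)

# Example: 4 phonemes with 8-dim features, synthesized 1.5x longer
phonemes = np.random.randn(4, 8).astype(np.float32)
durations = np.array([3.0, 5.0, 2.0, 4.0])
speech_frame_sequence = length_regulate(phonemes, durations, scale=1.5)
print(speech_frame_sequence.shape)  # (21, 8) with these durations
```

The resulting frame sequence would then be fed to a decoder/vocoder to synthesize the waveform at the regulated length.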
-
Publication No.: US20250095341A1
Publication Date: 2025-03-20
Application No.: US18741942
Filing Date: 2024-06-13
Applicant: Korea Electronics Technology Institute
Inventor: Choong Sang CHO , Young Han LEE , Gui Sik KIM , Tae Woo KIM
IPC: G06V10/776 , G06V10/22 , G06V10/74 , G06V10/764 , G06V10/77
Abstract: There are provided a method and a system for acquiring visual explanation information independently of the purpose, type, and structure of a visual intelligence model. The visual explanation information acquisition system of the visual intelligence model according to an embodiment may input, to a deep learning-based visual intelligence model, N transformed images generated by diversifying an input image and may acquire the output results, may generate attributes of the visual intelligence model from the acquired results, may derive, from losses of the visual intelligence model calculated from the generated attributes, basic data for generating a visual explanation map that visually explains the result derivation rationale of the visual intelligence model, and may generate the visual explanation map from the derived basic data. Accordingly, visual explanation information may be acquired from various visual intelligence models through one system, independently of the purpose, type, and structure of the visual intelligence model.
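For intuition only, the sketch below uses a generic perturbation-based scheme in the spirit of RISE: it diversifies the input image with random masks, queries the model on each masked copy, and aggregates the masks into a saliency map. The masking and score weighting are assumptions and do not reproduce the patented attribute-and-loss formulation.

```python
# Model-agnostic explanation sketch (assumed masking/weighting, not the patented method).
import numpy as np

def explanation_map(model_fn, image, target_class, n_masks=500, grid=7, p_keep=0.5, seed=0):
    """
    model_fn: callable mapping an (H, W, C) float image to a class-probability vector.
    Generates n_masks randomly masked ("diversified") copies of the image, queries the
    model on each, and accumulates the masks weighted by the target-class score.
    """
    rng = np.random.default_rng(seed)
    h, w, _ = image.shape
    saliency = np.zeros((h, w), dtype=np.float64)
    total = 0.0
    for _ in range(n_masks):
        coarse = (rng.random((grid, grid)) < p_keep).astype(np.float64)
        # Upsample the coarse grid to image resolution (nearest neighbour).
        mask = np.kron(coarse, np.ones((h // grid + 1, w // grid + 1)))[:h, :w]
        score = float(model_fn(image * mask[..., None])[target_class])
        saliency += score * mask
        total += score
    return saliency / max(total, 1e-8)
```

Because the model is only queried through `model_fn`, such a scheme works without access to the model's internals, which is the same model-agnostic property the abstract emphasizes.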
-