Patent search ap:("Electronics AND Telecommunications Research Institute") AND inv:"In Seon JANG"

1.

Invention publication

METHOD AND APPARATUS FOR ENCODING AND DECODING AUDIO SIGNAL USING COMPLEX POLAR QUANTIZER

Status: Under examination (published)

Publication (announcement) No.: US20240153513A1

Publication (announcement) date: 2024-05-09

Application No.: US18502648

Application date: 2023-11-06

Applicant: Electronics and Telecommunications Research Institute

Inventor: Byeong Ho CHO, Seung Kwon BEACK, Jong Mo SUNG, Tae Jin LEE, Woo Taek LIM, In Seon JANG

IPC: G10L19/035

CPC classification number: G10L19/035

Abstract: A complex number quantization-based audio signal encoding method may comprise: estimating a scale factor for each subband of an input audio signal; performing complex magnitude scaling for each subband based on the scale factor; and performing polar quantization on a complex frequency coefficient for each subband, wherein the performing the polar quantization for each subband comprises applying two or more different magnitude quantization techniques based on the magnitude of the complex frequency coefficient scaled for each subband.
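
The steps named in this abstract (per-subband scale factor, magnitude scaling, polar quantization with more than one magnitude rule) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration and is not taken from the patent: the RMS scale factor, the threshold of 1.0, the uniform and logarithmic step sizes, the 16 phase levels, and all function names.

```python
import cmath
import math


def estimate_scale(coeffs):
    """Assumed scale factor: RMS magnitude of the subband (hypothetical choice)."""
    rms = math.sqrt(sum(abs(c) ** 2 for c in coeffs) / len(coeffs))
    return rms if rms > 0 else 1.0


def quantize_magnitude(mag, threshold=1.0, step=0.25, log_step=0.5):
    """Two illustrative magnitude rules, selected by the scaled magnitude.

    Below `threshold` a uniform grid is used; at or above it a logarithmic
    grid anchored at `threshold` is used. Both rules are assumptions.
    """
    if mag < threshold:
        return round(mag / step) * step
    k = round(math.log2(mag / threshold) / log_step)
    return threshold * 2 ** (k * log_step)


def quantize_phase(phase, levels=16):
    """Uniform phase quantization onto `levels` points around the circle."""
    step = 2 * math.pi / levels
    return round(phase / step) * step


def polar_quantize_subband(coeffs, scale):
    """Scale each complex coefficient, quantize magnitude and phase, rescale."""
    out = []
    for c in coeffs:
        scaled = c / scale
        mag = quantize_magnitude(abs(scaled))
        ph = quantize_phase(cmath.phase(scaled))
        out.append(cmath.rect(mag, ph) * scale)
    return out


subband = [0.3 + 0.1j, 2.0 - 1.5j, -0.05 + 0.02j]
print(polar_quantize_subband(subband, estimate_scale(subband)))
```

A real codec would transmit quantization indices rather than reconstructed values; the sketch returns reconstructed coefficients only to keep the round trip visible.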

2.

Invention publication

AUDIO SIGNAL GENERATION MODEL AND TRAINING METHOD USING GENERATIVE ADVERSARIAL NETWORK

Status: Under examination (published)

Publication (announcement) No.: US20230267950A1

Publication (announcement) date: 2023-08-24

Application No.: US18097062

Application date: 2023-01-13

Applicant: Electronics and Telecommunications Research Institute, Industry-Academic Cooperation Foundation, Yonsei University

Inventor: In Seon JANG, Seung Kwon BEACK, Jong Mo SUNG, Tae Jin LEE, Woo Taek LIM, Byeong Ho CHO, Hong Goo KANG, Ji Hyun LEE, Chan Woo LEE, Hyung Seob LIM

IPC: G10L25/51, G10L25/30

CPC classification number: G10L25/51, G10L25/30

Abstract: A generative adversarial network-based audio signal generation model for generating a high quality audio signal may comprise: a generator generating an audio signal with an external input; a harmonic-percussive separation model separating the generated audio signal into a harmonic component signal and a percussive component signal; and at least one discriminator evaluating whether each of the harmonic component signal and the percussive component signal is real or fake.

3.

Invention publication

AUDIO SIGNAL COMPRESSION METHOD AND APPARATUS USING DEEP NEURAL NETWORK-BASED MULTILAYER STRUCTURE AND TRAINING METHOD THEREOF

Status: Under examination (published)

Publication (announcement) No.: US20230267940A1

Publication (announcement) date: 2023-08-24

Application No.: US18097054

Application date: 2023-01-13

Applicant: Electronics and Telecommunications Research Institute, Industry-Academic Cooperation Foundation, Yonsei University

Inventor: In Seon JANG, Seung Kwon BEACK, Jong Mo SUNG, Tae Jin LEE, Woo Taek LIM, Byeong Ho CHO, Hong Goo KANG, Ji Hyun LEE, Chan Woo LEE, Hyung Seob LIM

IPC: G10L19/038, G10L19/002

CPC classification number: G10L19/038, G10L19/002

Abstract: A method, executed by a processor for compressing an audio signal in multiple layers, may comprise: (a) restoring, in a highest layer, an input audio signal as a first signal; (b) restoring, in at least one intermediate layer, a signal obtained by subtracting an upsampled signal, which is obtained by upsampling the audio signal restored in the highest layer or an immediately previous intermediate layer, from the input audio signal as a second signal; and (c) restoring, in a lowest layer, a signal obtained by subtracting an upsampled signal, which is obtained by upsampling the audio signal restored in an intermediate layer immediately before the lowest layer, from the input audio signal as a third signal, wherein the first signal, the second signal, and the third signal are combined to output a final restoration audio signal.

4.

Invention publication

METHOD FOR ENCODING AND DECODING AUDIO SIGNAL USING NORMALIZING FLOW, AND TRAINING METHOD THEREOF

Status: Under examination (published)

Publication (announcement) No.: US20230298603A1

Publication (announcement) date: 2023-09-21

Application No.: US18150126

Application date: 2023-01-04

Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE

Inventor: In Seon JANG, Seung Kwon BEACK, Jong Mo SUNG, Tae Jin LEE, Woo Taek LIM, Byeong Ho CHO

IPC: G10L19/032, G10L25/30, G10L19/04, G06N7/01

CPC classification number: G10L19/032, G10L25/30, G10L19/04, G06N7/01

Abstract: A method for encoding an input signal using N flow blocks (N is a natural number greater than or equal to 2) and (N−1) split block(s), which is performed by a processor, may comprise: transmitting, by a k-th flow block (k is a natural number greater than or equal to 1 and less than or equal to N−1) among the N flow blocks, a k-th transformation signal obtained by transforming a received signal into a latent representation to a k-th split block among the (N−1) split block(s); splitting, by the k-th split block, the k-th transformation signal by a predetermined ratio, into a first split signal and a second split signal; transmitting, by the k-th split block, the first split signal to a (k+1)-th flow block; and quantizing a signal transformed by an N-th flow block and the second split signals using a quantization block.
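
The data flow in this abstract (N flow blocks chained through N−1 split blocks, with the held-out halves and the final output quantized together) can be traced with a small Python sketch. The flow block here is a placeholder affine map, not a trained normalizing flow, and the split ratio, quantization step, and all function names are assumptions made for illustration.

```python
def flow_block(signal, k):
    """Placeholder for the k-th flow block: an invertible affine map x -> 2x + k.

    A real normalizing flow would be a learned invertible network; this
    stand-in only keeps the routing of signals easy to follow.
    """
    return [2.0 * x + k for x in signal]


def split_block(signal, ratio=0.5):
    """Split a signal by a fixed ratio into (first, second) parts."""
    cut = int(len(signal) * ratio)
    return signal[:cut], signal[cut:]


def quantize(signal, step=1.0):
    """Uniform scalar quantizer (assumed; the abstract does not specify one)."""
    return [round(x / step) * step for x in signal]


def encode(signal, n_blocks=3, ratio=0.5):
    """Run n_blocks flow blocks with n_blocks - 1 split blocks in between."""
    if n_blocks < 2:
        raise ValueError("n_blocks must be at least 2")
    held_out = []
    current = list(signal)
    for k in range(1, n_blocks):
        transformed = flow_block(current, k)          # k-th transformation signal
        first, second = split_block(transformed, ratio)
        held_out.append(second)                       # second split signal is set aside
        current = first                               # first split signal feeds block k+1
    final = flow_block(current, n_blocks)             # output of the N-th flow block
    return quantize(final), [quantize(s) for s in held_out]


final, held = encode([0, 1, 2, 3, 4, 5, 6, 7], n_blocks=3)
print(final, held)
```

With a split ratio of 0.5 each stage halves the signal passed onward, so the total number of quantized values equals the input length.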

5.

Invention application

APPARATUS FOR DEEP LEARNING BASED TEXT-TO-SPEECH SYNTHESIZING BY USING MULTI-SPEAKER DATA AND METHOD FOR THE SAME

Status: Under examination (published)

Publication (announcement) No.: US20190019500A1

Publication (announcement) date: 2019-01-17

Application No.: US16035261

Application date: 2018-07-13

Applicant: Electronics and Telecommunications Research Institute, YONSEI UNIVERSITY INDUSTRY FOUNDATION (YONSEI UIF)

Inventor: In Seon JANG, Hong Goo KANG, Hyeon Joo KANG, Young Sun Joo, Chung Hyun AHN, Jeong Il SEO, Seung Jun YANG, Ji Hoon CHOI

IPC: G10L13/04, G10L15/02, G10L15/32, G10L15/06

Abstract: Disclosed is a method and apparatus for training a speech signal. A speech signal training apparatus of the present disclosure may include a target speaker speech database storing a target speaker speech signal; a multi-speaker speech database storing a multi-speaker speech signal; a target speaker acoustic parameter extracting unit extracting an acoustic parameter of a training subject speech signal from the target speaker speech signal; a similar speaker acoustic parameter determining unit extracting at least one similar speaker speech signal from the multi-speaker speech signals, and determining an auxiliary speech feature of the similar speaker speech signal; and an acoustic parameter model training unit determining an acoustic parameter model by performing model training for a relation between the acoustic parameter and text by using the acoustic parameter and the auxiliary speech feature, and setting mapping information of the relation between the acoustic parameter model and the text.

6.

Invention application

METHOD AND APPARATUS FOR DETECTING SPEECH/NON-SPEECH SECTION

Status: In force (granted)

Publication (announcement) No.: US20150149166A1

Publication (announcement) date: 2015-05-28

Application No.: US14172998

Application date: 2014-02-05

Applicant: Electronics and Telecommunications Research Institute

Inventor: In Seon JANG, Woo Taek LIM

IPC: G10L25/78

CPC classification number: G10L25/78

Abstract: Provided is an apparatus for detecting a speech/non-speech section. The apparatus includes an acquisition unit which obtains inter-channel relation information of a stereo audio signal, a classification unit which classifies each element of the stereo audio signal into a center channel element and a surround element on the basis of the inter-channel relation information, a calculation unit which calculates an energy ratio value between a center channel signal composed of center channel elements and a surround channel signal composed of surround elements, for each frame, and an energy ratio value between the stereo audio signal and a mono signal generated on the basis of the stereo audio signal, and a judgment unit which determines a speech section and a non-speech section from the stereo audio signal by comparing the energy ratio values.
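
The per-frame energy-ratio comparison in this abstract can be illustrated with a simplified Python sketch. It does not implement the patent's inter-channel classification: as a stand-in, the mid signal (L+R)/2 is treated as the center estimate and the side signal (L−R)/2 as the surround estimate. The frame length, threshold, and function names are likewise assumptions.

```python
def frame_energy(samples):
    """Sum of squared samples in one frame."""
    return sum(s * s for s in samples)


def classify_frames(left, right, frame_len=4, threshold=4.0):
    """Label each frame by the ratio of center-like to surround-like energy.

    Assumption: mid/side decomposition replaces the patent's classification
    of elements into center channel and surround elements.
    """
    labels = []
    n = min(len(left), len(right))
    for start in range(0, n - frame_len + 1, frame_len):
        l = left[start:start + frame_len]
        r = right[start:start + frame_len]
        mid = [(a + b) / 2 for a, b in zip(l, r)]
        side = [(a - b) / 2 for a, b in zip(l, r)]
        # Small constant avoids division by zero when the channels are identical.
        ratio = frame_energy(mid) / (frame_energy(side) + 1e-12)
        labels.append("speech" if ratio > threshold else "non-speech")
    return labels


# First frame: identical channels (center-dominant); second: opposite polarity.
left = [1, -1, 1, -1, 1, -1, 1, -1]
right = [1, -1, 1, -1, -1, 1, -1, 1]
print(classify_frames(left, right))
```

The intuition behind the stand-in is that dialogue in stereo mixes is commonly panned to the center, so frames dominated by the mid signal are more likely to contain speech.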

