-
1.
Publication No.: US20240153513A1
Publication date: 2024-05-09
Application No.: US18502648
Filing date: 2023-11-06
Inventors: Byeong Ho CHO, Seung Kwon BEACK, Jong Mo SUNG, Tae Jin LEE, Woo Taek LIM, In Seon JANG
IPC classification: G10L19/035
CPC classification: G10L19/035
Abstract: A complex number quantization-based audio signal encoding method may comprise: estimating a scale factor for each subband of an input audio signal; performing complex magnitude scaling for each subband based on the scale factor; and performing polar quantization on a complex frequency coefficient for each subband, wherein the performing the polar quantization for each subband comprises applying two or more different magnitude quantization techniques based on the magnitude of the complex frequency coefficient scaled for each subband.
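Illustrative sketch (not taken from the patent): the three steps in the abstract above can be pictured roughly as follows. The abstract does not say how the scale factor is estimated, which magnitude quantizers are used, or where the switch between them lies, so the peak-magnitude scale factor, the uniform and logarithmic quantizers, and every constant below are assumptions chosen only to make the example concrete.

```python
import cmath
import math

THRESHOLD = 0.25    # hypothetical boundary between the two magnitude quantizers
FINE_STEP = 0.05    # hypothetical step size of the uniform quantizer
LOG_LEVELS = 8      # hypothetical number of logarithmic magnitude levels
PHASE_LEVELS = 16   # hypothetical number of uniform phase levels


def estimate_scale_factor(subband):
    """Assumption: use the peak magnitude of the subband as its scale factor."""
    peak = max(abs(c) for c in subband)
    return peak if peak > 0 else 1.0


def quantize_magnitude(mag):
    """Pick one of two magnitude quantizers based on the scaled magnitude."""
    if mag < THRESHOLD:
        return ("uniform", round(mag / FINE_STEP))
    # Logarithmic spacing between THRESHOLD and 1.0 for larger magnitudes.
    position = math.log(mag / THRESHOLD) / math.log(1.0 / THRESHOLD)
    return ("log", round(position * (LOG_LEVELS - 1)))


def dequantize_magnitude(kind, index):
    if kind == "uniform":
        return index * FINE_STEP
    return THRESHOLD * (1.0 / THRESHOLD) ** (index / (LOG_LEVELS - 1))


def quantize_phase(phase):
    step = 2.0 * math.pi / PHASE_LEVELS
    return round(phase / step) % PHASE_LEVELS


def encode_subband(subband):
    """Scale the subband, then quantize each coefficient in polar form."""
    scale = estimate_scale_factor(subband)
    symbols = []
    for c in subband:
        mag, phase = cmath.polar(c / scale)
        symbols.append((quantize_magnitude(mag), quantize_phase(phase)))
    return scale, symbols


def decode_subband(scale, symbols):
    step = 2.0 * math.pi / PHASE_LEVELS
    decoded = []
    for (kind, mag_index), phase_index in symbols:
        mag = dequantize_magnitude(kind, mag_index)
        decoded.append(scale * cmath.rect(mag, phase_index * step))
    return decoded


subband = [3 + 4j, 0.5 - 0.5j, -1 + 0j, 0j]
scale, symbols = encode_subband(subband)
restored = decode_subband(scale, symbols)
```

In this sketch small scaled magnitudes get a fine uniform grid while larger ones get coarser logarithmic levels, which is one plausible way to read "two or more different magnitude quantization techniques".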
-
2.
Publication No.: US20230298603A1
Publication date: 2023-09-21
Application No.: US18150126
Filing date: 2023-01-04
Inventors: In Seon JANG, Seung Kwon BEACK, Jong Mo SUNG, Tae Jin LEE, Woo Taek LIM, Byeong Ho CHO
IPC classification: G10L19/032, G10L25/30, G10L19/04, G06N7/01
CPC classification: G10L19/032, G10L25/30, G10L19/04, G06N7/01
Abstract: A method for encoding an input signal using N flow blocks (N is a natural number greater than or equal to 2) and (N−1) split block(s), which is performed by a processor, may comprise: transmitting, by a k-th flow block (k is a natural number greater than or equal to 1 and less than or equal to N−1) among the N flow blocks, a k-th transformation signal obtained by transforming a received signal into a latent representation to a k-th split block among the (N−1) split block(s); splitting, by the k-th split block, the k-th transformation signal by a predetermined ratio, into a first split signal and a second split signal; transmitting, by the k-th split block, the first split signal to a (k+1)-th flow block; and quantizing a signal transformed by an N-th flow block and the second split signals using a quantization block.
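Illustrative sketch (not taken from the patent): the data flow described in the abstract above, with N flow blocks alternating with N−1 split blocks, can be traced with a toy example. The real flow blocks would be learned invertible transforms; here `flow_block` is a hypothetical stand-in that simply doubles each sample, and the split ratio of 0.5 and the rounding quantizer are likewise assumptions.

```python
def flow_block(signal):
    """Hypothetical stand-in for a learned invertible transform."""
    return [2.0 * x for x in signal]


def split_block(signal, ratio=0.5):
    """Split by a fixed ratio into (first split signal, second split signal)."""
    cut = int(len(signal) * ratio)
    return signal[:cut], signal[cut:]


def quantize(signal, step=1.0):
    """Assumed quantization block: uniform rounding to integer indices."""
    return [round(x / step) for x in signal]


def encode(signal, n_blocks=3, ratio=0.5):
    if n_blocks < 2:
        raise ValueError("n_blocks must be at least 2")
    second_splits = []
    current = signal
    # Flow blocks 1..N-1 are each followed by a split block.
    for _ in range(n_blocks - 1):
        transformed = flow_block(current)
        first, second = split_block(transformed, ratio)
        second_splits.append(second)  # set aside for the quantization block
        current = first               # passed on to the next flow block
    final = flow_block(current)       # N-th flow block, no split afterwards
    return quantize(final), [quantize(s) for s in second_splits]


final_symbols, split_symbols = encode([1.0] * 8, n_blocks=3)
```

With eight input samples and three flow blocks, the split blocks set aside four and then two samples, and the last flow block emits the remaining two, so the quantizer still sees eight values in total.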
-
3.
Publication No.: US20150149166A1
Publication date: 2015-05-28
Application No.: US14172998
Filing date: 2014-02-05
Inventors: In Seon JANG, Woo Taek LIM
IPC classification: G10L25/78
CPC classification: G10L25/78
Abstract: Provided is an apparatus for detecting a speech/non-speech section. The apparatus includes an acquisition unit which obtains inter-channel relation information of a stereo audio signal, a classification unit which classifies each element of the stereo audio signal into a center channel element and a surround element on the basis of the inter-channel relation information, a calculation unit which calculates an energy ratio value between a center channel signal composed of center channel elements and a surround channel signal composed of surround elements, for each frame, and an energy ratio value between the stereo audio signal and a mono signal generated on the basis of the stereo audio signal, and a judgment unit which determines a speech section and a non-speech section from the stereo audio signal by comparing the energy ratio values.
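Illustrative sketch (not taken from the patent): the abstract of US20150149166A1 names the processing units but not the measures they use. The example below assumes, purely for illustration, that the "elements" are time-domain samples, that the inter-channel relation is a per-element similarity 2LR/(L²+R²), and that a frame counts as speech when both energy ratios exceed hypothetical thresholds.

```python
EPS = 1e-12  # guards against division by zero


def element_similarity(l, r):
    """Assumed inter-channel relation: 1 for identical, -1 for anti-phase."""
    denom = l * l + r * r
    return 0.0 if denom == 0 else 2.0 * l * r / denom


def frame_energy_ratios(left, right, sim_threshold=0.8):
    """Return (center/surround ratio, mono/stereo ratio) for one frame."""
    center = surround = stereo = mono = 0.0
    for l, r in zip(left, right):
        energy = l * l + r * r
        # Classify the element as center or surround by channel similarity.
        if element_similarity(l, r) >= sim_threshold:
            center += energy
        else:
            surround += energy
        stereo += energy
        m = 0.5 * (l + r)       # mono downmix of the element
        mono += 2.0 * m * m     # scaled so identical channels give mono == stereo
    return center / (surround + EPS), mono / (stereo + EPS)


def is_speech_frame(left, right, cs_threshold=2.0, ms_threshold=0.8):
    """Hypothetical decision rule comparing both ratios with fixed thresholds."""
    center_surround, mono_stereo = frame_energy_ratios(left, right)
    return center_surround > cs_threshold and mono_stereo > ms_threshold


frame = [0.5, -0.3, 0.2, 0.1]
centered = is_speech_frame(frame, frame)                  # identical channels
anti_phase = is_speech_frame(frame, [-x for x in frame])  # inverted right channel
```

The intuition behind the sketch is that dialogue is usually panned to the center, so a frame dominated by center energy that also survives the mono downmix is a speech candidate.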
-
4.
Publication No.: US20230267950A1
Publication date: 2023-08-24
Application No.: US18097062
Filing date: 2023-01-13
Applicants: Electronics and Telecommunications Research Institute, Industry-Academic Cooperation Foundation, Yonsei University
Inventors: In Seon JANG, Seung Kwon BEACK, Jong Mo SUNG, Tae Jin LEE, Woo Taek LIM, Byeong Ho CHO, Hong Goo KANG, Ji Hyun LEE, Chan Woo LEE, Hyung Seob LIM
Abstract: A generative adversarial network-based audio signal generation model for generating a high quality audio signal may comprise: a generator generating an audio signal with an external input; a harmonic-percussive separation model separating the generated audio signal into a harmonic component signal and a percussive component signal; and at least one discriminator evaluating whether each of the harmonic component signal and the percussive component signal is real or fake.
-
5.
Publication No.: US20230267940A1
Publication date: 2023-08-24
Application No.: US18097054
Filing date: 2023-01-13
Applicants: Electronics and Telecommunications Research Institute, Industry-Academic Cooperation Foundation, Yonsei University
Inventors: In Seon JANG, Seung Kwon BEACK, Jong Mo SUNG, Tae Jin LEE, Woo Taek LIM, Byeong Ho CHO, Hong Goo KANG, Ji Hyun LEE, Chan Woo LEE, Hyung Seob LIM
IPC classification: G10L19/038, G10L19/002
CPC classification: G10L19/038, G10L19/002
Abstract: A method, executed by a processor for compressing an audio signal in multiple layers, may comprise: (a) restoring, in a highest layer, an input audio signal as a first signal; (b) restoring, in at least one intermediate layer, a signal obtained by subtracting an upsampled signal, which is obtained by upsampling the audio signal restored in the highest layer or an immediately previous intermediate layer, from the input audio signal as a second signal; and (c) restoring, in a lowest layer, a signal obtained by subtracting an upsampled signal, which is obtained by upsampling the audio signal restored in an intermediate layer immediately before the lowest layer, from the input audio signal as a third signal, wherein the first signal, the second signal, and the third signal are combined to output a final restoration audio signal.
-