-
31.
Publication No.: US20250104724A1
Publication date: 2025-03-27
Application No.: US18886296
Filing date: 2024-09-16
Applicant: Electronics and Telecommunications Research Institute, The Trustees of Indiana University
Inventor: Inseon JANG, Soo Young PARK, Seung Kwon BEACK, Jongmo SUNG, Woo-taek LIM, Byeongho CHO, Jung Won KANG, Tae Jin LEE, Minje KIM, Haici YANG
IPC: G10L19/16
Abstract: A method and apparatus for encoding/decoding a neural network-based personalized speech are provided. The method includes outputting a first bit stream in which an input speech signal is encrypted, based on the input speech signal, and outputting a second bit stream in which speaker information of the input speech signal is encrypted, based on the input speech signal.
-
32.
Publication No.: US20250104722A1
Publication date: 2025-03-27
Application No.: US18886765
Filing date: 2024-09-16
Applicant: Electronics and Telecommunications Research Institute, The Trustees of Indiana University
Inventor: Inseon JANG, Woo-taek LIM, Soo Young PARK, Seung Kwon BEACK, Jongmo SUNG, Byeongho CHO, Jung Won KANG, Tae Jin LEE, Minje KIM, Haici YANG
IPC: G10L19/038, G10L21/0208, G10L25/30
Abstract: A method and device for encoding/decoding an audio signal based on dequantization through potential diffusion are provided. The method of decoding an audio signal includes obtaining a discrete latent vector in which a speech signal is quantized and, based on the discrete latent vector, outputting a continuous latent vector in which the discrete latent vector is dequantized.
-
33.
Publication No.: US20240371383A1
Publication date: 2024-11-07
Application No.: US18653233
Filing date: 2024-05-02
Applicant: Electronics and Telecommunications Research Institute, UIF (University Industry Foundation), Yonsei University
Inventor: Inseon JANG, Seung Kwon BEACK, Jongmo SUNG, Tae Jin LEE, Woo-taek LIM, Byeongho CHO, Hong-Goo KANG, Byeong Hyeon KIM, Jihyun LEE, Hyungseob LIM
IPC: G10L19/038, G10L19/02
Abstract: A method and apparatus for encoding/decoding an audio signal are provided. The encoding method includes transforming an input audio signal in a time domain into an audio signal in a frequency domain, quantizing energy of a frequency band of the audio signal in the frequency domain, generating a normal signal by normalizing the audio signal in the frequency domain according to quantized energy, obtaining a feature vector including information on the energy of the frequency band based on the normal signal and the input audio signal, quantizing the feature vector, obtaining a scale factor used to scale the normal signal based on the quantized feature vector, quantizing an adjustment signal into which the normal signal has been scaled based on the scale factor, and outputting bitstreams based on the quantized energy, the quantized feature vector, and the quantized adjustment signal.
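Illustrative note: the first two steps of this abstract (quantizing band energy, then normalizing by the quantized energy) might look roughly like the sketch below. It is not the patented method. It assumes the input is already a list of frequency-domain coefficients, and the band edges, the dB-domain uniform quantizer, and the 1.5 dB step are all hypothetical choices. The feature-vector and scale-factor stages are not shown.

```python
import math

def quantize_energy_db(energy, step_db=1.5):
    """Uniformly quantize an energy value in dB (step size is an assumption)."""
    level_db = 10.0 * math.log10(max(energy, 1e-12))
    index = round(level_db / step_db)
    return index, 10.0 ** (index * step_db / 10.0)

def normalize_by_band_energy(spectrum, band_edges, step_db=1.5):
    """Return (energy indices, normal signal) for hypothetical band edges."""
    indices, normal = [], []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        band = spectrum[lo:hi]
        energy = sum(x * x for x in band) / len(band)   # mean band energy
        index, energy_hat = quantize_energy_db(energy, step_db)
        indices.append(index)
        # Normalize with the *quantized* energy so a decoder can undo it.
        normal.extend(x / math.sqrt(energy_hat) for x in band)
    return indices, normal
```

Because the division uses the quantized energy rather than the exact one, a decoder holding only the transmitted indices could reverse the normalization.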
-
34.
Publication No.: US20240144943A1
Publication date: 2024-05-02
Application No.: US18473791
Filing date: 2023-09-25
Applicant: Electronics and Telecommunications Research Institute, The Trustees of Indiana University
Inventor: Woo-taek LIM, Seung Kwon BEACK, Inseon JANG, Jongmo SUNG, Tae Jin LEE, Byeongho CHO, Minje KIM, Darius Petermann
IPC: G10L19/038, G10L25/18
CPC classification number: G10L19/038, G10L25/18
Abstract: An audio signal encoding/decoding method and an apparatus for performing the same are disclosed. The audio signal encoding method includes obtaining a full-band input signal, extracting a first feature vector corresponding to a first sub-band signal and a second feature vector corresponding to a second sub-band signal using an encoder neural network including a plurality of encoding layers, generating a first code vector corresponding to the first feature vector and a second code vector corresponding to the second feature vector by compressing the first feature vector and the second feature vector, and generating a bitstream by quantizing the first code vector and the second code vector.
-
35.
Publication No.: US20240087577A1
Publication date: 2024-03-14
Application No.: US18014924
Filing date: 2021-07-02
Inventor: Seung Kwon BEACK, Jongmo SUNG, Mi Suk LEE, Tae Jin LEE, Woo-taek LIM, Inseon JANG
IPC: G10L19/005, G10L19/16
CPC classification number: G10L19/005, G10L19/167
Abstract: Disclosed are an apparatus and a method for audio encoding/decoding that are robust against coding distortion in a transition section. An audio encoding method includes outputting a frequency domain signal by time-to-frequency (T/F) transform of an input signal, outputting a frequency domain residual signal in which a frequency axis envelope is removed from the frequency domain signal by applying frequency domain noise shaping (FDNS) encoding to the frequency domain signal, outputting a time domain residual signal in which a time axis envelope is removed by performing linear prediction coefficient (LPC) analysis based on the frequency domain residual signal, and quantizing and transmitting the time domain residual signal.
-
36.
Publication No.: US20240013796A1
Publication date: 2024-01-11
Application No.: US18474997
Filing date: 2023-09-26
Applicant: Electronics and Telecommunications Research Institute, The Trustees of Indiana University
Inventor: Woo-taek LIM, Seung Kwon BEACK, Inseon JANG, Jongmo SUNG, Tae Jin LEE, Byeongho CHO, Minje KIM, Haici YANG
IPC: G10L19/038
CPC classification number: G10L19/038
Abstract: A method of encoding a speech signal includes predicting a feature vector of each of a plurality of frames included in the speech signal based on a ground-truth feature vector of a previous frame of each of the plurality of frames, calculating a residual signal corresponding to each of the plurality of frames based on a ground-truth feature vector of each of the plurality of frames and a predicted feature vector of each of the plurality of frames, and generating a bitstring corresponding to each of the plurality of frames by quantizing the residual signal.
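Illustrative note: the predict, subtract, quantize loop in this abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented method. The predictor simply repeats the previous frame's ground-truth feature vector, the first frame is predicted as zeros, and the uniform scalar quantizer with `step=0.25` is a hypothetical stand-in; the abstract does not specify any of these.

```python
def encode_frames(features, step=0.25):
    """Quantize the residual between each frame and its prediction.

    features: list of per-frame feature vectors (lists of floats).
    Returns one list of integer codes per frame.
    """
    previous = [0.0] * len(features[0])    # assumed start state
    codes = []
    for frame in features:
        predicted = previous                # stand-in predictor (assumption)
        residual = [x - p for x, p in zip(frame, predicted)]
        codes.append([round(r / step) for r in residual])
        previous = frame                    # ground-truth vector of this frame
    return codes
```

For example, `encode_frames([[0.5, -0.25], [1.0, 0.25]])` yields one short integer vector per frame; slowly varying features give small residuals, which is what makes the residual cheaper to code than the features themselves.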
-
37.
Publication No.: US20230335145A1
Publication date: 2023-10-19
Application No.: US18118604
Filing date: 2023-03-07
Applicant: Electronics and Telecommunications Research Institute, Kyungpook National University Industry-Academic Cooperation Foundation
Inventor: Woo-taek LIM, Seung Kwon BEACK, Jongmo SUNG, Tae Jin LEE, Inseon JANG, Min Han KIM, Seung Hyeon SHIN, Dae Ho LEE, Seok Jin LEE
IPC: G10L19/06, G10L19/032
CPC classification number: G10L19/06, G10L19/032
Abstract: A signal compression method and apparatus and a signal restoration method and apparatus are provided. The signal compression method includes outputting an input signal, obtained by processing an audio signal, which is input, based on a human auditory perception characteristic, using an auditory perception model, extracting a feature vector from the input signal using a feature extraction module, and outputting a code obtained by compressing the feature vector using a trained signal compression model.
-
38.
Publication No.: US20230317089A1
Publication date: 2023-10-05
Application No.: US18103993
Filing date: 2023-01-31
Inventor: Jongmo SUNG, Seung Kwon BEACK, Tae Jin LEE, Woo-taek LIM, Inseon JANG, Byeongho CHO
IPC: G10L19/13, G10L19/038, G10L25/30
CPC classification number: G10L19/13, G10L19/038, G10L25/30
Abstract: An encoding method, a decoding method, an encoder for performing the encoding method, and a decoder for performing the decoding method are provided. The encoding method includes outputting an LP coefficients bitstream and a residual signal by performing an LP analysis on an input signal, outputting a first latent signal obtained by encoding a periodic component of the residual signal, a second latent signal obtained by encoding a non-periodic component of the residual signal, and a weight vector for each of the first latent signal and the second latent signal, using a first neural network module, and outputting a first bitstream obtained by quantizing the first latent signal, a second bitstream obtained by quantizing the second latent signal, and a weight bitstream obtained by quantizing the weight vector, using a quantization module.
-
39.
Publication No.: US20230298603A1
Publication date: 2023-09-21
Application No.: US18150126
Filing date: 2023-01-04
Inventor: In Seon JANG, Seung Kwon BEACK, Jong Mo SUNG, Tae Jin LEE, Woo Taek LIM, Byeong Ho CHO
IPC: G10L19/032, G10L25/30, G10L19/04, G06N7/01
CPC classification number: G10L19/032, G10L25/30, G10L19/04, G06N7/01
Abstract: A method for encoding an input signal using N flow blocks (N is a natural number greater than or equal to 2) and (N−1) split block(s), which is performed by a processor, may comprise: transmitting, by a k-th flow block (k is a natural number greater than or equal to 1 and less than or equal to N−1) among the N flow blocks, a k-th transformation signal obtained by transforming a received signal into a latent representation to a k-th split block among the (N−1) split block(s); splitting, by the k-th split block, the k-th transformation signal by a predetermined ratio, into a first split signal and a second split signal; transmitting, by the k-th split block, the first split signal to a (k+1)-th flow block; and quantizing a signal transformed by an N-th flow block and the second split signals using a quantization block.
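Illustrative note: the data flow through N flow blocks and (N−1) split blocks described above can be traced with the sketch below. It shows only the routing. The elementwise affine `flow_block` is a hypothetical stand-in for whatever invertible transform the patent uses, and the 0.5 split ratio and uniform quantizer step are likewise assumptions.

```python
def flow_block(signal, scale=2.0, shift=1.0):
    """Stand-in invertible transform (elementwise affine); an assumption."""
    return [scale * x + shift for x in signal]

def split_block(signal, ratio=0.5):
    """Split by a predetermined ratio into (first, second) split signals."""
    cut = int(len(signal) * ratio)
    return signal[:cut], signal[cut:]

def quantize(signal, step=0.5):
    """Hypothetical uniform scalar quantizer."""
    return [round(x / step) for x in signal]

def encode(signal, n_blocks=3, ratio=0.5, step=0.5):
    """Route a signal through N flow blocks and (N-1) split blocks."""
    if n_blocks < 2:
        raise ValueError("N must be at least 2")
    second_splits = []
    current = signal
    for _ in range(n_blocks - 1):              # k = 1 .. N-1
        transformed = flow_block(current)      # k-th flow block
        current, second = split_block(transformed, ratio)
        second_splits.append(second)           # held back for quantization
    final = flow_block(current)                # N-th flow block
    return quantize(final, step), [quantize(s, step) for s in second_splits]
```

With an 8-sample input, N = 3 and a 0.5 ratio, the second split signals have lengths 4 and 2, and the N-th flow block sees only 2 samples, so each later block works on a progressively smaller part of the signal.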
-
40.
Publication No.: US20230298599A1
Publication date: 2023-09-21
Application No.: US18108431
Filing date: 2023-06-12
Inventor: Seung Kwon BEACK, Jongmo SUNG, Tae Jin LEE, Woo-taek LIM, Inseon JANG, Byeongho CHO
IPC: G10L19/008
CPC classification number: G10L19/008
Abstract: An encoding method and an encoding device using a complex signal and a decoding method and a decoding device using a complex signal are provided. The encoding method includes converting a first channel signal and a second channel signal constituting an audio signal corresponding to a stereo signal from a real domain to a complex domain, determining one of a sum operation, a difference operation, and a bypass operation to be performed on the second channel signal converted to the complex domain, determining a complex spatial cue according to the determined operation, converting a residual signal for the second channel signal to a real domain using the complex spatial cue, converting the first channel signal to a real domain, encoding the first channel signal converted to the real domain, and encoding the residual signal for the second channel signal converted to the real domain.
-