-
Publication No.: US11562757B2
Publication Date: 2023-01-24
Application No.: US17377157
Filing Date: 2021-07-15
Inventor: Seung Kwon Beack , Jongmo Sung , Mi Suk Lee , Tae Jin Lee , Woo-taek Lim , Inseon Jang , Jin Soo Choi
IPC: G10L19/06 , G10L19/032
Abstract: An audio signal encoding method performed by an encoder includes identifying a time-domain audio signal in units of blocks, quantizing a linear prediction coefficient extracted, using frequency-domain linear predictive coding (LPC), from a combined block in which a current original block of the audio signal and a previous original block chronologically adjacent to the current original block are combined, generating a temporal envelope by dequantizing the quantized linear prediction coefficient, extracting a residual signal from the combined block based on the temporal envelope, quantizing the residual signal by one of time-domain quantization and frequency-domain quantization, and transforming the quantized residual signal and the quantized linear prediction coefficient into a bitstream.
-
Publication No.: US11456001B2
Publication Date: 2022-09-27
Application No.: US16814103
Filing Date: 2020-03-10
Applicant: Electronics and Telecommunications Research Institute , Kwangwoon University Industry-Academic Collaboration Foundation
Inventor: Seung Kwon Beack , Jongmo Sung , Mi Suk Lee , Tae Jin Lee , Hochong Park
IPC: G10L19/02 , G06N3/04 , G10L21/038 , G10L19/032
Abstract: Disclosed are a method of encoding a high band of audio, a method of decoding a high band of audio, and an encoder and a decoder for performing the methods. The method of decoding a high band of audio, performed by a decoder, includes identifying a parameter extracted through a first neural network, identifying side information extracted through a second neural network, and restoring a high band of the audio by applying the parameter and the side information to a third neural network.
-
Publication No.: US11133015B2
Publication Date: 2021-09-28
Application No.: US16180298
Filing Date: 2018-11-05
Inventor: Seung Kwon Beack , Woo-taek Lim , Jongmo Sung , Mi Suk Lee , Tae Jin Lee , Hui Yong Kim
IPC: G10L19/04 , G10L25/30 , G10L19/008
Abstract: A method of predicting a channel parameter of an original signal from a downmix signal is disclosed. The method may include generating an input feature map to be used to predict a channel parameter of the original signal based on a downmix signal of an original signal, determining an output feature map including a predicted parameter to be used to predict the channel parameter by applying the input feature map to a neural network, generating a label map including information associated with the channel parameter of the original signal, and predicting the channel parameter of the original signal by comparing the output feature map and the label map.
-
Publication No.: US12205605B2
Publication Date: 2025-01-21
Application No.: US17670172
Filing Date: 2022-02-11
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE , INDUSTRY-ACADEMIC COOPERATION FOUNDATION, YONSEI UNIVERSITY
Inventor: Inseon Jang , Seung Kwon Beack , Jongmo Sung , Tae Jin Lee , Woo-Taek Lim , Hong-Goo Kang , Jihyun Lee , Chanwoo Lee , Hyungseob Lim
IPC: G10L19/038 , G10L19/00 , G10L25/30
Abstract: An audio signal encoding and decoding method using a neural network model, and an encoder and decoder for performing the same are disclosed. A method of encoding an audio signal using a neural network model may include identifying an input signal, generating a quantized latent vector by inputting the input signal into a neural network model encoding the input signal, and generating a bitstream corresponding to the quantized latent vector, wherein the neural network model may include i) a feature extraction layer generating a latent vector by extracting a feature of the input signal, ii) a plurality of downsampling blocks downsampling the latent vector, and iii) a plurality of quantization blocks performing quantization of a downsampled latent vector.
-
Publication No.: US12159640B2
Publication Date: 2024-12-03
Application No.: US17884364
Filing Date: 2022-08-09
Inventor: Jongmo Sung , Seung Kwon Beack , Tae Jin Lee , Woo-taek Lim , Inseon Jang
Abstract: Provided is an encoding method according to various example embodiments and an encoder performing the method. The encoding method includes outputting a linear prediction (LP) coefficients bitstream and a residual signal by performing a linear prediction analysis on an input signal, outputting a first latent signal obtained by encoding a periodic component of the residual signal, using a first neural network module, outputting a first bitstream obtained by quantizing the first latent signal, using a quantization module, outputting a second latent signal obtained by encoding an aperiodic component of the residual signal, using the first neural network module, and outputting a second bitstream obtained by quantizing the second latent signal, using the quantization module, wherein the aperiodic component of the residual signal is calculated based on a periodic component of the residual signal decoded from the quantized first latent signal output by de-quantizing the first bitstream.
-
Publication No.: US11978465B2
Publication Date: 2024-05-07
Application No.: US17507746
Filing Date: 2021-10-21
Inventor: Seung Kwon Beack , Jongmo Sung , Tae Jin Lee , Woo-taek Lim , Inseon Jang
IPC: G10L19/13 , G10L19/032 , G10L19/06
CPC classification number: G10L19/13 , G10L19/032 , G10L19/06
Abstract: A method of generating a residual signal performed by an encoder includes identifying an input signal including an audio sample, generating a first residual signal from the input signal using linear predictive coding (LPC), generating a second residual signal having a smaller amount of information than the first residual signal by transforming the first residual signal, transforming the second residual signal into a frequency domain, and generating a third residual signal having a smaller amount of information than the second residual signal from the transformed second residual signal using frequency-domain prediction (FDP) coding.
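Note: the first step in the abstract above, deriving a residual from the input with LPC, can be illustrated with a minimal sketch. It assumes the textbook autocorrelation method with the Levinson-Durbin recursion; the patent abstract does not say which LPC analysis is used, and the function names (autocorrelation, levinson_durbin, lpc_residual) are hypothetical. The later transform and FDP stages are not covered.

```python
def autocorrelation(x, max_lag):
    """Return the autocorrelation values r[0..max_lag] of the sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return prediction-error filter coefficients a[0..order] with a[0] = 1."""
    a = [1.0] + [0.0] * order
    err = r[0]  # prediction error energy, updated at every order
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or perfectly predictable input: stop early
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for order i
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def lpc_residual(x, order):
    """Filter x with the prediction-error filter; return (coefficients, residual)."""
    a = levinson_durbin(autocorrelation(x, order), order)
    residual = [sum(a[j] * x[n - j] for j in range(min(order, n) + 1))
                for n in range(len(x))]
    return a, residual


# Assumed toy input: a decaying exponential, which a first-order predictor fits well.
signal = [0.9 ** n for n in range(64)]
coeffs, first_residual = lpc_residual(signal, order=1)
```

For this toy input the residual carries far less energy than the signal, which is the property the later stages of the abstract build on.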
-
Publication No.: US11804230B2
Publication Date: 2023-10-31
Application No.: US17711908
Filing Date: 2022-04-01
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE , Gwangju Institute of Science and Technology
Inventor: Inseon Jang , Seung Kwon Beack , Jongmo Sung , Tae Jin Lee , Woo-taek Lim , Jongwon Shin , Youngju Cheon , Sangwook Han , Soojoong Hwang
IPC: G10L19/02 , G10L19/038 , G06N3/04
CPC classification number: G10L19/038 , G06N3/04 , G10L19/02
Abstract: An audio encoding/decoding apparatus and method using vector quantized residual error features are disclosed. An audio signal encoding method includes outputting a bitstream of a main codec by encoding an original signal, decoding the bitstream of the main codec, determining a residual error feature vector from a feature vector of a decoded signal and a feature vector of the original signal, and outputting a bitstream of additional information by encoding the residual error feature vector.
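Note: the last two steps of the abstract above (forming a residual error feature vector and encoding it) can be sketched as a plain nearest-codeword vector quantizer. This is only an illustration under stated assumptions: the residual is taken as an element-wise difference, the codebook values are invented, and the function names are hypothetical. The abstract does not specify the feature type, the codebook, or the distance measure.

```python
def residual_error_feature(original_feat, decoded_feat):
    """Element-wise difference between original and decoded feature vectors (assumed form)."""
    return [o - d for o, d in zip(original_feat, decoded_feat)]


def nearest_codeword(vector, codebook):
    """Return the index of the codeword with the smallest squared Euclidean distance."""
    def sq_dist(codeword):
        return sum((v - c) ** 2 for v, c in zip(vector, codeword))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


# Invented two-dimensional codebook, for illustration only.
codebook = [[0.0, 0.0], [0.5, -0.5], [-0.5, 0.5]]

original_feat = [1.0, 2.0]   # feature vector of the original signal (toy values)
decoded_feat = [0.6, 2.4]    # feature vector of the main codec's decoded signal (toy values)

residual = residual_error_feature(original_feat, decoded_feat)
index = nearest_codeword(residual, codebook)  # this index is what the additional bitstream would carry
```

Only the codeword index needs to be transmitted as additional information, which is why a small codebook keeps the side bitstream cheap.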
-
Publication No.: US11508385B2
Publication Date: 2022-11-22
Application No.: US16686859
Filing Date: 2019-11-18
Inventor: Seung Kwon Beack , Jongmo Sung , Mi Suk Lee , Tae Jin Lee , Hui Yong Kim
IPC: G06N3/04 , G06N3/08 , G10L19/032 , G10L19/02
Abstract: Disclosed are a method of processing a residual signal for audio coding and an audio coding apparatus. The method learns a feature map of a reference signal through a residual signal learning engine including a convolutional layer and a neural network, and performs learning based on a result obtained by mapping a node of an output layer of the neural network to a quantization level index of the residual signal.
-
Publication No.: US11488613B2
Publication Date: 2022-11-01
Application No.: US17098090
Filing Date: 2020-11-13
Applicant: Electronics and Telecommunications Research Institute , The Trustees of Indiana University
Inventor: Minje Kim , Kai Zhen , Mi Suk Lee , Seung Kwon Beack , Jongmo Sung , Tae Jin Lee , Jin Soo Choi
IPC: G10L19/08 , G10L19/032 , G10L19/26 , G06N3/08 , G10L25/30 , G10L13/02 , G10L21/0208
Abstract: Disclosed are a method for coding a residual signal of LPC coefficients based on collaborative quantization and a computing device for performing the method. The residual signal coding method includes: generating coded LPC coefficients and an LPC residual signal by performing LPC analysis and quantization on an input speech; determining a predicted LPC residual signal by applying the LPC residual signal to cross-module residual learning; performing LPC synthesis using the coded LPC coefficients and the predicted LPC residual signal; and determining an output speech that is a synthesized output according to a result of performing the LPC synthesis.
-
Publication No.: US11416742B2
Publication Date: 2022-08-16
Application No.: US16122708
Filing Date: 2018-09-05
Applicant: Electronics and Telecommunications Research Institute , THE TRUSTEES OF INDIANA UNIVERSITY
Inventor: Jongmo Sung , Minje Kim , Aswin Sivaraman , Kai Zhen
IPC: G06N3/08 , G10L19/008 , G10L19/032 , G10L25/30 , G10L25/69
Abstract: Provided is a training method of a neural network that is applied to an audio signal encoding method using an audio signal encoding apparatus, the training method including generating a masking threshold of a first audio signal before training is performed, calculating a weight matrix to be applied to a frequency component of the first audio signal based on the masking threshold, generating a weighted error function obtained by correcting a preset error function using the weight matrix, and generating a second audio signal by applying a parameter learned using the weighted error function to the first audio signal.
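Note: the weighted error function described above can be sketched as a per-frequency-bin weighted squared error. The abstract does not give the weighting formula, so the rule used here (a log ratio of signal power to masking threshold) is an assumption chosen only to show the idea that bins with audible content get larger weights; the function names and all numbers are hypothetical.

```python
import math


def perceptual_weights(power, mask):
    """Assumed weighting rule: larger weight where the signal exceeds the masking threshold."""
    return [math.log10(p / m + 1.0) for p, m in zip(power, mask)]


def weighted_squared_error(reference, estimate, weights):
    """Squared error per frequency bin, scaled by the weights and summed."""
    return sum(w * (r - e) ** 2 for w, r, e in zip(weights, reference, estimate))


# Toy two-bin spectrum: bin 0 is well above its masking threshold, bin 1 is well below.
power = [100.0, 1.0]
mask = [1.0, 100.0]
weights = perceptual_weights(power, mask)

reference = [1.0, 1.0]
loss_audible_bin = weighted_squared_error(reference, [0.0, 1.0], weights)  # error placed in bin 0
loss_masked_bin = weighted_squared_error(reference, [1.0, 0.0], weights)   # error placed in bin 1
```

With these toy values the same unit error costs far more in the audible bin than in the masked one, so training with such a loss would spend model capacity where errors can be heard.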