-
Publication No.: US20190164052A1
Publication Date: 2019-05-30
Application No.: US16122708
Application Date: 2018-09-05
Applicant: Electronics and Telecommunications Research Institute , THE TRUSTEES OF INDIANA UNIVERSITY
Inventor: Jongmo SUNG , Minje KIM , Aswin Sivaraman , Kai Zhen
IPC: G06N3/08 , G10L19/032 , G10L19/008 , G10L25/30
Abstract: Provided is a training method of a neural network that is applied to an audio signal encoding method using an audio signal encoding apparatus, the training method including generating a masking threshold of a first audio signal before training is performed, calculating a weight matrix to be applied to a frequency component of the first audio signal based on the masking threshold, generating a weighted error function obtained by correcting a preset error function using the weight matrix, and generating a second audio signal by applying a parameter learned using the weighted error function to the first audio signal.
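The weighted error function in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not say how the weight matrix is derived from the masking threshold, so the log-ratio weight below, the function names `perceptual_weights` and `weighted_mse`, and the three-bin values are all assumptions.

```python
import math

def perceptual_weights(power_db, mask_db):
    """Per-bin weights from signal power and masking threshold (both in dB).

    Assumed form: w = log10(10^(0.1*p) / 10^(0.1*m) + 1). Bins whose power
    sits well above the mask get large weights; bins below the mask get
    weights close to zero.
    """
    return [math.log10(10 ** (0.1 * p) / 10 ** (0.1 * m) + 1.0)
            for p, m in zip(power_db, mask_db)]

def weighted_mse(target, estimate, weights):
    """Mean over frequency bins of weight * squared error."""
    total = sum(w * (t - e) ** 2 for t, e, w in zip(target, estimate, weights))
    return total / len(target)

# Hypothetical three-bin frame (values are illustrative only).
power_db = [60.0, 40.0, 20.0]   # signal power per bin
mask_db = [30.0, 40.0, 50.0]    # masking threshold per bin
weights = perceptual_weights(power_db, mask_db)
loss = weighted_mse([1.0, 1.0, 1.0], [0.9, 0.9, 0.9], weights)
```

With these weights, the same reconstruction error costs more in a bin where the signal is audible above the mask than in a bin where it is masked.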
-
Publication No.: US20240420712A1
Publication Date: 2024-12-19
Application No.: US18732758
Application Date: 2024-06-04
Inventor: Byeongho CHO , Seung Kwon BEACK , Jung Won KANG , Soo Young PARK , Jongmo SUNG , Tae Jin LEE , Woo-taek LIM , Inseon JANG
IPC: G10L19/028 , G10L19/02 , G10L19/035 , G10L19/06
Abstract: A method of encoding/decoding an audio signal and a device for performing the same are provided. The method of encoding an audio signal includes generating, based on the audio signal, a linear prediction coding (LPC) bitstream and a frequency-domain signal of the audio signal, generating, based on the LPC bitstream and the frequency-domain signal, a first residual signal including information on a frequency envelope of the frequency-domain signal, and outputting a second residual signal by processing the first residual signal through one of a plurality of signal processing paths.
-
Publication No.: US20230274141A1
Publication Date: 2023-08-31
Application No.: US18166407
Application Date: 2023-02-08
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE , YONSEI UNIVERSITY WONJU INDUSTRY-ACADEMIC COOPERATION FOUNDATION
Inventor: Jongmo SUNG , Seung Kwon BEACK , Tae Jin LEE , Woo-taek LIM , Inseon JANG , Byeongho CHO , Young Cheol PARK , Joon BYUN , Seungmin SHIN
IPC: G06N3/08 , G10L19/038 , G10L25/30 , G10L19/028 , G10L25/69 , G10L25/60
CPC classification number: G06N3/08 , G10L19/038 , G10L25/30 , G10L19/028 , G10L25/69 , G10L25/60
Abstract: Provided are a method and apparatus for designing and testing an audio codec using quantization based on white noise modeling. A neural network-based audio encoder design method includes generating a quantized latent vector and a reconstructed signal corresponding to an input signal by using a white noise modeling-based quantization process, computing a total loss for training a neural network-based audio codec, based on the input signal, the reconstructed signal, and the quantized latent vector, training the neural network-based audio codec by using the total loss, and validating the trained neural network-based audio codec to select the best neural network-based audio codec.
-
Publication No.: US20230048402A1
Publication Date: 2023-02-16
Application No.: US17884364
Application Date: 2022-08-09
Inventor: Jongmo SUNG , Seung Kwon BEACK , Tae Jin LEE , Woo-taek KIM , Inseon JANG
Abstract: Provided is an encoding method according to various example embodiments and an encoder performing the method. The encoding method includes outputting a linear prediction (LP) coefficients bitstream and a residual signal by performing a linear prediction analysis on an input signal, outputting a first latent signal obtained by encoding a periodic component of the residual signal, using a first neural network module, outputting a first bitstream obtained by quantizing the first latent signal, using a quantization module, outputting a second latent signal obtained by encoding an aperiodic component of the residual signal, using the first neural network module, and outputting a second bitstream obtained by quantizing the second latent signal, using the quantization module, wherein the aperiodic component of the residual signal is calculated based on a periodic component of the residual signal decoded from the quantized first latent signal output by de-quantizing the first bitstream.
-
Publication No.: US20220262378A1
Publication Date: 2022-08-18
Application No.: US17672041
Application Date: 2022-02-15
Inventor: Woo-taek LIM , Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Inseon JANG
Abstract: An audio signal encoding and decoding method using a learning model, a training method of the learning model, and an encoder and decoder that perform the method, are disclosed. The audio signal decoding method may include extracting a first residual signal and a first linear prediction coefficient by decoding a bitstream received from an encoder, generating a first audio signal from the first residual signal using the first linear prediction coefficient, generating a second linear prediction coefficient and a second residual signal from the first audio signal, obtaining a third linear prediction coefficient by inputting the second linear prediction coefficient into a trained learning model, and generating a second audio signal from the second residual signal using the third linear prediction coefficient.
-
Publication No.: US20220238126A1
Publication Date: 2022-07-28
Application No.: US17570489
Application Date: 2022-01-07
Inventor: Jongmo SUNG , Seung Kwon BEACK , Tae Jin LEE , Woo-taek LIM , Inseon JANG
IPC: G10L19/032 , G10L19/008 , G10L25/90 , G10L25/30
Abstract: Methods of encoding and decoding an audio signal using a learning model and an encoder and a decoder for performing the methods are disclosed. A method of encoding an audio signal using a learning model may include extracting pitch information of the audio signal, determining a dilation factor of a receptive field of a first expandable neural network block to extract a feature map from the audio signal based on the pitch information, generating a first feature map of the audio signal using the first expandable neural network block in which the dilation factor is determined, determining a second feature map by inputting the first feature map into a second expandable neural network block to process the first feature map, and converting the second feature map and the pitch information into a bitstream.
-
Publication No.: US20220157326A1
Publication Date: 2022-05-19
Application No.: US17507746
Application Date: 2021-10-21
Inventor: Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Woo-taek LIM , Inseon JANG
IPC: G10L19/13 , G10L19/032 , G10L19/06
Abstract: A method of generating a residual signal performed by an encoder includes identifying an input signal including an audio sample, generating a first residual signal from the input signal using linear predictive coding (LPC), generating a second residual signal having a smaller amount of information than the first residual signal by transforming the first residual signal, transforming the second residual signal into a frequency domain, and generating a third residual signal having a smaller amount of information than the second residual signal from the transformed second residual signal using frequency-domain prediction (FDP) coding.
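The first step of this abstract, generating a residual signal with LPC, can be sketched with the textbook autocorrelation method and the Levinson-Durbin recursion. This is an illustration only: the abstract does not state which LPC analysis the encoder uses, the function names and the decaying test signal are hypothetical, and the later transform and FDP stages are not covered.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of the sequence x."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] with x[n] ~ sum_k a[k] * x[n-k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)               # remaining prediction error
    return a[1:]

def lpc_residual(x, coeffs):
    """Residual e[n] = x[n] - prediction from past samples (zeros before n=0)."""
    residual = []
    for n in range(len(x)):
        pred = sum(c * x[n - k - 1]
                   for k, c in enumerate(coeffs) if n - k - 1 >= 0)
        residual.append(x[n] - pred)
    return residual

# Hypothetical input: a decaying sequence that a first-order predictor fits well.
x = [0.9 ** n for n in range(50)]
coeffs = levinson_durbin(autocorrelation(x, 1), 1)
residual = lpc_residual(x, coeffs)
```

Because the predictor removes most of the sample-to-sample correlation, the residual carries far less energy than the input, which is what makes it cheaper to code.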
-
Publication No.: US20210166701A1
Publication Date: 2021-06-03
Application No.: US17104400
Application Date: 2020-11-25
Inventor: Woo-taek LIM , Seung Kwon BEACK , Jongmo SUNG , Mi Suk LEE , Tae Jin LEE
IPC: G10L19/002
Abstract: An audio signal encoding/decoding device and method using a filter bank are disclosed. The audio signal encoding method includes generating a plurality of first audio signals by performing filtering on an input audio signal using an analysis filter bank, generating a plurality of second audio signals by performing downsampling on the first audio signals, and outputting a bitstream by encoding and quantizing the second audio signals.
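The filter, downsample, and quantize chain in this abstract can be sketched with a two-band Haar filter bank. The abstract does not specify the filter bank, the downsampling factor, or the quantizer, so the 2-tap filters, the factor of 2, the uniform quantizer, and all names below are assumptions chosen for brevity.

```python
def analysis_filter_bank(x):
    """Split x into a low band and a high band using 2-tap Haar filters."""
    low, high = [], []
    prev = 0.0  # sample before the start of the signal is taken as zero
    for sample in x:
        low.append(0.5 * (sample + prev))    # average: low-pass
        high.append(0.5 * (sample - prev))   # difference: high-pass
        prev = sample
    return [low, high]

def downsample(band, factor=2):
    """Keep every `factor`-th sample, starting at index factor - 1."""
    return band[factor - 1::factor]

def quantize(band, step):
    """Uniform quantizer: map each value to an integer index."""
    return [round(v / step) for v in band]

# Hypothetical four-sample input.
x = [1.0, 3.0, 2.0, 6.0]
bands = [downsample(b) for b in analysis_filter_bank(x)]
indices = [quantize(b, step=0.5) for b in bands]
```

For this input the two downsampled bands together hold as many samples as the input, and each input pair can be recovered as low minus high and low plus high, so the split itself discards nothing before quantization.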
-
Publication No.: US20240233738A9
Publication Date: 2024-07-11
Application No.: US18358646
Application Date: 2023-07-25
Applicant: Electronics and Telecommunications Research Institute , Gwangju Institute of Science and Technology
Inventor: Inseon JANG , Seung Kwon BEACK , Tae Jin LEE , Jongmo SUNG , Woo-taek LIM , Byeongho CHO , Jongwon SHIN
IPC: G10L19/02
CPC classification number: G10L19/02
Abstract: Provided is an encoding apparatus including a memory configured to store instructions and a processor electrically connected to the memory and configured to execute the instructions, wherein the processor may be configured to perform a plurality of operations, when the instructions are executed by the processor, wherein the plurality of operations may include obtaining an input audio signal, generating an embedded audio signal by embedding signal components of a second frequency band of the input audio signal in a first frequency band of the input audio signal, generating additional information associated with the first frequency band and the second frequency band, generating an encoded audio signal by encoding the embedded audio signal, and formatting the encoded audio signal and the additional information into a bitstream.
-
Publication No.: US20240055009A1
Publication Date: 2024-02-15
Application No.: US18349680
Application Date: 2023-07-10
Inventor: Byeongho CHO , Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Woo-taek LIM , Inseon JANG
IPC: G10L19/032
CPC classification number: G10L19/032
Abstract: Provided are an apparatus for encoding an audio signal and an operating method thereof. An audio signal encoding method includes obtaining quantized linear prediction (LP) coefficients by performing a linear predictive coding (LPC) analysis and quantization on an input audio signal, generating a reference signal by applying a discrete Fourier transform (DFT) to the input audio signal, obtaining LP residual coefficients from the reference signal, scaling magnitudes of the LP residual coefficients using the quantized LP coefficients and the reference signal, and quantizing phases of the LP residual coefficients and the scaled magnitudes of the LP residual coefficients.
-