-
Publication No.: US20230335145A1
Publication Date: 2023-10-19
Application No.: US18118604
Filing Date: 2023-03-07
Applicant: Electronics and Telecommunications Research Institute, Kyungpook National University Industry-Academic Cooperation Foundation
Inventors: Woo-taek LIM, Seung Kwon BEACK, Jongmo SUNG, Tae Jin LEE, Inseon JANG, Min Han KIM, Seung Hyeon SHIN, Dae Ho LEE, Seok Jin LEE
IPC Classification: G10L19/06, G10L19/032
CPC Classification: G10L19/06, G10L19/032
Abstract: A signal compression method and apparatus and a signal restoration method and apparatus are provided. The signal compression method includes outputting an input signal obtained by processing an input audio signal based on a human auditory perception characteristic using an auditory perception model, extracting a feature vector from the input signal using a feature extraction module, and outputting a code obtained by compressing the feature vector using a trained signal compression model.
-
Publication No.: US20240135941A1
Publication Date: 2024-04-25
Application No.: US18358646
Filing Date: 2023-07-24
Applicant: Electronics and Telecommunications Research Institute, Gwangju Institute of Science and Technology
Inventors: Inseon JANG, Seung Kwon BEACK, Tae Jin LEE, Jongmo SUNG, Woo-taek LIM, Byeongho CHO, Jongwon SHIN
IPC Classification: G10L19/02
CPC Classification: G10L19/02
Abstract: Provided is an encoding apparatus including a memory configured to store instructions and a processor electrically connected to the memory and configured to execute the instructions. When the instructions are executed, the processor may perform a plurality of operations including obtaining an input audio signal, generating an embedded audio signal by embedding signal components of a second frequency band of the input audio signal in a first frequency band of the input audio signal, generating additional information associated with the first frequency band and the second frequency band, generating an encoded audio signal by encoding the embedded audio signal, and formatting the encoded audio signal and the additional information into a bitstream.
-
Publication No.: US20220005487A1
Publication Date: 2022-01-06
Application No.: US17368390
Filing Date: 2021-07-06
Inventors: Jongmo SUNG, Seung Kwon BEACK, Mi Suk LEE, Tae Jin LEE, Woo-taek LIM, Inseon JANG
IPC Classification: G10L19/032
Abstract: An audio signal encoding and decoding method using a neural network model, a method of training the neural network model, and an encoder and decoder performing the methods are disclosed. The encoding method includes computing first feature information of an input signal using a recurrent encoding model, computing an output signal from the first feature information using a recurrent decoding model, calculating a residual signal by subtracting the output signal from the input signal, computing second feature information of the residual signal using a nonrecurrent encoding model, and converting the first feature information and the second feature information to a bitstream.
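The two-stage structure described in this abstract (encode, decode, subtract, then encode the residual) can be sketched as follows. This is only a minimal illustration under stated assumptions: the uniform quantizers below are hypothetical stand-ins for the recurrent and nonrecurrent neural models, which the abstract does not specify, and the function names and step sizes are invented for the example.

```python
def quantize(values, step):
    """Uniform scalar quantizer: map each value to an integer index."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Inverse of quantize: map integer indices back to values."""
    return [i * step for i in indices]


def encode_two_stage(signal, coarse_step=0.5, fine_step=0.05):
    # Stage 1: "first feature information" (stand-in for the recurrent encoder).
    first = quantize(signal, coarse_step)
    # Decode stage 1 locally (stand-in for the recurrent decoder).
    output = dequantize(first, coarse_step)
    # Residual = input signal minus the stage-1 output signal.
    residual = [s - o for s, o in zip(signal, output)]
    # Stage 2: "second feature information" (stand-in for the nonrecurrent encoder).
    second = quantize(residual, fine_step)
    return first, second


def decode_two_stage(first, second, coarse_step=0.5, fine_step=0.05):
    # Sum of the two decoded stages approximates the original signal.
    coarse = dequantize(first, coarse_step)
    fine = dequantize(second, fine_step)
    return [c + f for c, f in zip(coarse, fine)]
```

With these stand-ins, the reconstruction error is bounded by half the fine step, because the second stage only has to represent what the first stage missed.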
-
Publication No.: US20190180763A1
Publication Date: 2019-06-13
Application No.: US16180298
Filing Date: 2018-11-05
Inventors: Seung Kwon BEACK, Woo-taek LIM, Jongmo SUNG, Mi Suk LEE, Tae Jin LEE, Hui Yong KIM
Abstract: A method of predicting a channel parameter of an original signal from a downmix signal is disclosed. The method may include generating an input feature map to be used to predict a channel parameter of the original signal based on a downmix signal of the original signal, determining an output feature map including a predicted parameter to be used to predict the channel parameter by applying the input feature map to a neural network, generating a label map including information associated with the channel parameter of the original signal, and predicting the channel parameter of the original signal by comparing the output feature map and the label map.
-
Publication No.: US20180144755A1
Publication Date: 2018-05-24
Application No.: US15710353
Filing Date: 2017-09-20
Inventors: Mi Suk LEE, Seung Kwon BEACK, Jongmo SUNG, Tae Jin LEE
IPC Classification: G10L19/018, H04H20/31, H04H60/37, H04H60/58, H04N21/2389, H04N21/8358, H04N5/067
CPC Classification: G10L19/018, G06F16/683, G06F16/955, H04H20/31, H04H60/37, H04H60/58, H04H2201/50, H04N5/0675, H04N21/23892, H04N21/4394, H04N21/8358
Abstract: Disclosed is an audio watermark insertion method. The audio watermark insertion method includes performing a modified discrete cosine transform (MDCT) on a first audio signal, inserting a bit string of a watermark in the first audio signal obtained by performing the MDCT, performing an inverse modified discrete cosine transform (IMDCT) on the first audio signal in which the bit string is inserted, and obtaining a second audio signal, which is the first audio signal in which the watermark is inserted, by performing an overlap-add on a signal obtained by performing the IMDCT and a neighbor frame signal.
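The transform, bit insertion, IMDCT, and overlap-add pipeline in this abstract can be sketched as follows. This is a minimal sketch, not the patented method: it assumes the MDCT as the forward transform paired with the IMDCT, a sine window, and a signal length that is a multiple of the hop size `n_half`. The abstract does not say how bits are written into the coefficients, so `embed_bit` and `read_bit` use a hypothetical parity-of-quantization-index rule chosen only for illustration.

```python
import math


def sine_window(n_half):
    """Sine window of length 2*n_half (satisfies the Princen-Bradley condition)."""
    size = 2 * n_half
    return [math.sin(math.pi * (n + 0.5) / size) for n in range(size)]


def _basis(n, k, n_half):
    return math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))


def mdct(frame, window):
    """Windowed MDCT: 2N samples in, N coefficients out."""
    n_half = len(frame) // 2
    return [sum(window[n] * frame[n] * _basis(n, k, n_half)
                for n in range(2 * n_half))
            for k in range(n_half)]


def imdct(coeffs, window):
    """Windowed IMDCT: N coefficients in, 2N aliased samples out."""
    n_half = len(coeffs)
    return [window[n] * (2.0 / n_half)
            * sum(coeffs[k] * _basis(n, k, n_half) for k in range(n_half))
            for n in range(2 * n_half)]


def embed_bit(coeffs, bit, index=2, step=0.1):
    """Hypothetical rule: force the parity of one quantized coefficient to `bit`."""
    q = math.floor(coeffs[index] / step)
    if q % 2 != bit:
        q += 1
    out = list(coeffs)
    out[index] = q * step
    return out


def read_bit(coeffs, index=2, step=0.1):
    """Read back the parity written by embed_bit."""
    return round(coeffs[index] / step) % 2


def insert_watermark(signal, bits, n_half=8):
    """MDCT each frame, embed one bit per frame, IMDCT, then overlap-add."""
    window = sine_window(n_half)
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    out = [0.0] * len(padded)
    for t in range(len(padded) // n_half - 1):
        start = t * n_half
        coeffs = mdct(padded[start:start + 2 * n_half], window)
        if bits:  # an empty bit list leaves the coefficients untouched
            coeffs = embed_bit(coeffs, bits[t % len(bits)])
        for n, v in enumerate(imdct(coeffs, window)):
            out[start + n] += v  # overlap-add with the neighbouring frames
    return out[n_half:-n_half]
```

Without any embedded bits, the overlap-add of neighbouring frames cancels the time-domain aliasing and returns the input signal. With bits embedded, re-analysing a frame on the same frame grid recovers the modified coefficient, because the windowed MDCT is an orthogonal lapped transform.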
-
Publication No.: US20230274141A1
Publication Date: 2023-08-31
Application No.: US18166407
Filing Date: 2023-02-08
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE, YONSEI UNIVERSITY WONJU INDUSTRY-ACADEMIC COOPERATION FOUNDATION
Inventors: Jongmo SUNG, Seung Kwon BEACK, Tae Jin LEE, Woo-taek LIM, Inseon JANG, Byeongho CHO, Young Cheol PARK, Joon BYUN, Seungmin SHIN
IPC Classification: G06N3/08, G10L19/038, G10L25/30, G10L19/028, G10L25/69, G10L25/60
CPC Classification: G06N3/08, G10L19/038, G10L25/30, G10L19/028, G10L25/69, G10L25/60
Abstract: Provided is a method and apparatus for designing and testing an audio codec using quantization based on white noise modeling. A neural network-based audio codec design method includes generating a quantized latent vector and a reconstructed signal corresponding to an input signal by using a white noise modeling-based quantization process, computing a total loss for training a neural network-based audio codec, based on the input signal, the reconstructed signal, and the quantized latent vector, training the neural network-based audio codec by using the total loss, and validating the trained neural network-based audio codec to select the best neural network-based audio codec.
-
Publication No.: US20230048402A1
Publication Date: 2023-02-16
Application No.: US17884364
Filing Date: 2022-08-09
Inventors: Jongmo SUNG, Seung Kwon BEACK, Tae Jin LEE, Woo-taek LIM, Inseon JANG
Abstract: Provided is an encoding method according to various example embodiments and an encoder performing the method. The encoding method includes outputting a linear prediction (LP) coefficients bitstream and a residual signal by performing a linear prediction analysis on an input signal, outputting a first latent signal obtained by encoding a periodic component of the residual signal, using a first neural network module, outputting a first bitstream obtained by quantizing the first latent signal, using a quantization module, outputting a second latent signal obtained by encoding an aperiodic component of the residual signal, using the first neural network module, and outputting a second bitstream obtained by quantizing the second latent signal, using the quantization module, wherein the aperiodic component of the residual signal is calculated based on a periodic component of the residual signal decoded from the quantized first latent signal output by de-quantizing the first bitstream.
-
Publication No.: US20220262378A1
Publication Date: 2022-08-18
Application No.: US17672041
Filing Date: 2022-02-15
Inventors: Woo-taek LIM, Seung Kwon BEACK, Jongmo SUNG, Tae Jin LEE, Inseon JANG
Abstract: An audio signal encoding and decoding method using a learning model, a training method of the learning model, and an encoder and decoder that perform the method, are disclosed. The audio signal decoding method may include extracting a first residual signal and a first linear prediction coefficient by decoding a bitstream received from an encoder, generating a first audio signal from the first residual signal using the first linear prediction coefficient, generating a second linear prediction coefficient and a second residual signal from the first audio signal, obtaining a third linear prediction coefficient by inputting the second linear prediction coefficient into a trained learning model, and generating a second audio signal from the second residual signal using the third linear prediction coefficient.
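The step of generating an audio signal from a residual signal using linear prediction coefficients corresponds to running an all-pole synthesis filter. The sketch below shows that one step only, under assumptions: the function name `lp_synthesis` and the sign convention (prediction is added to the residual) are chosen for illustration, and the learning model that maps the second coefficients to the third is not represented.

```python
def lp_synthesis(residual, coeffs):
    """Rebuild a signal from its LP residual.

    y[n] = e[n] + sum_k coeffs[k] * y[n - 1 - k], with zero initial state.
    """
    out = []
    for n, e in enumerate(residual):
        # Predict the current sample from previously reconstructed samples.
        prediction = sum(a * out[n - 1 - k]
                         for k, a in enumerate(coeffs)
                         if n - 1 - k >= 0)
        out.append(e + prediction)
    return out
```

For example, a unit impulse residual passed through a first-order filter with coefficient 0.5 produces a geometrically decaying output.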
-
Publication No.: US20220238126A1
Publication Date: 2022-07-28
Application No.: US17570489
Filing Date: 2022-01-07
Inventors: Jongmo SUNG, Seung Kwon BEACK, Tae Jin LEE, Woo-taek LIM, Inseon JANG
IPC Classification: G10L19/032, G10L19/008, G10L25/90, G10L25/30
Abstract: Methods of encoding and decoding an audio signal using a learning model and an encoder and a decoder for performing the methods are disclosed. A method of encoding an audio signal using a learning model may include extracting pitch information of the audio signal, determining a dilation factor of a receptive field of a first expandable neural network block to extract a feature map from the audio signal based on the pitch information, generating a first feature map of the audio signal using the first expandable neural network block in which the dilation factor is determined, determining a second feature map by inputting the first feature map into a second expandable neural network block to process the first feature map, and converting the second feature map and the pitch information into a bitstream.
-
Publication No.: US20220157326A1
Publication Date: 2022-05-19
Application No.: US17507746
Filing Date: 2021-10-21
Inventors: Seung Kwon BEACK, Jongmo SUNG, Tae Jin LEE, Woo-taek LIM, Inseon JANG
IPC Classification: G10L19/13, G10L19/032, G10L19/06
Abstract: A method of generating a residual signal performed by an encoder includes identifying an input signal including an audio sample, generating a first residual signal from the input signal using linear predictive coding (LPC), generating a second residual signal having a smaller amount of information than the first residual signal by transforming the first residual signal, transforming the second residual signal into a frequency domain, and generating a third residual signal having a smaller amount of information than the second residual signal from the transformed second residual signal using frequency-domain prediction (FDP) coding.
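The first step of this method, generating a residual signal with linear predictive coding, can be sketched as follows. This is a minimal sketch assuming the textbook autocorrelation method with the Levinson-Durbin recursion; the abstract does not state how the coefficients are estimated, the function names are invented for the example, and the later transform and FDP stages are not covered.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite signal."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients a[1..order].

    The predictor is x_hat[n] = sum_k a[k] * x[n - k]. Assumes r[0] > 0.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k  # prediction error energy shrinks at each stage
    return a[1:]


def lpc_residual(x, order):
    """Return (coefficients, residual) where residual[n] = x[n] - x_hat[n]."""
    coeffs = levinson_durbin(autocorrelation(x, order), order)
    residual = [
        x[n] - sum(coeffs[k] * x[n - 1 - k]
                   for k in range(order) if n - 1 - k >= 0)
        for n in range(len(x))
    ]
    return coeffs, residual
```

For a strongly correlated input, the residual carries much less energy than the signal itself, which is what makes it cheaper to code in the later stages.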