-
51.
Publication No.: US20230317089A1
Publication date: 2023-10-05
Application No.: US18103993
Application date: 2023-01-31
Inventor: Jongmo SUNG , Seung Kwon BEACK , Tae Jin LEE , Woo-taek LIM , Inseon JANG , Byeongho CHO
IPC: G10L19/13 , G10L19/038 , G10L25/30
CPC classification number: G10L19/13 , G10L19/038 , G10L25/30
Abstract: An encoding method, a decoding method, an encoder for performing the encoding method, and a decoder for performing the decoding method are provided. The encoding method includes outputting LP coefficients bitstream and a residual signal by performing an LP analysis on an input signal, outputting a first latent signal obtained by encoding a periodic component of the residual signal, a second latent signal obtained by encoding a non-periodic component of the residual signal, and a weight vector for each of the first latent signal and the second latent signal, using a first neural network module, and outputting a first bitstream obtained by quantizing the first latent signal, a second bitstream obtained by quantizing the second latent signal, and a weight bitstream obtained by quantizing the weight vector, using a quantization module.
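The first step of this abstract, LP analysis that yields coefficients and a residual signal, can be sketched as follows. This is a minimal illustration that assumes the classical autocorrelation method with the Levinson-Durbin recursion; the abstract does not say which LP method is used, and the neural network and quantization stages are not modeled. All function names and the test frame are hypothetical.

```python
import math


def autocorrelation(x, max_lag):
    """Return the autocorrelation values r[0..max_lag] of the frame x."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Levinson-Durbin recursion (assumed LP method, not from the abstract).

    Returns coefficients c[0..order-1] of the predictor
    x_hat[n] = sum_k c[k] * x[n - 1 - k].
    """
    a = [0.0] * (order + 1)   # a[0] is unused; a[j] multiplies x[n - j]
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:        # silent or perfectly predictable frame
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def lp_analysis(x, coeffs):
    """Residual e[n] = x[n] - prediction, assuming zero signal history."""
    residual = []
    for n in range(len(x)):
        pred = sum(c * x[n - 1 - k]
                   for k, c in enumerate(coeffs) if n - 1 - k >= 0)
        residual.append(x[n] - pred)
    return residual


def lp_synthesis(residual, coeffs):
    """Inverse of lp_analysis: rebuild the signal from the residual."""
    x = []
    for n in range(len(residual)):
        pred = sum(c * x[n - 1 - k]
                   for k, c in enumerate(coeffs) if n - 1 - k >= 0)
        x.append(residual[n] + pred)
    return x


# Hypothetical test frame: two sinusoids, 160 samples, 4th-order predictor.
frame = [math.sin(0.3 * n) + 0.5 * math.sin(0.9 * n) for n in range(160)]
coeffs = levinson_durbin(autocorrelation(frame, 4), 4)
residual = lp_analysis(frame, coeffs)
restored = lp_synthesis(residual, coeffs)
```

In the described method, the residual would then be passed to the first neural network module; here `lp_synthesis` only confirms that the unquantized residual and coefficients together reconstruct the frame.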
-
52.
Publication No.: US20230298603A1
Publication date: 2023-09-21
Application No.: US18150126
Application date: 2023-01-04
Inventor: In Seon JANG , Seung Kwon BEACK , Jong Mo SUNG , Tae Jin LEE , Woo Taek LIM , Byeong Ho CHO
IPC: G10L19/032 , G10L25/30 , G10L19/04 , G06N7/01
CPC classification number: G10L19/032 , G10L25/30 , G10L19/04 , G06N7/01
Abstract: A method for encoding an input signal using N flow blocks (N is a natural number greater than or equal to 2) and (N−1) split block(s), which is performed by a processor, may comprise: transmitting, by a k-th flow block (k is a natural number greater than or equal to 1 and less than or equal to N−1) among the N flow blocks, a k-th transformation signal obtained by transforming a received signal into a latent representation to a k-th split block among the (N−1) split block(s); splitting, by the k-th split block, the k-th transformation signal by a predetermined ratio, into a first split signal and a second split signal; transmitting, by the k-th split block, the first split signal to a (k+1)-th flow block; and quantizing a signal transformed by an N-th flow block and the second split signals using a quantization block.
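The data flow through the N flow blocks and (N−1) split blocks can be sketched as below. This is only a structural illustration: the elementwise affine map stands in for a learned flow transformation, the rounding quantizer stands in for the quantization block, and the split ratio, step size, and parameter values are all assumptions not taken from the abstract.

```python
def flow_block(signal, scale, shift):
    """Hypothetical placeholder for a flow block (elementwise affine map)."""
    return [scale * v + shift for v in signal]


def split_block(signal, ratio):
    """Split a signal into (first, second) parts.

    `ratio` is the fraction passed on to the next flow block (assumed
    meaning of the "predetermined ratio").
    """
    cut = max(1, int(len(signal) * ratio))
    return signal[:cut], signal[cut:]


def quantize(signal, step):
    """Uniform scalar quantizer returning integer indices."""
    return [round(v / step) for v in signal]


def encode(signal, params, ratio=0.5, step=0.1):
    """Run N flow blocks with a split block after each of the first N-1.

    Returns the quantized output of the N-th flow block and the list of
    quantized second split signals, one per split block.
    """
    n_blocks = len(params)
    held = []                 # second split signals, kept for quantization
    current = signal
    for k, (scale, shift) in enumerate(params, start=1):
        current = flow_block(current, scale, shift)
        if k < n_blocks:      # the k-th split block follows the k-th flow block
            first, second = split_block(current, ratio)
            held.append(second)
            current = first   # first split signal goes to flow block k+1
    return quantize(current, step), [quantize(s, step) for s in held]


# Hypothetical example with N = 3 flow blocks and an 8-sample input.
params = [(2.0, 0.1), (0.5, -0.2), (1.5, 0.0)]
signal = [0.1 * i for i in range(8)]
final_code, held_codes = encode(signal, params)
```

With an 8-sample input and a ratio of 0.5, the split blocks hold back 4 and 2 samples and the last flow block processes the remaining 2, so every input sample is quantized exactly once.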
-
53.
Publication No.: US20230298599A1
Publication date: 2023-09-21
Application No.: US18108431
Application date: 2023-06-12
Inventor: Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Woo-taek LIM , Inseon JANG , Byeongho CHO
IPC: G10L19/008
CPC classification number: G10L19/008
Abstract: An encoding method and an encoding device using a complex signal and a decoding method and a decoding device using a complex signal are provided. The encoding method includes converting a first channel signal and a second channel signal constituting an audio signal corresponding to a stereo signal from a real domain to a complex domain, determining one of a sum operation, a difference operation, and a bypass operation to be performed on the second channel signal converted to the complex domain, determining a complex spatial cue according to the determined operation, converting a residual signal for the second channel signal to a real domain using the complex spatial cue, converting the first channel signal to a real domain, encoding the first channel signal converted to the real domain, and encoding the residual signal for the second channel signal converted to the real domain.
-
54.
Publication No.: US20230038394A1
Publication date: 2023-02-09
Application No.: US17390753
Application date: 2021-07-30
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE , The Trustees of Indiana University
Inventor: Woo-taek LIM , Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Inseon JANG , Minje KIM
IPC: G10L19/008 , G10L19/032 , G06N3/04
Abstract: Disclosed are a method of encoding and decoding an audio signal and an encoder and a decoder performing the method. The method of encoding an audio signal includes identifying an input signal, and generating a bitstring of each encoding layer by applying, to the input signal, an encoding model including a plurality of successive encoding layers that encodes the input signal, in which a current encoding layer among the encoding layers is trained to generate a bitstring of the current encoding layer by encoding an encoded signal which is a signal encoded in a previous encoding layer and quantizing an encoded signal which is a signal encoded in the current encoding layer.
-
55.
Publication No.: US20220375483A1
Publication date: 2022-11-24
Application No.: US17520895
Application date: 2021-11-08
Inventor: Woo-taek LIM , Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Inseon JANG , Jong-won SEOK , Yunsu KIM
IPC: G10L19/038 , G10L19/16 , G10L25/30
Abstract: Disclosed are methods of encoding and decoding an audio signal, and an encoder and a decoder for performing the methods. The method of encoding an audio signal includes identifying an input signal corresponding to a low frequency band of the audio signal, windowing the input signal, generating a first latent vector by inputting the windowed input signal to a first encoding model, transforming the windowed input signal into a frequency domain, generating a second latent vector by inputting the transformed input signal to a second encoding model, generating a final latent vector by combining the first latent vector and the second latent vector, and generating a bitstream corresponding to the final latent vector.
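The two-path structure in this abstract (a time-domain path and a frequency-domain path whose latent vectors are combined) can be sketched as follows. The sketch makes several assumptions that the abstract does not state: a Hann window, a DFT magnitude as the frequency-domain transform, average pooling as a stand-in for both trained encoding models, concatenation as the combining step, and rounded integers standing in for the bitstream.

```python
import cmath
import math


def hann_window(n):
    """Periodic Hann window of length n (assumed window type)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def dft_magnitude(frame):
    """Magnitudes of DFT bins 0..n/2, computed with a naive O(n^2) DFT."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def pool_encoder(values, latent_dim):
    """Hypothetical stand-in for a trained encoding model: average pooling."""
    n = len(values)
    latent = []
    for i in range(latent_dim):
        start = i * n // latent_dim
        end = max(start + 1, (i + 1) * n // latent_dim)
        chunk = values[start:end]
        latent.append(sum(chunk) / len(chunk))
    return latent


def encode_frame(frame, latent_dim=4, step=0.05):
    """Window, encode in both domains, combine, and quantize."""
    window = hann_window(len(frame))
    windowed = [w * v for w, v in zip(window, frame)]
    first_latent = pool_encoder(windowed, latent_dim)     # time-domain path
    spectrum = dft_magnitude(windowed)                    # frequency-domain path
    second_latent = pool_encoder(spectrum, latent_dim)
    final_latent = first_latent + second_latent           # concatenation (assumption)
    return [round(v / step) for v in final_latent]        # stand-in for the bitstream


# Hypothetical 16-sample frame containing a tone centred on DFT bin 2.
frame = [math.cos(2.0 * math.pi * 2 * t / 16) for t in range(16)]
codes = encode_frame(frame)
```

Both paths see the same windowed frame, so the final latent vector carries a waveform view and a spectral view of one signal segment.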
-
Publication No.: US20220005486A1
Publication date: 2022-01-06
Application No.: US17373243
Application date: 2021-07-12
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE , KWANGWOON UNIVERSITY INDUSTRY-ACADEMIC COLLABORATION FOUNDATION
Inventor: Seung Kwon BEACK , Tae Jin LEE , Min Je KIM , Dae Young JANG , Kyeongok KANG , Jin Woo HONG , Ho Chong PARK , Young-cheol PARK
IPC: G10L19/02
Abstract: An encoding apparatus and a decoding apparatus in a transform between a Modified Discrete Cosine Transform (MDCT)-based coder and a different coder are provided. The encoding apparatus may encode additional information to restore an input signal encoded according to the MDCT-based coding scheme, when switching occurs between the MDCT-based coder and the different coder. Accordingly, an unnecessary bitstream may be prevented from being generated, and minimum additional information may be encoded.
-
Publication No.: US20210366497A1
Publication date: 2021-11-25
Application No.: US17326035
Application date: 2021-05-20
Applicant: Electronics and Telecommunications Research Institute , The Trustees of Indiana University
Inventor: Woo-taek LIM , Seung Kwon BEACK , Jongmo SUNG , Mi Suk LEE , Tae Jin LEE , Inseon JANG , Minje KIM , Haici YANG
IPC: G10L19/032
Abstract: Methods of encoding and decoding a speech signal using a neural network model that recognizes sound sources, and encoding and decoding apparatuses for performing the methods are provided. A method of encoding a speech signal includes identifying an input signal for a plurality of sound sources; generating a latent signal by encoding the input signal; obtaining a plurality of sound source signals by separating the latent signal for each of the plurality of sound sources; determining a number of bits used for quantization of each of the plurality of sound source signals according to a type of each of the plurality of sound sources; quantizing each of the plurality of sound source signals based on the determined number of bits; and generating a bitstream by combining the plurality of quantized sound source signals.
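The last three steps of this abstract (choosing a bit count per sound source type, quantizing each source signal, and combining the results into one bitstream) can be sketched as follows. The source types, the bit counts in `BITS_BY_TYPE`, the value range, and the uniform quantizer are all assumptions for illustration; the latent encoding and source separation steps are taken as given.

```python
# Hypothetical mapping from sound source type to quantization bits.
BITS_BY_TYPE = {"speech": 6, "noise": 2}


def quantize_uniform(values, bits, lo=-1.0, hi=1.0):
    """Uniform quantizer mapping [lo, hi] to indices 0 .. 2**bits - 1."""
    levels = (1 << bits) - 1
    indices = []
    for v in values:
        clipped = min(max(v, lo), hi)
        indices.append(round((clipped - lo) / (hi - lo) * levels))
    return indices


def dequantize_uniform(indices, bits, lo=-1.0, hi=1.0):
    """Map indices back to reconstruction values in [lo, hi]."""
    levels = (1 << bits) - 1
    return [lo + i / levels * (hi - lo) for i in indices]


def pack_bitstream(sources):
    """Quantize each (type, values) pair and concatenate fixed-width codes.

    Returns a string of '0'/'1' characters standing in for the bitstream.
    """
    parts = []
    for kind, values in sources:
        bits = BITS_BY_TYPE[kind]
        for idx in quantize_uniform(values, bits):
            parts.append(format(idx, f"0{bits}b"))
    return "".join(parts)


# Hypothetical separated source signals with values in [-1, 1].
sources = [("speech", [0.31, -0.72, 0.05]), ("noise", [0.9, -0.4, 0.1])]
bitstream = pack_bitstream(sources)
```

Here the three speech values cost 6 bits each and the three noise values 2 bits each, so the sources judged more important are reconstructed with a finer step.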
-
Publication No.: US20210233547A1
Publication date: 2021-07-29
Application No.: US17156006
Application date: 2021-01-22
Applicant: Electronics and Telecommunications Research Institute , The Trustees of Indiana University
Inventor: Mi Suk LEE , Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Jin Soo CHOI , Minje KIM , Kai ZHEN
IPC: G10L19/038 , G10L25/18 , G10L25/30 , G10L25/21 , G10L19/028 , G06N3/08
Abstract: A method and apparatus for processing an audio signal are disclosed. According to an example embodiment, a method of processing an audio signal may include acquiring a final audio signal for an initial audio signal using a plurality of neural network models generating output audio signals by encoding and decoding input audio signals, calculating a difference between the initial audio signal and the final audio signal in a time domain, converting the initial audio signal and the final audio signal into Mel-spectra, calculating a difference between the Mel-spectra of the initial audio signal and the final audio signal in a frequency domain, training the plurality of neural network models based on results calculated in the time domain and the frequency domain, and generating a new final audio signal distinguished from the final audio signal from the initial audio signal using the trained neural network models.
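The training criterion in this abstract combines a time-domain difference with a difference between Mel-spectra. The sketch below computes such a combined loss for a single frame. It assumes mean squared error for both terms, a triangular Mel filterbank built with the common 2595·log10(1 + f/700) Mel formula, and a hypothetical `mel_weight` parameter; none of these details come from the abstract, and the neural network models and their training are not modeled.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power of DFT bins 0..n/2, computed with a naive O(n^2) DFT."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n // 2 + 1)]


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters with edges equally spaced on the Mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mel_spectrum(frame, bank):
    power = power_spectrum(frame)
    return [sum(w * p for w, p in zip(row, power)) for row in bank]


def combined_loss(original, decoded, bank, mel_weight=1.0):
    """Time-domain MSE plus weighted Mel-spectrum MSE (assumed loss form)."""
    time_loss = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    mel_o = mel_spectrum(original, bank)
    mel_d = mel_spectrum(decoded, bank)
    mel_loss = sum((a - b) ** 2 for a, b in zip(mel_o, mel_d)) / len(mel_o)
    return time_loss + mel_weight * mel_loss


# Hypothetical frame and a slightly attenuated "decoded" version of it.
original = [math.sin(0.4 * n) for n in range(32)]
decoded = [0.9 * v for v in original]
bank = mel_filterbank(n_mels=4, n_fft=32, sample_rate=16000)
loss = combined_loss(original, decoded, bank)
```

In a training loop, a value like `loss` would be minimized over the parameters of the cascaded models; the Mel term penalizes spectral differences that a purely sample-wise error weights differently.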
-
Publication No.: US20210142812A1
Publication date: 2021-05-13
Application No.: US17098090
Application date: 2020-11-13
Applicant: Electronics and Telecommunications Research Institute , The Trustees of Indiana University
Inventor: Minje KIM , Kai ZHEN , Mi Suk LEE , Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Jin Soo CHOI
IPC: G10L19/08 , G10L19/032 , G10L19/26 , G10L21/0208 , G10L25/30 , G10L13/02 , G06N3/08
Abstract: Disclosed are a method for coding a residual signal of LPC coefficients based on collaborative quantization and a computing device for performing the method. The residual signal coding method includes: generating encoded LPC coefficients and an LPC residual signal by performing LPC analysis and quantization on an input speech; determining a predicted LPC residual signal by applying the LPC residual signal to cross-module residual learning; performing LPC synthesis using the encoded LPC coefficients and the predicted LPC residual signal; and determining an output speech that is a synthesized output according to a result of performing the LPC synthesis.
-
Publication No.: US20210005209A1
Publication date: 2021-01-07
Application No.: US16814103
Application date: 2020-03-10
Applicant: Electronics and Telecommunications Research Institute , Kwangwoon University Industry-Academic Collaboration Foundation
Inventor: Seung Kwon BEACK , Jongmo SUNG , Mi Suk LEE , Tae Jin LEE , Hochong PARK
IPC: G10L19/02 , G10L19/032 , G10L21/038 , G06N3/04
Abstract: Disclosed are a method of encoding a high band of an audio, a method of decoding a high band of an audio, and an encoder and a decoder for performing the methods. The method of decoding a high band of an audio, the method performed by a decoder, includes identifying a parameter extracted through a first neural network, identifying side information extracted through a second neural network, and restoring a high band of an audio by applying the parameter and the side information to a third neural network.