-
41.
Publication No.: US20230038394A1
Publication date: 2023-02-09
Application No.: US17390753
Filing date: 2021-07-30
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE , The Trustees of Indiana University
Inventor: Woo-taek LIM , Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Inseon JANG , Minje KIM
IPC: G10L19/008 , G10L19/032 , G06N3/04
Abstract: Disclosed are a method of encoding and decoding an audio signal and an encoder and a decoder performing the method. The method of encoding an audio signal includes identifying an input signal, and generating a bitstring of each encoding layer by applying, to the input signal, an encoding model including a plurality of successive encoding layers that encodes the input signal, in which a current encoding layer among the encoding layers is trained to generate a bitstring of the current encoding layer by encoding an encoded signal which is a signal encoded in a previous encoding layer and quantizing an encoded signal which is a signal encoded in the current encoding layer.
-
42.
Publication No.: US20220375483A1
Publication date: 2022-11-24
Application No.: US17520895
Filing date: 2021-11-08
Inventor: Woo-taek LIM , Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Inseon JANG , Jong-won SEOK , Yunsu KIM
IPC: G10L19/038 , G10L19/16 , G10L25/30
Abstract: Disclosed are methods of encoding and decoding an audio signal, and an encoder and a decoder for performing the methods. The method of encoding an audio signal includes identifying an input signal corresponding to a low frequency band of the audio signal, windowing the input signal, generating a first latent vector by inputting the windowed input signal to a first encoding model, transforming the windowed input signal into a frequency domain, generating a second latent vector by inputting the transformed input signal to a second encoding model, generating a final latent vector by combining the first latent vector and the second latent vector, and generating a bitstream corresponding to the final latent vector.
-
43.
Publication No.: US20220005486A1
Publication date: 2022-01-06
Application No.: US17373243
Filing date: 2021-07-12
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE , KWANGWOON UNIVERSITY INDUSTRY-ACADEMIC COLLABORATION FOUNDATION
Inventor: Seung Kwon BEACK , Tae Jin LEE , Min Je KIM , Dae Young JANG , Kyeongok KANG , Jin Woo HONG , Ho Chong PARK , Young-cheol PARK
IPC: G10L19/02
Abstract: An encoding apparatus and a decoding apparatus in a transform between a Modified Discrete Cosine Transform (MDCT)-based coder and a different coder are provided. The encoding apparatus may encode additional information to restore an input signal encoded according to the MDCT-based coding scheme, when switching occurs between the MDCT-based coder and the different coder. Accordingly, an unnecessary bitstream may be prevented from being generated, and minimum additional information may be encoded.
-
44.
Publication No.: US20210366497A1
Publication date: 2021-11-25
Application No.: US17326035
Filing date: 2021-05-20
Applicant: Electronics and Telecommunications Research Institute , The Trustees of Indiana University
Inventor: Woo-taek LIM , Seung Kwon BEACK , Jongmo SUNG , Mi Suk LEE , Tae Jin LEE , Inseon JANG , Minje KIM , Haici YANG
IPC: G10L19/032
Abstract: Methods of encoding and decoding a speech signal using a neural network model that recognizes sound sources, and encoding and decoding apparatuses for performing the methods are provided. A method of encoding a speech signal includes identifying an input signal for a plurality of sound sources; generating a latent signal by encoding the input signal; obtaining a plurality of sound source signals by separating the latent signal for each of the plurality of sound sources; determining a number of bits used for quantization of each of the plurality of sound source signals according to a type of each of the plurality of sound sources; quantizing each of the plurality of sound source signals based on the determined number of bits; and generating a bitstream by combining the plurality of quantized sound source signals.
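The type-dependent bit allocation described in this abstract can be illustrated with a minimal sketch. It assumes the sound sources have already been separated (the neural encoder and separator are omitted), and the source types, the bit budgets in `BITS_PER_TYPE`, and the uniform quantizer are all hypothetical choices made for illustration, not details taken from the patent.

```python
# Hypothetical bit budget per source type; the values are illustrative only.
BITS_PER_TYPE = {"speech": 6, "noise": 2}


def quantize(values, bits):
    """Uniformly quantize values in [-1, 1] to integer indices (2**bits levels)."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)
    indices = []
    for v in values:
        clipped = max(-1.0, min(1.0, v))
        indices.append(round((clipped + 1.0) / step))
    return indices


def encode_sources(sources):
    """Combine the quantized sources into one bit string.

    sources: list of (source_type, values) pairs, one per separated source.
    """
    bitstream = ""
    for source_type, values in sources:
        bits = BITS_PER_TYPE[source_type]  # bit depth depends on the source type
        for index in quantize(values, bits):
            bitstream += format(index, "0{}b".format(bits))
    return bitstream


sources = [("speech", [0.5, -0.25]), ("noise", [0.9, -1.0])]
bitstream = encode_sources(sources)
```

Here the two speech samples take 6 bits each and the two noise samples 2 bits each, so the source judged more important receives the larger share of the bitstream.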
-
45.
Publication No.: US20210233547A1
Publication date: 2021-07-29
Application No.: US17156006
Filing date: 2021-01-22
Applicant: Electronics and Telecommunications Research Institute , The Trustees of Indiana University
Inventor: Mi Suk LEE , Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Jin Soo CHOI , Minje KIM , Kai ZHEN
IPC: G10L19/038 , G10L25/18 , G10L25/30 , G10L25/21 , G10L19/028 , G06N3/08
Abstract: A method and apparatus for processing an audio signal are disclosed. According to an example embodiment, a method of processing an audio signal may include acquiring a final audio signal for an initial audio signal using a plurality of neural network models generating output audio signals by encoding and decoding input audio signals, calculating a difference between the initial audio signal and the final audio signal in a time domain, converting the initial audio signal and the final audio signal into Mel-spectra, calculating a difference between the Mel-spectra of the initial audio signal and the final audio signal in a frequency domain, training the plurality of neural network models based on results calculated in the time domain and the frequency domain, and generating a new final audio signal distinguished from the final audio signal from the initial audio signal using the trained neural network models.
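The two-domain training criterion in this abstract can be sketched as a loss function that adds a time-domain difference to a Mel-spectral difference. The sketch below is an assumption-laden illustration: the naive DFT, the triangular Mel filters, the squared-error form, and the `mel_weight` parameter are all hypothetical choices, and the neural network models being trained are not shown.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(signal):
    """Power spectrum via a naive DFT (bins 0 .. n/2)."""
    n = len(signal)
    return [abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n // 2 + 1)]


def mel_spectrum(signal, sample_rate, num_bands):
    """Band energies from triangular filters spaced evenly on the Mel scale."""
    spectrum = power_spectrum(signal)
    n = len(signal)
    max_mel = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(max_mel * i / (num_bands + 1)) for i in range(num_bands + 2)]
    bands = []
    for b in range(num_bands):
        low, center, high = edges[b], edges[b + 1], edges[b + 2]
        energy = 0.0
        for k, power in enumerate(spectrum):
            freq = k * sample_rate / n
            if low < freq <= center:
                energy += power * (freq - low) / (center - low)
            elif center < freq < high:
                energy += power * (high - freq) / (high - center)
        bands.append(energy)
    return bands


def combined_loss(reference, decoded, sample_rate=16000, num_bands=8, mel_weight=1.0):
    """Time-domain squared error plus weighted Mel-spectrum squared error."""
    time_loss = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    ref_mel = mel_spectrum(reference, sample_rate, num_bands)
    dec_mel = mel_spectrum(decoded, sample_rate, num_bands)
    mel_loss = sum((a - b) ** 2 for a, b in zip(ref_mel, dec_mel)) / num_bands
    return time_loss + mel_weight * mel_loss


initial = [math.sin(2.0 * math.pi * 1000.0 * t / 16000.0) for t in range(64)]
final = [0.9 * v for v in initial]  # stand-in for a decoded signal
loss = combined_loss(initial, final)
```

In a training loop, `initial` would be the input audio and `final` the output of the cascaded encoder-decoder models; a loss of zero is reached only when the two signals match.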
-
46.
Publication No.: US20210142812A1
Publication date: 2021-05-13
Application No.: US17098090
Filing date: 2020-11-13
Applicant: Electronics and Telecommunications Research Institute , The Trustees of Indiana University
Inventor: Minje KIM , Kai ZHEN , Mi Suk LEE , Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Jin Soo CHOI
IPC: G10L19/08 , G10L19/032 , G10L19/26 , G10L21/0208 , G10L25/30 , G10L13/02 , G06N3/08
Abstract: Disclosed are a method for coding a residual signal of LPC coefficients based on collaborative quantization and a computing device for performing the method. The residual signal coding method includes: generating coded LPC coefficients and an LPC residual signal by performing LPC analysis and quantization on an input speech; determining a predicted LPC residual signal by applying the LPC residual signal to cross-module residual learning; performing LPC synthesis using the coded LPC coefficients and the predicted LPC residual signal; and determining an output speech that is a synthesized output according to a result of performing the LPC synthesis.
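The analysis, residual, and synthesis steps named in this abstract can be sketched with classical LPC. This is a minimal illustration under stated assumptions: the Levinson-Durbin recursion, the prediction order of 4, and the 1/256 coefficient rounding are illustrative choices, and `predict_residual` is a hypothetical pass-through placeholder standing in for the learned residual model, which the abstract does not specify.

```python
import math


def autocorrelation(signal, max_lag):
    """Autocorrelation r[0..max_lag] of a finite signal."""
    return [sum(signal[n] * signal[n - lag] for n in range(lag, len(signal)))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] with x[n] ~ sum(a[k] * x[n - k])."""
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= 1.0 - k * k
    return a[1:]


def lpc_residual(signal, coeffs):
    """Analysis filter: residual = signal minus its linear prediction."""
    residual = []
    for n in range(len(signal)):
        prediction = sum(c * signal[n - k]
                         for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        residual.append(signal[n] - prediction)
    return residual


def predict_residual(residual):
    # Hypothetical placeholder for the learned residual model:
    # it passes the residual through unchanged.
    return list(residual)


def lpc_synthesis(residual, coeffs):
    """Synthesis filter: the inverse of lpc_residual for the same coefficients."""
    output = []
    for n in range(len(residual)):
        prediction = sum(c * output[n - k]
                         for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        output.append(residual[n] + prediction)
    return output


speech = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) for n in range(64)]
coeffs = levinson_durbin(autocorrelation(speech, 4), 4)
coded_coeffs = [round(c * 256) / 256 for c in coeffs]  # coarse coefficient quantization
residual = lpc_residual(speech, coded_coeffs)
output_speech = lpc_synthesis(predict_residual(residual), coded_coeffs)
```

Because the placeholder leaves the residual untouched and the same coded coefficients drive both filters, `output_speech` reproduces `speech` up to rounding error; a real codec would replace the placeholder with a lossy, learned coder.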
-
47.
Publication No.: US20210005209A1
Publication date: 2021-01-07
Application No.: US16814103
Filing date: 2020-03-10
Applicant: Electronics and Telecommunications Research Institute , Kwangwoon University Industry-Academic Collaboration Foundation
Inventor: Seung Kwon BEACK , Jongmo SUNG , Mi Suk LEE , Tae Jin LEE , Hochong PARK
IPC: G10L19/02 , G10L19/032 , G10L21/038 , G06N3/04
Abstract: Disclosed are a method of encoding a high band of audio, a method of decoding a high band of audio, and an encoder and a decoder for performing the methods. The decoding method, performed by a decoder, includes identifying a parameter extracted through a first neural network, identifying side information extracted through a second neural network, and restoring a high band of audio by applying the parameter and the side information to a third neural network.
-
48.
Publication No.: US20190289413A1
Publication date: 2019-09-19
Application No.: US16357180
Filing date: 2019-03-18
Inventor: Seung Kwon BEACK , Jeong Il SEO , Jong Mo SUNG , Tae Jin LEE , Dae Young JANG , Jin Woong KIM
IPC: H04S3/00 , G10L19/008 , G10L19/20
Abstract: Disclosed are a multi-channel audio signal processing method and a multi-channel audio signal processing apparatus. The multi-channel audio signal processing method may generate N channel output signals from N/2 channel downmix signals based on an N−N/2−N structure.
-
49.
Publication No.: US20190035412A1
Publication date: 2019-01-31
Application No.: US16081169
Filing date: 2017-03-21
Inventor: Seung Kwon BEACK , Tae Jin LEE , Jongmo SUNG , Mi Suk LEE , Dae Young JANG , Jin Soo CHOI
IPC: G10L19/022 , G10L19/02 , G10L19/032 , G10L19/04
Abstract: Provided are an apparatus and a method for block-based audio encoding and decoding. A method of encoding an audio signal may include dividing each frame of an input signal constituting the audio signal into a plurality of subframes; transforming the subframes to a frequency domain; determining a two-dimensional (2D) intra block using the subframes transformed to the frequency domain; and encoding the 2D intra block. The 2D intra block may be a block that displays the frequency coefficients of the transformed subframes two-dimensionally, along a time axis and a frequency axis.
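The construction of the 2D intra block can be sketched as follows: a frame is split into subframes, each subframe is transformed, and the coefficient vectors are stacked so that rows index time and columns index frequency. The DCT-II used here is an assumption made for brevity (the abstract does not name the transform), and `build_2d_block` is a hypothetical helper name.

```python
import math


def dct_ii(block):
    """Unnormalized DCT-II of one subframe."""
    n = len(block)
    return [sum(block[t] * math.cos(math.pi / n * (t + 0.5) * k) for t in range(n))
            for k in range(n)]


def build_2d_block(frame, num_subframes):
    """Split a frame into subframes and stack their frequency coefficients.

    Returns a list of rows: row index = subframe (time), column index = frequency.
    """
    size = len(frame) // num_subframes
    subframes = [frame[i * size:(i + 1) * size] for i in range(num_subframes)]
    return [dct_ii(sub) for sub in subframes]


frame = [1.0] * 8
block = build_2d_block(frame, 4)  # 4 time rows x 2 frequency columns
```

For this constant frame every row holds all of its energy in the first (DC) column, which is the kind of regularity across time and frequency that a block-based coder could exploit.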
-
50.
Publication No.: US20180035230A1
Publication date: 2018-02-01
Application No.: US15551734
Filing date: 2016-02-17
Inventor: Seung Kwon BEACK , Jeong Il SEO , Jong Mo SUNG , Tae Jin LEE , Dae Young JANG , Jin Soo CHOI
IPC: H04S1/00 , H04S5/00 , G10L19/008
CPC classification number: H04S1/007 , G10L19/008 , H04S3/008 , H04S3/02 , H04S5/00 , H04S2400/01 , H04S2400/03 , H04S2400/07 , H04S2420/03
Abstract: Provided are an encoding method of a multichannel signal, an encoding apparatus to perform the encoding method, a multichannel signal processing method, and a decoding apparatus to perform the decoding method. The decoding method may include identifying an N/2-channel downmix signal derived from an N-channel input signal; and generating an N-channel output signal from the identified N/2-channel downmix signal using a plurality of one-to-two (OTT) boxes. If a low frequency effect (LFE) channel is absent in the output signal, the number of OTT boxes may be equal to N/2 where N/2 denotes the number of channels of the downmix signal.
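The relationship between the N/2-channel downmix and the N-channel output can be sketched with one OTT box per downmix channel. This is a simplified illustration under stated assumptions: each box here splits a single sample using only a channel level difference (CLD) in dB, the gain formula is a common power-preserving choice rather than one quoted from the patent, and the decorrelation and per-band processing of a real decoder are omitted.

```python
import math


def ott_box(downmix_sample, cld_db):
    """Split one downmix sample into two outputs using a level difference in dB."""
    ratio = 10.0 ** (cld_db / 10.0)  # power ratio between the two outputs
    gain_first = math.sqrt(ratio / (1.0 + ratio))
    gain_second = math.sqrt(1.0 / (1.0 + ratio))
    return downmix_sample * gain_first, downmix_sample * gain_second


def upmix(downmix_channels, clds_db):
    """Apply one OTT box per downmix channel: N/2 inputs give N outputs."""
    outputs = []
    for sample, cld in zip(downmix_channels, clds_db):
        outputs.extend(ott_box(sample, cld))
    return outputs


downmix = [1.0, 0.5, -0.25]  # N/2 = 3 downmix channels
clds = [0.0, 3.0, -3.0]      # hypothetical level differences, one per OTT box
output = upmix(downmix, clds)  # N = 6 output channels
```

Three downmix channels pass through three OTT boxes and yield six output channels, matching the case in the abstract where no LFE channel is present and the number of OTT boxes equals N/2.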
-