-
61.
Publication No.: US20220358940A1
Publication Date: 2022-11-10
Application No.: US17527351
Application Date: 2021-11-16
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE , Gwangju Institute of Science and Technology
Inventor: Woo-taek LIM , Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Inseon JANG , Jong Won SHIN , Soojoong HWANG , Youngju CHEON , Sangwook HAN
Abstract: Disclosed are methods of encoding and decoding an audio signal using side information, and an encoder and a decoder for performing the methods. The method of encoding an audio signal using side information includes identifying an input signal, the input signal being an original audio signal, extracting side information from the input signal using a learning model trained to extract side information from a feature vector of the input signal, encoding the input signal, and generating a bitstream by combining the encoded input signal and the side information.
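The final step of this abstract, combining the encoded signal and the side information into one bitstream, can be illustrated with a minimal sketch. The layout below (two 16-bit big-endian length fields followed by the two payloads) and the function names `pack_bitstream` and `unpack_bitstream` are assumptions made for illustration; the abstract does not specify a bitstream format, and the encoder and the learned side-information extractor are not modeled here.

```python
import struct


def pack_bitstream(encoded: bytes, side_info: bytes) -> bytes:
    """Combine an encoded payload and side information (hypothetical layout)."""
    # Header: two unsigned 16-bit big-endian lengths (an assumed format).
    header = struct.pack(">HH", len(encoded), len(side_info))
    return header + encoded + side_info


def unpack_bitstream(bitstream: bytes):
    """Split a bitstream produced by pack_bitstream back into its two parts."""
    n_enc, n_side = struct.unpack(">HH", bitstream[:4])
    encoded = bitstream[4:4 + n_enc]
    side_info = bitstream[4 + n_enc:4 + n_enc + n_side]
    return encoded, side_info


# Usage: round-trip a toy payload and toy side information.
stream = pack_bitstream(b"\x01\x02\x03", b"\xaa")
parts = unpack_bitstream(stream)
```

A decoder would read the side information from the same stream and use it alongside the decoded signal.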
-
62.
Publication No.: US20210398547A1
Publication Date: 2021-12-23
Application No.: US17331416
Application Date: 2021-05-26
Inventor: Seung Kwon BEACK , Jongmo SUNG , Mi Suk LEE , Tae Jin LEE , Woo-taek LIM , Inseon JANG
IPC: G10L19/035 , G10L19/022 , G10L19/06 , G10L19/16
Abstract: An audio signal encoding method performed by an encoder includes identifying an audio signal of a time domain in units of a block, generating a combined block by combining i) a current original block of the audio signal and ii) a previous original block chronologically adjacent to the current original block, extracting a first residual signal of a frequency domain from the combined block using linear predictive coding of a time domain, overlapping chronologically adjacent first residual signals among first residual signals converted into a time domain, and quantizing a second residual signal of a time domain extracted from the overlapped first residual signal by converting the second residual signal of the time domain into a frequency domain using linear predictive coding of a frequency domain.
-
63.
Publication No.: US20210350796A1
Publication Date: 2021-11-11
Application No.: US17308800
Application Date: 2021-05-05
Applicant: Electronics and Telecommunications Research Institute , The Trustees of Indiana University
Inventor: Minje KIM , Mi Suk LEE , Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Jin Soo CHOI , Kai ZHEN
Abstract: Disclosed is a speech processing apparatus and method using a densely connected hybrid neural network. The speech processing method includes inputting a time domain sample of N*1 dimension for an input speech into a densely connected hybrid network; passing the time domain sample through a plurality of dense blocks in the densely connected hybrid network; reshaping the time domain samples into M subframes by passing the time domain samples through the plurality of dense blocks; inputting the M subframes into gated recurrent unit (GRU) components of N/M-dimension; and outputting clean speech, from which noise in the input speech is removed, by passing the M subframes through the GRU components.
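The reshaping step, in which N time domain samples become M subframes of N/M samples each, can be sketched as below. This covers only the reshape; the dense blocks and GRU components are not modeled, and the function name `to_subframes` is hypothetical.

```python
def to_subframes(samples, m):
    """Split N samples into M consecutive subframes of N/M samples each."""
    n = len(samples)
    if m <= 0 or n % m != 0:
        raise ValueError("N must be a positive multiple of M")
    size = n // m  # dimension of each subframe (N/M)
    return [samples[i * size:(i + 1) * size] for i in range(m)]


# Usage: N = 8 samples reshaped into M = 4 subframes of dimension 2.
subframes = to_subframes(list(range(8)), 4)
```

Each subframe would then be fed, in order, to a recurrent component whose input dimension is N/M.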
-
64.
Publication No.: US20210174815A1
Publication Date: 2021-06-10
Application No.: US17112480
Application Date: 2020-12-04
Inventor: Seung Kwon BEACK , Jooyoung LEE , Jongmo SUNG , Mi Suk LEE , Tae Jin LEE , Woo-taek LIM , Seunghyun CHO , Jin Soo CHOI
IPC: G10L19/038 , G10L25/30 , G10L19/028 , G10L19/24 , G06N3/02
Abstract: Disclosed are a quantizing method for a latent vector and a computing device for performing the quantizing method. A quantizing method of a latent vector includes performing information shaping on the latent vector resulting from reduction in a dimension of an input signal using a target neural network; clamping a residual signal of the latent vector derived based on the information shaping; performing rescaling on the clamped residual signal; and performing quantization on the rescaled residual signal.
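The clamping, scaling, and quantization sequence named in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented method: information shaping is stood in for by subtracting a hypothetical per-dimension `mean` vector, and `clamp_limit`, `num_levels`, and the uniform quantizer are all assumed choices.

```python
def quantize_latent(latent, mean, clamp_limit=3.0, num_levels=16):
    """Hypothetical sketch: shape -> clamp -> scale -> uniform quantize."""
    # Stand-in for information shaping (assumption): residual against a mean.
    residual = [x - m for x, m in zip(latent, mean)]
    # Clamp each residual value to [-clamp_limit, clamp_limit].
    clamped = [max(-clamp_limit, min(clamp_limit, r)) for r in residual]
    # Map the clamped range onto [0, num_levels - 1].
    scale = (num_levels - 1) / (2.0 * clamp_limit)
    scaled = [(c + clamp_limit) * scale for c in clamped]
    # Round to the nearest integer index (values are non-negative here).
    return [int(s + 0.5) for s in scaled]


# Usage: out-of-range values saturate at the first and last index.
indices = quantize_latent([0.0, 10.0, -10.0], [0.0, 0.0, 0.0])
```

Clamping before scaling bounds the range the quantizer has to cover, so a fixed number of levels is spent on the most likely residual values.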
-
65.
Publication No.: US20210166706A1
Publication Date: 2021-06-03
Application No.: US17105835
Application Date: 2020-11-27
Inventor: Woo-taek LIM , Seung Kwon BEACK , Jongmo SUNG , Mi Suk LEE , Tae Jin LEE
IPC: G10L19/16 , G10L19/038 , G10L25/30 , G06N3/08
Abstract: Disclosed is an apparatus and method for encoding/decoding an audio signal using information of a previous frame. An audio signal encoding method includes: generating a current latent vector by reducing dimension of a current frame of an audio signal; generating a concatenation vector by concatenating a previous latent vector generated by reducing dimension of a previous frame of the audio signal with the current latent vector; and encoding and quantizing the concatenation vector.
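The concatenation of the previous frame's latent vector with the current one can be sketched as below. The `encode` argument is a placeholder for the dimension-reducing network, the toy encoder `pair_mean` is purely illustrative, and using a zero vector for the first frame is an assumption; the abstract does not say how the first frame is handled.

```python
def concat_latents(frames, encode):
    """Return one concatenation vector (previous + current latent) per frame."""
    previous = None
    vectors = []
    for frame in frames:
        current = encode(frame)
        if previous is None:
            # Assumption: no history exists for the first frame, so use zeros.
            previous = [0.0] * len(current)
        vectors.append(previous + current)  # list concatenation
        previous = current
    return vectors


def pair_mean(frame):
    """Toy dimension reduction (hypothetical): average adjacent sample pairs."""
    return [(frame[i] + frame[i + 1]) / 2 for i in range(0, len(frame), 2)]


# Usage: two frames of four samples give two concatenation vectors of length 4.
vectors = concat_latents([[1, 3, 5, 7], [2, 2, 4, 4]], pair_mean)
```

The concatenation vector is what would then be encoded and quantized, so each coded frame carries context from its predecessor.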
-
66.
Publication No.: US20210074306A1
Publication Date: 2021-03-11
Application No.: US17017413
Application Date: 2020-09-10
Inventor: Jongmo SUNG , Seung Kwon BEACK , Mi Suk LEE , Tae Jin LEE , Woo-taek LIM , Jin Soo CHOI
IPC: G10L19/032 , H03M7/30 , G06N3/08 , G06N5/04
Abstract: Provided are an audio encoding method, an audio decoding method, an audio encoding apparatus, and an audio decoding apparatus using dynamic model parameters. The audio encoding method using dynamic model parameters may use dynamic model parameters corresponding to each of the levels of the encoding network when reducing the dimension of an audio signal in the encoding network. In addition, the audio decoding method using dynamic model parameters may use dynamic model parameters corresponding to each of the levels of the decoding network when extending the dimension of an audio signal in the decoding network.
-
67.
Publication No.: US20200243099A1
Publication Date: 2020-07-30
Application No.: US16846272
Application Date: 2020-04-10
Inventor: Seung Kwon BEACK , Tae Jin LEE , Min Je KIM , Kyeongok KANG , Dae Young JANG , Jin Woo HONG , Jeongil SEO , Chieteuk AHN , Hochong PARK , Young-Cheol PARK
IPC: G10L19/087 , G10L19/26 , G10L19/125 , G10L19/22
Abstract: Disclosed is an LPC residual signal encoding/decoding apparatus of an MDCT based unified voice and audio encoding device. The LPC residual signal encoding apparatus analyzes a property of an input signal, selects an encoding method of an LPC filtered signal, and encodes the LPC residual signal based on one of a real filterbank, a complex filterbank, and an algebraic code excited linear prediction (ACELP).
-
68.
Publication No.: US20200176002A1
Publication Date: 2020-06-04
Application No.: US16786817
Application Date: 2020-02-10
Inventor: Seung Kwon BEACK , Tae Jin LEE , Jong Mo SUNG , Jeong Il SEO , Kyeong Ok KANG , Dae Young JANG , Jin Woong KIM
IPC: G10L19/008 , H04S3/00
Abstract: An encoder and an encoding method for a multi-channel signal, and a decoder and a decoding method for a multi-channel signal are disclosed. A multi-channel signal may be efficiently processed by consecutive downmixing or upmixing.
-
69.
Publication No.: US20190215631A1
Publication Date: 2019-07-11
Application No.: US16354890
Application Date: 2019-03-15
Inventor: Seung Kwon BEACK , Tae Jin LEE , Jong Mo SUNG , Kyeong Ok KANG , Jeong Il SEO , Dae Young JANG , Yong Ju LEE , Jin Woong KIM
IPC: H04S3/00
CPC classification number: H04S3/008 , G10L19/008 , H04S2420/03
Abstract: An audio encoding apparatus and method that encodes hybrid contents including an object sound, a background sound, and metadata, and an audio decoding apparatus and method that decodes the encoded hybrid contents are provided. The audio encoding apparatus may include a mixing unit to generate an intermediate channel signal by mixing a background sound and an object sound, a matrix information encoding unit to encode matrix information used for the mixing, an audio encoding unit to encode the intermediate channel signal, and a metadata encoding unit to encode metadata including control information of the object sound.
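The mixing unit described here, which forms an intermediate channel signal from a background sound and object sounds using matrix information, can be sketched as below. The additive form and the meaning of `matrix[c][k]` (gain of object k in channel c) are assumptions for illustration; the abstract does not define the mixing rule, and the encoding of the matrix, audio, and metadata is not modeled.

```python
def mix_intermediate(background, objects, matrix):
    """Add gain-weighted object signals onto each background channel.

    background: list of channels, each a list of samples
    objects:    list of object signals, each a list of samples
    matrix:     matrix[c][k] is the assumed gain of object k in channel c
    """
    mixed = []
    for c, channel in enumerate(background):
        out = list(channel)  # copy so the input is not modified
        for k, obj in enumerate(objects):
            gain = matrix[c][k]
            for t in range(len(out)):
                out[t] += gain * obj[t]
        mixed.append(out)
    return mixed


# Usage: one object mixed into a two-channel background.
intermediate = mix_intermediate(
    [[1.0, 1.0], [0.0, 0.0]],  # background channels
    [[2.0, 4.0]],              # one object signal
    [[0.5], [1.0]],            # mixing matrix
)
```

Because the matrix is transmitted alongside the intermediate signal, a decoder has the information needed to account for the object contribution in each channel.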
-
70.
Publication No.: US20190200150A1
Publication Date: 2019-06-27
Application No.: US16290469
Application Date: 2019-03-01
Inventor: Seung Kwon BEACK , Jeong Il SEO , Jong Mo SUNG , Tae Jin LEE , Dae Young JANG , Jin Soo CHOI
IPC: H04S1/00 , H04S3/00 , G10L19/008 , H04S5/00
CPC classification number: H04S1/007 , G10L19/008 , H04S3/008 , H04S3/02 , H04S5/00 , H04S2400/01 , H04S2400/03 , H04S2400/07 , H04S2420/03
Abstract: Provided are an encoding method of a multichannel signal, an encoding apparatus to perform the encoding method, a multichannel signal processing method, and a decoding apparatus to perform the decoding method. The decoding method may include identifying an N/2-channel downmix signal derived from an N-channel input signal; and generating an N-channel output signal from the identified N/2-channel downmix signal using a plurality of one-to-two (OTT) boxes. If a low frequency effect (LFE) channel is absent in the output signal, the number of OTT boxes may be equal to N/2 where N/2 denotes the number of channels of the downmix signal.
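The channel count in this abstract (N/2 downmix channels, N/2 OTT boxes, N output channels) can be illustrated with a simplified sketch. The `ott_box` below splits one channel into two using a single hypothetical gain; real OTT boxes use spatial parameters and decorrelation that are not modeled here, and both function names are assumptions.

```python
def ott_box(channel, gain):
    """Simplified one-to-two box (assumption): split a channel by a gain."""
    first = [s * gain for s in channel]
    second = [s * (1.0 - gain) for s in channel]
    return first, second


def upmix(downmix, gains):
    """Turn N/2 downmix channels into N output channels, one OTT box each."""
    if len(downmix) != len(gains):
        raise ValueError("one gain per downmix channel is required")
    output = []
    for channel, gain in zip(downmix, gains):
        first, second = ott_box(channel, gain)
        output.extend([first, second])
    return output


# Usage: a 2-channel downmix (N/2 = 2) becomes a 4-channel output (N = 4).
channels = upmix([[1.0, 2.0], [4.0, 4.0]], [0.5, 0.25])
```

With no LFE channel, every downmix channel passes through exactly one box, which is why the number of boxes equals N/2.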
-