-
81.
Publication No.: US20210390967A1
Publication Date: 2021-12-16
Application No.: US17242828
Application Date: 2021-04-28
Inventor: Seung Kwon Beack , Jongmo SUNG , Mi Suk LEE , Tae Jin LEE , Woo-taek LIM , Inseon JANG , Jin Soo CHOI
IPC: G10L19/032 , G10L19/08
Abstract: Disclosed is a method of encoding and decoding an audio signal using linear predictive coding (LPC) and an encoder and a decoder that perform the method. The method of encoding an audio signal to be performed by the encoder includes identifying a time-domain audio signal block-wise, quantizing a linear prediction coefficient obtained from a block of the audio signal through the LPC, generating an envelope based on the quantized linear prediction coefficient, extracting a residual signal based on the envelope and a result of converting the block into a frequency domain, grouping the residual signal by each sub-band and determining a scale factor for quantizing the grouped residual signal, quantizing the residual signal using the scale factor, and converting the quantized residual signal and the quantized linear prediction coefficient into a bitstream and transmitting the bitstream to a decoder.
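The sub-band grouping and scale-factor quantization steps named in this abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how the scale factor is determined, so the peak-based rule, the function name `quantize_by_subband`, and the parameters `band_size` and `max_level` are all assumptions made for illustration. The residual is taken as given; the LPC analysis and envelope steps are omitted.

```python
def quantize_by_subband(residual, band_size=4, max_level=7):
    """Group a residual into sub-bands and quantize each with its own scale factor.

    Assumption: the scale factor maps the band's peak magnitude onto max_level.
    The abstract does not specify the actual rule.
    """
    # Split the residual into consecutive sub-bands of band_size coefficients.
    bands = [residual[i:i + band_size] for i in range(0, len(residual), band_size)]
    scale_factors, quantized = [], []
    for band in bands:
        peak = max(abs(x) for x in band)
        # Avoid division by zero for an all-zero band.
        scale = peak / max_level if peak > 0 else 1.0
        scale_factors.append(scale)
        # Quantize: divide by the scale factor and round to the nearest integer.
        quantized.append([int(round(x / scale)) for x in band])
    return scale_factors, quantized


def dequantize_by_subband(scale_factors, quantized):
    """Decoder-side counterpart: multiply each index by its band's scale factor."""
    return [q * s for s, band in zip(scale_factors, quantized) for q in band]
```

In this sketch a decoder would need both the integer indices and the per-band scale factors to reconstruct the residual, which is why both are returned.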
-
82.
Publication No.: US20210350796A1
Publication Date: 2021-11-11
Application No.: US17308800
Application Date: 2021-05-05
Applicant: Electronics and Telecommunications Research Institute , The Trustees of Indiana University
Inventor: Minje KIM , Mi Suk LEE , Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Jin Soo CHOI , Kai ZHEN
Abstract: Disclosed is a speech processing apparatus and method using a densely connected hybrid neural network. The speech processing method includes inputting a time domain sample of N*1 dimension for an input speech into a densely connected hybrid network; passing the time domain sample through a plurality of dense blocks in the densely connected hybrid network; reshaping the time domain samples into M subframes by passing the time domain samples through the plurality of dense blocks; inputting the M subframes into gated recurrent unit (GRU) components of N/M-dimension; and outputting clean speech from which noise is removed from the input speech by passing the M subframes through the GRU components.
-
83.
Publication No.: US20210174815A1
Publication Date: 2021-06-10
Application No.: US17112480
Application Date: 2020-12-04
Inventor: Seung Kwon BEACK , Jooyoung LEE , Jongmo SUNG , Mi Suk LEE , Tae Jin LEE , Woo-taek LIM , Seunghyun CHO , Jin Soo CHOI
IPC: G10L19/038 , G10L25/30 , G10L19/028 , G10L19/24 , G06N3/02
Abstract: Disclosed are a quantizing method for a latent vector and a computing device for performing the quantization method. A quantizing method of a latent vector includes performing information shaping on the latent vector resulting from reduction in a dimension of an input signal using a target neural network; clamping a residual signal of the latent vector derived based on the information shaping; performing rescaling on the clamped residual signal; and performing quantization on the rescaled residual signal.
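The clamping, rescaling, and quantization steps can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the residual is taken as given (the information-shaping step with the target neural network is omitted), and the symmetric clamp range, the uniform quantizer, and the names `quantize_residual`, `dequantize`, `clamp`, and `levels` are hypothetical.

```python
def quantize_residual(residual, clamp=1.0, levels=16):
    """Clamp each value to [-clamp, clamp], rescale to [0, levels - 1], then round.

    Assumption: a symmetric clamp range and a uniform quantizer; the abstract
    does not specify either.
    """
    step = 2.0 * clamp / (levels - 1)
    indices = []
    for r in residual:
        clamped = max(-clamp, min(clamp, r))   # clamping
        scaled = (clamped + clamp) / step      # rescaling onto the index range
        indices.append(int(round(scaled)))     # uniform quantization
    return indices


def dequantize(indices, clamp=1.0, levels=16):
    """Map integer indices back to values in [-clamp, clamp]."""
    step = 2.0 * clamp / (levels - 1)
    return [i * step - clamp for i in indices]
```

Clamping before rescaling bounds the range the quantizer has to cover, so outliers in the residual saturate at the end levels instead of stretching the step size.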
-
84.
Publication No.: US20210166706A1
Publication Date: 2021-06-03
Application No.: US17105835
Application Date: 2020-11-27
Inventor: Woo-taek LIM , Seung Kwon BEACK , Jongmo SUNG , Mi Suk LEE , Tae Jin LEE
IPC: G10L19/16 , G10L19/038 , G10L25/30 , G06N3/08
Abstract: Disclosed is an apparatus and method for encoding/decoding an audio signal using information of a previous frame. An audio signal encoding method includes: generating a current latent vector by reducing dimension of a current frame of an audio signal; generating a concatenation vector by concatenating a previous latent vector generated by reducing dimension of a previous frame of the audio signal with the current latent vector; and encoding and quantizing the concatenation vector.
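The concatenation of the previous and current latent vectors can be sketched as follows. This only illustrates the data flow: `reduce_dimension` is a hypothetical stand-in (block averaging) for whatever dimension reduction the application uses, the zero vector for the first frame is an assumption, and the final encoding and quantization of the concatenation vector is omitted.

```python
def reduce_dimension(frame, out_dim=2):
    """Hypothetical stand-in for dimension reduction: average equal-sized chunks."""
    chunk = len(frame) // out_dim
    return [sum(frame[i * chunk:(i + 1) * chunk]) / chunk for i in range(out_dim)]


def concatenate_latents(frames, out_dim=2):
    """For each frame, build [previous latent | current latent].

    Assumption: the latent before the first frame is all zeros.
    """
    previous = [0.0] * out_dim
    concatenated = []
    for frame in frames:
        current = reduce_dimension(frame, out_dim)
        concatenated.append(previous + current)  # list concatenation
        previous = current                       # carry over to the next frame
    return concatenated
```

Each concatenation vector has twice the latent dimension, so the stage that follows sees information from the previous frame alongside the current one.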
-
85.
Publication No.: US20210074306A1
Publication Date: 2021-03-11
Application No.: US17017413
Application Date: 2020-09-10
Inventor: Jongmo SUNG , Seung Kwon BEACK , Mi Suk LEE , Tae Jin LEE , Woo-taek LIM , Jin Soo CHOI
IPC: G10L19/032 , H03M7/30 , G06N3/08 , G06N5/04
Abstract: Provided are an audio encoding method, an audio decoding method, an audio encoding apparatus, and an audio decoding apparatus using dynamic model parameters. The audio encoding method using dynamic model parameters may use dynamic model parameters corresponding to each of the levels of the encoding network when reducing the dimension of an audio signal in the encoding network. In addition, the audio decoding method using the dynamic model parameter may use a dynamic model parameter corresponding to each of the levels of the decoding network when extending the dimension of an audio signal in an encoding network.
-
86.
Publication No.: US20200349958A1
Publication Date: 2020-11-05
Application No.: US16925946
Application Date: 2020-07-10
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE , Kwangwoon University Industry-Academic Collaboration Foundation
Inventor: Tae Jin LEE , Seung-Kwon BAEK , Min Je KIM , Dae Young JANG , Jeongil SEO , Kyeongok KANG , Jin-Woo HONG , Hochong PARK , Young-Cheol PARK
IPC: G10L19/008 , G10L19/02 , G10L19/04 , G10L19/20 , G10L19/12
Abstract: Provided is an encoding apparatus for integrally encoding and decoding a speech signal and an audio signal, which may include: an input signal analyzer to analyze a characteristic of an input signal; a stereo encoder to down mix the input signal to a mono signal when the input signal is a stereo signal, and to extract stereo sound image information; a frequency band expander to expand a frequency band of the input signal; a sampling rate converter to convert a sampling rate; a speech signal encoder to encode the input signal using a speech encoding module when the input signal is a speech characteristics signal; an audio signal encoder to encode the input signal using an audio encoding module when the input signal is an audio characteristic signal; and a bitstream generator to generate a bitstream.
-
87.
Publication No.: US20200302949A1
Publication Date: 2020-09-24
Application No.: US16562110
Application Date: 2019-09-05
Inventor: Young Ho JEONG , Sang Won SUH , Tae Jin LEE , Woo-taek LIM , Hui Yong KIM
Abstract: Provided is a sound event recognition method that may improve a sound event recognition performance using a correlation between different sound signal feature parameters based on a neural network, in detail, that may extract a sound signal feature parameter from a sound signal including a sound event, and recognize the sound event included in the sound signal by applying a convolutional neural network (CNN) trained using the sound signal feature parameter.
-
88.
Publication No.: US20200243099A1
Publication Date: 2020-07-30
Application No.: US16846272
Application Date: 2020-04-10
Inventor: Seung Kwon BEACK , Tae Jin LEE , Min Je KIM , Kyeongok KANG , Dae Young JANG , Jin Woo HONG , Jeongil SEO , Chieteuk AHN , Hochong PARK , Young-Cheol PARK
IPC: G10L19/087 , G10L19/26 , G10L19/125 , G10L19/22
Abstract: Disclosed is an LPC residual signal encoding/decoding apparatus of an MDCT based unified voice and audio encoding device. The LPC residual signal encoding apparatus analyzes a property of an input signal, selects an encoding method of an LPC filtered signal, and encodes the LPC residual signal based on one of a real filterbank, a complex filterbank, and an algebraic code excited linear prediction (ACELP).
-
89.
Publication No.: US20200227060A1
Publication Date: 2020-07-16
Application No.: US16835728
Application Date: 2020-03-31
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE , KWANGWOON UNIVERSITY INDUSTRY-ACADEMIC COLLABORATION FOUNDATION
Inventor: Seungkwon BEACK , Tae Jin LEE , Min Je KIM , Kyeongok KANG , Dae Young JANG , Jeongil SEO , Jin Woo HONG , Chieteuk AHN , Ho Chong PARK , Young-cheol PARK
IPC: G10L19/22 , G10L19/022 , G10L19/06 , G10L19/18
Abstract: A Unified Speech and Audio Codec (USAC) that may process a window sequence based on mode switching is provided. The USAC may perform encoding or decoding by overlapping between frames based on a folding point when mode switching occurs. The USAC may process different window sequences for each situation to perform encoding or decoding, and thereby may improve a coding efficiency.
-
90.
Publication No.: US20200176002A1
Publication Date: 2020-06-04
Application No.: US16786817
Application Date: 2020-02-10
Inventor: Seung Kwon BEACK , Tae Jin LEE , Jong Mo SUNG , Jeong Il SEO , Kyeong Ok KANG , Dae Young JANG , Jin Woong KIM
IPC: G10L19/008 , H04S3/00
Abstract: An encoder and an encoding method for a multi-channel signal, and a decoder and a decoding method for a multi-channel signal are disclosed. A multi-channel signal may be efficiently processed by consecutive downmixing or upmixing.
-