-
Publication No.: US11862183B2
Publication date: 2024-01-02
Application No.: US17368390
Filing date: 2021-07-06
Inventors: Jongmo Sung, Seung Kwon Beack, Mi Suk Lee, Tae Jin Lee, Woo-taek Lim, Inseon Jang
IPC classification: G10L19/032
CPC classification: G10L19/032
Abstract: An audio signal encoding and decoding method using a neural network model, a method of training the neural network model, and an encoder and decoder performing the methods are disclosed. The encoding method includes computing first feature information of an input signal using a recurrent encoding model, computing an output signal from the first feature information using a recurrent decoding model, calculating a residual signal by subtracting the output signal from the input signal, computing second feature information of the residual signal using a nonrecurrent encoding model, and converting the first feature information and the second feature information to a bitstream.
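The encoding steps in this abstract can be illustrated with a minimal sketch. It is not the patented method: the neural network models are replaced by toy quantizers, the "bitstream" is simply a pair of integer code lists, and the decoder is an assumption added only to show that the two layers recombine. All function names and step sizes are hypothetical.

```python
from typing import List, Tuple

def recurrent_encode(x: List[float], step: float = 0.5) -> List[int]:
    """Toy stand-in for the recurrent encoding model: each code depends on a running state."""
    state, codes = 0.0, []
    for sample in x:
        code = round((sample - state) / step)
        codes.append(code)
        state += code * step  # state tracks what the decoder will reconstruct
    return codes

def recurrent_decode(codes: List[int], step: float = 0.5) -> List[float]:
    """Toy stand-in for the recurrent decoding model: rebuilds the coarse signal from the codes."""
    state, out = 0.0, []
    for code in codes:
        state += code * step
        out.append(state)
    return out

def nonrecurrent_encode(residual: List[float], step: float = 0.05) -> List[int]:
    """Toy stand-in for the nonrecurrent encoding model: memoryless fine quantization."""
    return [round(r / step) for r in residual]

def nonrecurrent_decode(codes: List[int], step: float = 0.05) -> List[float]:
    return [c * step for c in codes]

def encode(x: List[float]) -> Tuple[List[int], List[int]]:
    first = recurrent_encode(x)                       # first feature information
    output = recurrent_decode(first)                  # output signal
    residual = [a - b for a, b in zip(x, output)]     # input minus output
    second = nonrecurrent_encode(residual)            # second feature information
    return first, second                              # simplified "bitstream"

def decode(bitstream: Tuple[List[int], List[int]]) -> List[float]:
    """Assumed decoder: coarse layer plus decoded residual layer."""
    first, second = bitstream
    coarse = recurrent_decode(first)
    fine = nonrecurrent_decode(second)
    return [c + f for c, f in zip(coarse, fine)]
```

In this sketch the residual layer refines the coarse layer, so the reconstruction error is bounded by half of the finer step size.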
-
Publication No.: US11790926B2
Publication date: 2023-10-17
Application No.: US17156006
Filing date: 2021-01-22
Inventors: Mi Suk Lee, Seung Kwon Beack, Jongmo Sung, Tae Jin Lee, Jin Soo Choi, Minje Kim, Kai Zhen
IPC classification: G10L19/038, G10L19/028, G10L25/18, G10L25/21, G10L25/30
CPC classification: G10L19/038, G10L19/028, G10L25/18, G10L25/21, G10L25/30
Abstract: A method and apparatus for processing an audio signal are disclosed. According to an example embodiment, a method of processing an audio signal may include acquiring a final audio signal for an initial audio signal using a plurality of neural network models that generate output audio signals by encoding and decoding input audio signals, calculating a difference between the initial audio signal and the final audio signal in a time domain, converting the initial audio signal and the final audio signal into Mel-spectra, calculating a difference between the Mel-spectra of the initial audio signal and the final audio signal in a frequency domain, training the plurality of neural network models based on the results calculated in the time domain and the frequency domain, and generating, from the initial audio signal, a new final audio signal distinguished from the final audio signal using the trained neural network models.
-
Publication No.: US11682402B2
Publication date: 2023-06-20
Application No.: US17201943
Filing date: 2021-03-15
Inventors: Yong Ju Lee, Jeong Il Seo, Jae Hyoun Yoo, Seung Kwon Beack, Jong Mo Sung, Tae Jin Lee, Kyeong Ok Kang, Jin Woong Kim, Tae Jin Park, Dae Young Jang, Keun Woo Choi
IPC classification: G10L19/008, H04S7/00
CPC classification: G10L19/008, H04S7/00, H04S7/30, H04S2400/01, H04S2400/03
Abstract: Disclosed is a binaural rendering method and apparatus for decoding a multichannel audio signal. The binaural rendering method may include: extracting an early reflection component and a late reverberation component from a binaural filter; generating a stereo audio signal by performing binaural rendering of a multichannel audio signal based on the early reflection component; and applying the late reverberation component to the generated stereo audio signal.
-
Publication No.: US11664037B2
Publication date: 2023-05-30
Application No.: US17326035
Filing date: 2021-05-20
Inventors: Woo-taek Lim, Seung Kwon Beack, Jongmo Sung, Mi Suk Lee, Tae Jin Lee, Inseon Jang, Minje Kim, Haici Yang
IPC classification: G10L19/032, G10L21/0272
CPC classification: G10L19/032, G10L21/0272
Abstract: Methods of encoding and decoding a speech signal using a neural network model that recognizes sound sources, and encoding and decoding apparatuses for performing the methods are provided. A method of encoding a speech signal includes identifying an input signal for a plurality of sound sources; generating a latent signal by encoding the input signal; obtaining a plurality of sound source signals by separating the latent signal for each of the plurality of sound sources; determining a number of bits used for quantization of each of the plurality of sound source signals according to a type of each of the plurality of sound sources; quantizing each of the plurality of sound source signals based on the determined number of bits; and generating a bitstream by combining the plurality of quantized sound source signals.
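The type-dependent bit allocation described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the source signals are taken as already separated and scaled to [-1, 1], a plain uniform quantizer stands in for whatever quantization the patent uses, and the type names, bit counts, and function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical allocation: more bits for the source type judged more important.
BITS_BY_TYPE: Dict[str, int] = {"speech": 8, "noise": 3}

def quantize(signal: List[float], bits: int) -> List[int]:
    """Uniformly quantize samples in [-1, 1] to integer codes in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    return [round((min(1.0, max(-1.0, s)) + 1.0) / 2.0 * levels) for s in signal]

def dequantize(codes: List[int], bits: int) -> List[float]:
    """Map integer codes back to approximate sample values in [-1, 1]."""
    levels = (1 << bits) - 1
    return [c / levels * 2.0 - 1.0 for c in codes]

def encode_sources(sources: Dict[str, List[float]]) -> List[Tuple[str, int, List[int]]]:
    """Quantize each separated source with its type-specific bit count and combine the results."""
    stream = []
    for kind, signal in sources.items():
        bits = BITS_BY_TYPE[kind]  # number of bits chosen by source type
        stream.append((kind, bits, quantize(signal, bits)))
    return stream  # simplified "bitstream": one entry per source
```

With these assumed settings, a source tagged "speech" is reconstructed with a much smaller quantization error than one tagged "noise", which is the trade-off the per-type allocation is meant to express.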
-
Publication No.: US11508386B2
Publication date: 2022-11-22
Application No.: US16843649
Filing date: 2020-04-08
Applicant: Electronics and Telecommunications Research Institute, Kwangwoon University Industry-Academic Collaboration Foundation
Inventors: Hochong Park, Seung Kwon Beack, Jongmo Sung, Seong-Hyeon Shin, Mi Suk Lee, Tae Jin Lee, Jin Soo Choi
Abstract: The inventive concept relates to an audio coding method to which CNN-based frequency spectrum recovery is applied. The inventive concept transmits a part of the frequency spectral coefficients generated in transform coding to a decoder, and the decoder recovers the frequency spectral coefficients that were not transmitted. Furthermore, the signs of the frequency spectral coefficients are transmitted from an encoder to the decoder according to a sign transmission rule.
-
Publication No.: US11405738B2
Publication date: 2022-08-02
Application No.: US16703226
Filing date: 2019-12-04
Inventors: Yong Ju Lee, Jeong Il Seo, Seung Kwon Beack, Kyeong Ok Kang, Jin Woong Kim, Jae Hyoun Yoo
IPC classification: H04S3/00, G10L19/008
Abstract: Disclosed are an apparatus and a method for processing a multichannel audio signal. A multichannel audio signal processing method may include: generating an N-channel audio signal by down-mixing an M-channel audio signal; and generating a stereo audio signal by performing binaural rendering of the N-channel audio signal.
-
Publication No.: US11310615B2
Publication date: 2022-04-19
Application No.: US16747372
Filing date: 2020-01-20
Inventors: Seung Kwon Beack, Tae Jin Lee, Jong Mo Sung, Kyeong Ok Kang, Jeong Il Seo, Dae Young Jang, Yong Ju Lee, Jin Woong Kim
IPC classification: H04S3/00, G10L19/008
Abstract: An audio encoding apparatus and method that encodes hybrid contents including an object sound, a background sound, and metadata, and an audio decoding apparatus and method that decodes the encoded hybrid contents are provided. The audio encoding apparatus may include a mixing unit to generate an intermediate channel signal by mixing a background sound and an object sound, a matrix information encoding unit to encode matrix information used for the mixing, an audio encoding unit to encode the intermediate channel signal, and a metadata encoding unit to encode metadata including control information of the object sound.
-
Publication No.: US11062718B2
Publication date: 2021-07-13
Application No.: US15714273
Filing date: 2017-09-25
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE, KWANGWOON UNIVERSITY INDUSTRY-ACADEMIC COLLABORATION FOUNDATION
Inventors: Seung Kwon Beack, Tae Jin Lee, Min Je Kim, Dae Young Jang, Kyeongok Kang, Jin Woo Hong, Ho Chong Park, Young-cheol Park
IPC classification: G10L19/022, G10L19/16, G10L19/20, G10L19/02
Abstract: An encoding apparatus and a decoding apparatus in a transform between a Modified Discrete Cosine Transform (MDCT)-based coder and a different coder are provided. The encoding apparatus may encode additional information to restore an input signal encoded according to the MDCT-based coding scheme, when switching occurs between the MDCT-based coder and the different coder. Accordingly, an unnecessary bitstream may be prevented from being generated, and minimum additional information may be encoded.
-
Publication No.: US11056122B2
Publication date: 2021-07-06
Application No.: US16786817
Filing date: 2020-02-10
Inventors: Seung Kwon Beack, Tae Jin Lee, Jong Mo Sung, Jeong Il Seo, Kyeong Ok Kang, Dae Young Jang, Jin Woong Kim
IPC classification: G10L19/008, H04S3/00
Abstract: An encoder and an encoding method for a multi-channel signal, and a decoder and a decoding method for a multi-channel signal are disclosed. A multi-channel signal may be efficiently processed by consecutive downmixing or upmixing.
-
Publication No.: US10511925B2
Publication date: 2019-12-17
Application No.: US16126466
Filing date: 2018-09-10
Inventors: Yong Ju Lee, Jeong Il Seo, Seung Kwon Beack, Kyeong Ok Kang, Jin Woong Kim, Jae Hyoun Yoo
IPC classification: H04S3/00, G10L19/008
Abstract: Disclosed are an apparatus and a method for processing a multichannel audio signal. A multichannel audio signal processing method may include: generating an N-channel audio signal by down-mixing an M-channel audio signal; and generating a stereo audio signal by performing binaural rendering of the N-channel audio signal.
-