-
Publication No.: US11330387B2
Publication date: 2022-05-10
Application No.: US16647458
Filing date: 2019-10-01
Applicant: Electronics and Telecommunications Research Institute, CHUNG ANG UNIVERSITY INDUSTRY ACADEMIC COOPERATION FOUNDATION
Inventors: Dae Young Jang, Jae-hyoun Yoo, Yong Ju Lee, Tae Jin Lee, Sang Wook Kim
Abstract: An audio signal controlling method includes identifying, through an audio zooming effect field included in metadata, whether an audio zooming effect is used for at least one audio object present in virtual reality (VR), and controlling an audio signal corresponding to the audio object based on a preset method when the audio zooming effect is identified as being used.
-
Publication No.: US11882425B2
Publication date: 2024-01-23
Application No.: US17681429
Filing date: 2022-02-25
Inventors: Dae Young Jang, Kyeongok Kang, Jae-hyoun Yoo, Yong Ju Lee, Tae Jin Lee
CPC classification: H04S7/303, H04S3/008, H04S2400/01, H04S2400/03, H04S2400/13
Abstract: A method and apparatus for rendering a volume sound source are disclosed. The method of rendering a volume sound source may include identifying information about a listener and information about the volume sound source, determining a corresponding area in which a source element is disposed in the volume sound source in consideration of the information about the listener, determining an angle between the listener and the corresponding area based on the information about the listener and the information about the volume sound source, determining a number of source elements disposed in the corresponding area according to the angle, determining a position and a gain of the source element using i) the number of source elements and ii) a distance between the listener and the volume sound source, and rendering the volume sound source according to the position and the gain of the source element.
-
Publication No.: US11862183B2
Publication date: 2024-01-02
Application No.: US17368390
Filing date: 2021-07-06
Inventors: Jongmo Sung, Seung Kwon Beack, Mi Suk Lee, Tae Jin Lee, Woo-taek Lim, Inseon Jang
IPC classification: G10L19/032
CPC classification: G10L19/032
Abstract: An audio signal encoding and decoding method using a neural network model, a method of training the neural network model, and an encoder and decoder performing the methods are disclosed. The encoding method includes computing first feature information of an input signal using a recurrent encoding model, computing an output signal from the first feature information using a recurrent decoding model, calculating a residual signal by subtracting the output signal from the input signal, computing second feature information of the residual signal using a nonrecurrent encoding model, and converting the first feature information and the second feature information to a bitstream.
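The two-stage structure described in this abstract (encode, decode, subtract, encode the residual) can be illustrated with a minimal sketch. This is not the patented method: the recurrent and nonrecurrent neural models are replaced here by plain uniform quantizers purely for illustration, and the names `encode`, `decode`, `coarse_step`, and `fine_step` are hypothetical. Bitstream packing is omitted; the two feature lists are simply returned as a tuple.

```python
def quantize(values, step):
    """Map each sample to an integer code (stand-in for an encoding model)."""
    return [round(v / step) for v in values]


def dequantize(codes, step):
    """Map integer codes back to sample values (stand-in for a decoding model)."""
    return [c * step for c in codes]


def encode(signal, coarse_step=0.5, fine_step=0.05):
    # Stage 1: first feature information from the input signal.
    first = quantize(signal, coarse_step)
    # Reconstruct an output signal from the first feature information.
    output = dequantize(first, coarse_step)
    # Residual signal: input minus the stage-1 output.
    residual = [x - y for x, y in zip(signal, output)]
    # Stage 2: second feature information from the residual.
    second = quantize(residual, fine_step)
    return first, second


def decode(first, second, coarse_step=0.5, fine_step=0.05):
    # Sum the two reconstructions to approximate the original input.
    coarse = dequantize(first, coarse_step)
    resid = dequantize(second, fine_step)
    return [a + b for a, b in zip(coarse, resid)]
```

In this sketch the second stage only has to represent what the first stage missed, so the final reconstruction error is bounded by half of `fine_step` rather than half of `coarse_step`.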
-
Publication No.: US11790926B2
Publication date: 2023-10-17
Application No.: US17156006
Filing date: 2021-01-22
Inventors: Mi Suk Lee, Seung Kwon Beack, Jongmo Sung, Tae Jin Lee, Jin Soo Choi, Minje Kim, Kai Zhen
IPC classification: G10L19/038, G10L19/028, G10L25/18, G10L25/21, G10L25/30
CPC classification: G10L19/038, G10L19/028, G10L25/18, G10L25/21, G10L25/30
Abstract: A method and apparatus for processing an audio signal are disclosed. According to an example embodiment, a method of processing an audio signal may include acquiring a final audio signal for an initial audio signal using a plurality of neural network models generating output audio signals by encoding and decoding input audio signals, calculating a difference between the initial audio signal and the final audio signal in a time domain, converting the initial audio signal and the final audio signal into Mel-spectra, calculating a difference between the Mel-spectra of the initial audio signal and the final audio signal in a frequency domain, training the plurality of neural network models based on results calculated in the time domain and the frequency domain, and generating, from the initial audio signal, a new final audio signal distinct from the final audio signal using the trained neural network models.
-
Publication No.: US11705137B2
Publication date: 2023-07-18
Application No.: US16925946
Filing date: 2020-07-10
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE, Kwangwoon University Industry-Academic Collaboration Foundation
Inventors: Tae Jin Lee, Seung-Kwon Baek, Min Je Kim, Dae Young Jang, Jeongil Seo, Kyeongok Kang, Jin-Woo Hong, Hochong Park, Young-Cheol Park
Abstract: Provided is an encoding apparatus for integrally encoding and decoding a speech signal and an audio signal, which may include: an input signal analyzer to analyze a characteristic of an input signal; a stereo encoder to downmix the input signal to a mono signal when the input signal is a stereo signal, and to extract stereo sound image information; a frequency band expander to expand a frequency band of the input signal; a sampling rate converter to convert a sampling rate; a speech signal encoder to encode the input signal using a speech encoding module when the input signal is a speech characteristic signal; an audio signal encoder to encode the input signal using an audio encoding module when the input signal is an audio characteristic signal; and a bitstream generator to generate a bitstream.
-
Publication No.: US11682402B2
Publication date: 2023-06-20
Application No.: US17201943
Filing date: 2021-03-15
Inventors: Yong Ju Lee, Jeong Il Seo, Jae Hyoun Yoo, Seung Kwon Beack, Jong Mo Sung, Tae Jin Lee, Kyeong Ok Kang, Jin Woong Kim, Tae Jin Park, Dae Young Jang, Keun Woo Choi
IPC classification: G10L19/008, H04S7/00
CPC classification: G10L19/008, H04S7/00, H04S7/30, H04S2400/01, H04S2400/03
Abstract: Disclosed is a binaural rendering method and apparatus for decoding a multichannel audio signal. The binaural rendering method may include: extracting an early reflection component and a late reverberation component from a binaural filter; generating a stereo audio signal by performing binaural rendering of a multichannel audio signal based on the early reflection component; and applying the late reverberation component to the generated stereo audio signal.
-
Publication No.: US11664037B2
Publication date: 2023-05-30
Application No.: US17326035
Filing date: 2021-05-20
Inventors: Woo-taek Lim, Seung Kwon Beack, Jongmo Sung, Mi Suk Lee, Tae Jin Lee, Inseon Jang, Minje Kim, Haici Yang
IPC classification: G10L19/032, G10L21/0272
CPC classification: G10L19/032, G10L21/0272
Abstract: Methods of encoding and decoding a speech signal using a neural network model that recognizes sound sources, and encoding and decoding apparatuses for performing the methods are provided. A method of encoding a speech signal includes identifying an input signal for a plurality of sound sources; generating a latent signal by encoding the input signal; obtaining a plurality of sound source signals by separating the latent signal for each of the plurality of sound sources; determining a number of bits used for quantization of each of the plurality of sound source signals according to a type of each of the plurality of sound sources; quantizing each of the plurality of sound source signals based on the determined number of bits; and generating a bitstream by combining the plurality of quantized sound source signals.
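The type-dependent bit allocation step in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the neural encoding and source separation stages are skipped, the input is assumed to be already separated per source with samples in the range [-1, 1], and the table `BITS_BY_TYPE`, its values, and the function names are all hypothetical.

```python
# Hypothetical allocation: more bits for the source type that matters most.
BITS_BY_TYPE = {"speech": 8, "noise": 3}


def quantize(samples, bits):
    """Uniformly quantize samples in [-1, 1] to integer codes of the given bit width."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)
    # Clamp so that out-of-range or rounded-up values stay within the code range.
    return [min(levels - 1, max(0, round((x + 1.0) / step))) for x in samples]


def encode_sources(sources):
    """sources: list of (source_type, samples) pairs, one per separated source."""
    bitstream = []
    for kind, samples in sources:
        bits = BITS_BY_TYPE[kind]          # bit count chosen by source type
        codes = quantize(samples, bits)    # quantize with that bit count
        bitstream.append((kind, bits, codes))
    return bitstream                       # combined representation of all sources
```

Here the "bitstream" is just a list of tuples; a real codec would pack the codes into bits and signal the allocation to the decoder.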
-
Publication No.: US11508386B2
Publication date: 2022-11-22
Application No.: US16843649
Filing date: 2020-04-08
Applicant: Electronics and Telecommunications Research Institute, Kwangwoon University Industry-Academic Collaboration Foundation
Inventors: Hochong Park, Seung Kwon Beack, Jongmo Sung, Seong-Hyeon Shin, Mi Suk Lee, Tae Jin Lee, Jin Soo Choi
Abstract: The inventive concept relates to an audio coding method to which CNN-based frequency spectrum recovery is applied. The encoder transmits only a part of the frequency spectral coefficients generated in transform coding to a decoder, and the decoder recovers the frequency spectral coefficients that were not transmitted. Furthermore, the signs of the frequency spectral coefficients are transmitted from the encoder to the decoder according to a sign transmission rule.
-
Publication No.: US11310615B2
Publication date: 2022-04-19
Application No.: US16747372
Filing date: 2020-01-20
Inventors: Seung Kwon Beack, Tae Jin Lee, Jong Mo Sung, Kyeong Ok Kang, Jeong Il Seo, Dae Young Jang, Yong Ju Lee, Jin Woong Kim
IPC classification: H04S3/00, G10L19/008
Abstract: An audio encoding apparatus and method that encodes hybrid contents including an object sound, a background sound, and metadata, and an audio decoding apparatus and method that decodes the encoded hybrid contents are provided. The audio encoding apparatus may include a mixing unit to generate an intermediate channel signal by mixing a background sound and an object sound, a matrix information encoding unit to encode matrix information used for the mixing, an audio encoding unit to encode the intermediate channel signal, and a metadata encoding unit to encode metadata including control information of the object sound.
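The mixing unit in this abstract can be pictured as a matrix mix of object sounds into background channels. The sketch below assumes one plausible form, where each intermediate channel is the background channel plus a gain-weighted sum of the objects; the abstract does not specify the mixing rule, so this form, the function name `mix`, and the `matrix[c][o]` layout are assumptions for illustration only.

```python
def mix(background, objects, matrix):
    """Mix object signals into background channels.

    background: list of channels, each a list of samples
    objects:    list of object signals, each a list of samples of the same length
    matrix:     matrix[c][o] is the gain of object o in channel c (hypothetical layout)
    """
    mixed = []
    for c, channel in enumerate(background):
        out = list(channel)  # start from the background channel
        for o, obj in enumerate(objects):
            gain = matrix[c][o]
            for n, sample in enumerate(obj):
                out[n] += gain * sample  # add the weighted object sound
        mixed.append(out)
    return mixed
```

Because the gains are what the decoder needs in order to separate or re-control the objects, it makes sense that the abstract lists the matrix information as a separately encoded item alongside the intermediate channel signal.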
-
Publication No.: US11205442B2
Publication date: 2021-12-21
Application No.: US16562110
Filing date: 2019-09-05
Inventors: Young Ho Jeong, Sang Won Suh, Tae Jin Lee, Woo-taek Lim, Hui Yong Kim
Abstract: Provided is a sound event recognition method that may improve sound event recognition performance using a correlation between different sound signal feature parameters based on a neural network. Specifically, the method may extract a sound signal feature parameter from a sound signal including a sound event, and recognize the sound event included in the sound signal by applying a convolutional neural network (CNN) trained using the sound signal feature parameter.