-
Publication No.: US20230224661A1
Publication date: 2023-07-13
Application No.: US17713059
Filing date: 2022-04-04
Inventors: Yong Ju LEE, Jae-hyoun YOO, Dae Young JANG, Kyeongok KANG, Tae Jin LEE
CPC classification: H04S7/302, H04S5/005, H04S2400/11, H04S2420/01
Abstract: A method and apparatus for rendering an object-based audio signal considering an obstacle are disclosed. A method for rendering an object-based audio signal according to an example embodiment includes identifying an object-based input signal and metadata for the input signal, generating a binaural filter based on the metadata using a binaural room impulse response (BRIR), determining, based on the metadata, whether an obstacle is present between a listener and an object, modifying the generated binaural filter when it is determined that the obstacle is present, and generating a rendered output signal by convolving the modified binaural filter and the input signal.
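The rendering flow in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say how the binaural filter is modified, so a single broadband gain (`occlusion_gain`, a hypothetical parameter) stands in for that step, and the obstacle decision is passed in as a boolean rather than derived from metadata.

```python
def convolve(signal, kernel):
    """Full linear convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def render(input_signal, brir_left, brir_right, obstacle_present,
           occlusion_gain=0.5):
    """Render a mono object signal to [left, right] ear signals.

    occlusion_gain is a hypothetical stand-in for the filter modification;
    the abstract does not specify what the modification is.
    """
    filters = [list(brir_left), list(brir_right)]
    if obstacle_present:
        # Assumed modification: attenuate every filter tap by one gain.
        filters = [[occlusion_gain * tap for tap in f] for f in filters]
    return [convolve(input_signal, f) for f in filters]
```

Calling `render(signal, left, right, True)` returns the same two-channel output as the unobstructed case, scaled by the assumed gain.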
-
Publication No.: US20230112342A1
Publication date: 2023-04-13
Application No.: US17582209
Filing date: 2022-01-24
Inventors: Yong Ju LEE, Jae-hyoun YOO, Dae Young JANG, Kyeongok KANG, Tae Jin LEE
Abstract: An apparatus and method for pitch-shifting an audio signal with low complexity are disclosed. The method includes identifying a distance between an audio object included in the audio signal and a listener, checking whether the distance between the audio object and the listener decreases, and performing stepwise stretching pitch-shifting by repeatedly using at least one of the frequency components of the audio signal when the distance between the audio object and the listener decreases.
-
Publication No.: US20220358940A1
Publication date: 2022-11-10
Application No.: US17527351
Filing date: 2021-11-16
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE, Gwangju Institute of Science and Technology
Inventors: Woo-taek LIM, Seung Kwon BEACK, Jongmo SUNG, Tae Jin LEE, Inseon JANG, Jong Won SHIN, Soojoong HWANG, Youngju CHEON, Sangwook HAN
Abstract: Disclosed are methods of encoding and decoding an audio signal using side information, and an encoder and a decoder for performing the methods. The method of encoding an audio signal using side information includes identifying an input signal, the input signal being an original audio signal, extracting side information from the input signal using a learning model trained to extract side information from a feature vector of the input signal, encoding the input signal, and generating a bitstream by combining the encoded input signal and the side information.
-
Publication No.: US20220216881A1
Publication date: 2022-07-07
Application No.: US17484284
Filing date: 2021-09-24
Inventors: Young Ho JEONG, Soo Young PARK, Tae Jin LEE
Abstract: Disclosed are a training method for a learning model for recognizing an acoustic signal, a method of recognizing an acoustic signal using the learning model, and devices for performing the methods. The method of recognizing an acoustic signal using a learning model includes identifying an acoustic signal including an acoustic event or acoustic scene, determining an acoustic feature of the acoustic signal, dividing the determined acoustic feature for each of a plurality of frequency band intervals, and determining the acoustic event or acoustic scene included in the acoustic signal by inputting the divided acoustic features to a trained learning model.
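The band-division step in this abstract might look like the sketch below. It assumes the acoustic feature is a time-frequency matrix (a list of frames, each a list of frequency bins) and that the bands are contiguous and near-equal in width; neither detail is stated in the abstract, and `split_bands` is a hypothetical name.

```python
def split_bands(feature, num_bands):
    """Split a frames-by-bins feature into contiguous frequency bands.

    Returns a list of num_bands sub-features, each with the same number
    of frames as the input. Equal-width bands are an assumption.
    """
    num_bins = len(feature[0])
    # Integer band edges; any remainder bins fall into the later bands.
    edges = [b * num_bins // num_bands for b in range(num_bands + 1)]
    return [
        [frame[edges[b]:edges[b + 1]] for frame in feature]
        for b in range(num_bands)
    ]
```

Each returned sub-feature could then be fed to the trained model as a separate input.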
-
75.
Publication No.: US20210398547A1
Publication date: 2021-12-23
Application No.: US17331416
Filing date: 2021-05-26
Inventors: Seung Kwon BEACK, Jongmo SUNG, Mi Suk LEE, Tae Jin LEE, Woo-taek LIM, Inseon JANG
IPC classification: G10L19/035, G10L19/022, G10L19/06, G10L19/16
Abstract: An audio signal encoding method performed by an encoder includes identifying an audio signal of a time domain in units of a block, generating a combined block by combining i) a current original block of the audio signal and ii) a previous original block chronologically adjacent to the current original block, extracting a first residual signal of a frequency domain from the combined block using linear predictive coding of a time domain, overlapping chronologically adjacent first residual signals among first residual signals converted into a time domain, and quantizing a second residual signal of a time domain extracted from the overlapped first residual signal by converting the second residual signal of the time domain into a frequency domain using linear predictive coding of a frequency domain.
-
76.
Publication No.: US20210390967A1
Publication date: 2021-12-16
Application No.: US17242828
Filing date: 2021-04-28
Inventors: Seung Kwon BEACK, Jongmo SUNG, Mi Suk LEE, Tae Jin LEE, Woo-taek LIM, Inseon JANG, Jin Soo CHOI
IPC classification: G10L19/032, G10L19/08
Abstract: Disclosed is a method of encoding and decoding an audio signal using linear predictive coding (LPC) and an encoder and a decoder that perform the method. The method of encoding an audio signal to be performed by the encoder includes identifying a time-domain audio signal block-wise, quantizing a linear prediction coefficient obtained from a block of the audio signal through the LPC, generating an envelope based on the quantized linear prediction coefficient, extracting a residual signal based on the envelope and a result of converting the block into a frequency domain, grouping the residual signal by each sub-band and determining a scale factor for quantizing the grouped residual signal, quantizing the residual signal using the scale factor, and converting the quantized residual signal and the quantized linear prediction coefficient into a bitstream and transmitting the bitstream to a decoder.
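The sub-band grouping and scale-factor quantization named in this abstract can be illustrated with a small sketch. It assumes fixed-width sub-bands and a peak-based scale factor, neither of which the abstract specifies; `band_size` and `max_level` are hypothetical parameters, and the LPC envelope step is left out.

```python
def quantize_subbands(residual, band_size, max_level=7):
    """Group a residual into sub-bands and quantize each with its own scale.

    Returns (scale_factors, quantized), where quantized[b] holds integers
    in [-max_level, max_level] for sub-band b. The peak-based scale factor
    is an assumption made for illustration.
    """
    scale_factors, quantized = [], []
    for start in range(0, len(residual), band_size):
        band = residual[start:start + band_size]
        peak = max(abs(x) for x in band)
        # Avoid dividing by zero for an all-zero band.
        sf = peak / max_level if peak > 0 else 1.0
        scale_factors.append(sf)
        quantized.append([round(x / sf) for x in band])
    return scale_factors, quantized
```

A decoder would recover an approximation of each sub-band by multiplying the integers by the matching scale factor.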
-
77.
Publication No.: US20210350796A1
Publication date: 2021-11-11
Application No.: US17308800
Filing date: 2021-05-05
Inventors: Minje KIM, Mi Suk LEE, Seung Kwon BEACK, Jongmo SUNG, Tae Jin LEE, Jin Soo CHOI, Kai ZHEN
Abstract: Disclosed is a speech processing apparatus and method using a densely connected hybrid neural network. The speech processing method includes inputting a time domain sample of N*1 dimension for an input speech into a densely connected hybrid network; passing the time domain sample through a plurality of dense blocks in the densely connected hybrid network; reshaping the time domain samples into M subframes by passing the time domain samples through the plurality of dense blocks; inputting the M subframes into gated recurrent unit (GRU) components of N/M-dimension; and outputting clean speech, in which noise is removed from the input speech, by passing the M subframes through the GRU components.
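The reshaping of N time-domain samples into M subframes of N/M samples each can be shown in isolation. The sketch below covers only that reshape and assumes N is divisible by M; the dense blocks and GRU components are not modeled, and `to_subframes` is a hypothetical name.

```python
def to_subframes(samples, m):
    """Reshape a length-N sequence into M consecutive subframes of N/M samples."""
    n = len(samples)
    if m <= 0 or n % m != 0:
        raise ValueError("N must be a positive multiple of M")
    size = n // m
    return [samples[i * size:(i + 1) * size] for i in range(m)]
```

Each subframe would then serve as one time step of the recurrent stage, so the recurrent input dimension is N/M.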
-
78.
Publication No.: US20210174815A1
Publication date: 2021-06-10
Application No.: US17112480
Filing date: 2020-12-04
Inventors: Seung Kwon BEACK, Jooyoung LEE, Jongmo SUNG, Mi Suk LEE, Tae Jin LEE, Woo-taek LIM, Seunghyun CHO, Jin Soo CHOI
IPC classification: G10L19/038, G10L25/30, G10L19/028, G10L19/24, G06N3/02
Abstract: Disclosed are a quantizing method for a latent vector and a computing device for performing the quantization method. A quantizing method of a latent vector includes performing information shaping on the latent vector resulting from reduction in a dimension of an input signal using a target neural network; clamping a residual signal of the latent vector derived based on the information shaping; performing rescaling on the clamped residual signal; and performing quantization on the rescaled residual signal.
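The clamp, rescale, and quantize sequence in this abstract can be sketched as below. This is a minimal illustration with assumed details: a symmetric clamp range, a uniform quantizer, and hypothetical parameters `clamp_limit` and `levels`. The information-shaping step that produces the residual is not modeled.

```python
def quantize_residual(residual, clamp_limit=1.0, levels=8):
    """Clamp each value to [-clamp_limit, clamp_limit], rescale to
    [0, levels - 1], and round to the nearest integer index."""
    scale = (levels - 1) / (2.0 * clamp_limit)
    indices = []
    for r in residual:
        clamped = max(-clamp_limit, min(clamp_limit, r))
        rescaled = (clamped + clamp_limit) * scale
        indices.append(int(rescaled + 0.5))  # round half up; value is non-negative
    return indices


def dequantize(indices, clamp_limit=1.0, levels=8):
    """Map integer indices back to values in [-clamp_limit, clamp_limit]."""
    scale = (levels - 1) / (2.0 * clamp_limit)
    return [i / scale - clamp_limit for i in indices]
```

Clamping before rescaling bounds the index range, so outliers cost no extra quantization levels at the price of saturating them.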
-
79.
Publication No.: US20210166706A1
Publication date: 2021-06-03
Application No.: US17105835
Filing date: 2020-11-27
Inventors: Woo-taek LIM, Seung Kwon BEACK, Jongmo SUNG, Mi Suk LEE, Tae Jin LEE
IPC classification: G10L19/16, G10L19/038, G10L25/30, G06N3/08
Abstract: Disclosed is an apparatus and method for encoding/decoding an audio signal using information of a previous frame. An audio signal encoding method includes: generating a current latent vector by reducing dimension of a current frame of an audio signal; generating a concatenation vector by concatenating a previous latent vector generated by reducing dimension of a previous frame of the audio signal with the current latent vector; and encoding and quantizing the concatenation vector.
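The concatenation of the previous and current latent vectors can be sketched as follows. The dimension reduction here is a placeholder (`pair_mean`, averaging adjacent samples) standing in for whatever network the patent uses, and the zero vector used as history for the first frame is an assumption; the abstract covers neither point.

```python
def pair_mean(frame):
    """Placeholder dimension reduction: average adjacent sample pairs."""
    return [(frame[i] + frame[i + 1]) / 2.0 for i in range(0, len(frame) - 1, 2)]


def concat_latents(frames, reduce_dim=pair_mean):
    """Return, per frame, the previous latent vector followed by the current one."""
    previous = None
    out = []
    for frame in frames:
        current = reduce_dim(frame)
        if previous is None:
            # Assumption: the first frame has no history, so use zeros.
            previous = [0.0] * len(current)
        out.append(previous + current)
        previous = current
    return out
```

Each concatenation vector is twice the latent size and would be passed on to the encoding and quantization stage.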
-
Publication No.: US20210074306A1
Publication date: 2021-03-11
Application No.: US17017413
Filing date: 2020-09-10
Inventors: Jongmo SUNG, Seung Kwon BEACK, Mi Suk LEE, Tae Jin LEE, Woo-taek LIM, Jin Soo CHOI
IPC classification: G10L19/032, H03M7/30, G06N3/08, G06N5/04
Abstract: Provided are an audio encoding method, an audio decoding method, an audio encoding apparatus, and an audio decoding apparatus using dynamic model parameters. The audio encoding method using dynamic model parameters may use dynamic model parameters corresponding to each of the levels of the encoding network when reducing the dimension of an audio signal in the encoding network. In addition, the audio decoding method using the dynamic model parameter may use a dynamic model parameter corresponding to each of the levels of the decoding network when extending the dimension of an audio signal in an encoding network.
-