-
1.
Publication No.: US20230335145A1
Publication date: 2023-10-19
Application No.: US18118604
Filing date: 2023-03-07
Applicant: Electronics and Telecommunications Research Institute, Kyungpook National University Industry-Academic Cooperation Foundation
Inventors: Woo-taek LIM, Seung Kwon BEACK, Jongmo SUNG, Tae Jin LEE, Inseon JANG, Min Han KIM, Seung Hyeon SHIN, Dae Ho LEE, Seok Jin LEE
IPC classification: G10L19/06, G10L19/032
CPC classification: G10L19/06, G10L19/032
Abstract: A signal compression method and apparatus and a signal restoration method and apparatus are provided. The signal compression method includes generating an input signal by processing an incoming audio signal with an auditory perception model based on human auditory perception characteristics, extracting a feature vector from the input signal using a feature extraction module, and outputting a code obtained by compressing the feature vector using a trained signal compression model.
-
2.
Publication No.: US20240357306A1
Publication date: 2024-10-24
Application No.: US18426984
Filing date: 2024-01-30
Inventors: Young Ho JEONG, Kyeongok KANG, Soo Young PARK, Jae-hyoun YOO, Yong Ju LEE, Tae Jin LEE, Dae Young JANG
IPC classification: H04S7/00
CPC classification: H04S7/303, H04S2400/11
Abstract: A bitstream reconstruction method and apparatus are provided. The method includes constructing an initial bitstream by rendering, into spatial audio, sound source information and geometry information within a reference radius of the initial location of a user accessing a virtual space; collecting location information as the user moves within the virtual space; and reconstructing the initial bitstream constructed for the initial location of the user, based on the relationship between the reference radius and a movement radius identified from the collected location information.
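Illustrative sketch (not taken from the patent): the comparison between the reference radius and the movement radius could look roughly like this. The abstract does not say how the movement radius is computed or when reconstruction is triggered, so `movement_radius`, `needs_reconstruction`, and the `margin` parameter are all hypothetical.

```python
import math

def movement_radius(initial, positions):
    """Largest distance from the initial location among the collected positions."""
    return max((math.dist(initial, p) for p in positions), default=0.0)

def needs_reconstruction(initial, positions, reference_radius, margin=0.8):
    """Assumed rule: rebuild once the user has covered `margin` of the reference radius."""
    return movement_radius(initial, positions) >= margin * reference_radius

# Usage: a user who started at the origin and has wandered up to 5 units away.
start = (0.0, 0.0, 0.0)
track = [(1.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
print(needs_reconstruction(start, track, reference_radius=10.0))
print(needs_reconstruction(start, track, reference_radius=6.0))
```

With a 10-unit reference radius the user is still well inside the rendered region; with a 6-unit radius the assumed margin is exceeded and the bitstream would be rebuilt.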
-
3.
Publication No.: US20240153513A1
Publication date: 2024-05-09
Application No.: US18502648
Filing date: 2023-11-06
Inventors: Byeong Ho CHO, Seung Kwon BEACK, Jong Mo SUNG, Tae Jin LEE, Woo Taek LIM, In Seon JANG
IPC classification: G10L19/035
CPC classification: G10L19/035
Abstract: A complex number quantization-based audio signal encoding method may comprise: estimating a scale factor for each subband of an input audio signal; performing complex magnitude scaling for each subband based on the scale factor; and performing polar quantization on a complex frequency coefficient for each subband, wherein performing the polar quantization for each subband comprises applying two or more different magnitude quantization techniques based on the magnitude of the complex frequency coefficient scaled for each subband.
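Illustrative sketch (not taken from the patent): the three steps in this abstract can be mimicked with a toy polar quantizer. The abstract does not name the magnitude quantization techniques, so the choice here (uniform steps for small magnitudes, power-of-two logarithmic steps for large ones), the peak-based scale factor, and every function name and parameter are assumptions.

```python
import cmath
import math

def scale_factor(subband):
    """Assumed scale factor: the subband's peak magnitude (1.0 if silent)."""
    peak = max((abs(c) for c in subband), default=0.0)
    return peak if peak > 0.0 else 1.0

def quantize_magnitude(mag, threshold=0.25, fine_step=0.05):
    """Uniform steps below `threshold`, logarithmic (power-of-two) steps above it."""
    if mag < threshold:
        return round(mag / fine_step) * fine_step
    return threshold * 2.0 ** round(math.log2(mag / threshold))

def quantize_phase(phase, levels=16):
    """Uniform phase quantization onto `levels` points around the circle."""
    step = 2.0 * math.pi / levels
    return round(phase / step) * step

def code_subband(subband):
    """Scale, polar-quantize, and rescale one subband of complex coefficients."""
    sf = scale_factor(subband)
    out = []
    for c in subband:
        mag, phase = cmath.polar(c / sf)
        out.append(sf * cmath.rect(quantize_magnitude(mag), quantize_phase(phase)))
    return out

# Usage: quantize a three-coefficient subband and inspect the reconstruction.
print(code_subband([1 + 0j, 0.1j, -0.5 + 0j]))
```

Switching quantizers on the scaled magnitude keeps fine resolution near zero while spending fewer levels on large coefficients; a real codec would also entropy-code the indices, which this sketch omits.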
-
4.
Publication No.: US20240136993A1
Publication date: 2024-04-25
Application No.: US18480259
Filing date: 2023-10-02
Inventors: Yong Ju LEE, Jae-hyoun YOO, Dae Young JANG, Soo Young PARK, Young Ho JEONG, Kyeongok KANG, Tae Jin LEE
IPC classification: H03G7/00
CPC classification: H03G7/007
Abstract: A rendering method of an object-based audio signal and an apparatus for performing the same are provided. The rendering method of an object-based audio signal includes obtaining a rendered audio signal, performing clipping prevention on the rendered audio signal using a first limiter, mixing a signal output by the first limiter using a mixer, and performing clipping prevention on the mixed signal using a second limiter.
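Illustrative sketch (not taken from the patent): the limiter, mixer, limiter chain can be shown with a deliberately simple block limiter. The abstract does not describe the limiter design or what the mixer combines, so the peak-normalizing `limit`, the summing `mix`, and the idea that several rendered signals are mixed together are assumptions.

```python
def limit(samples, ceiling=1.0):
    """Assumed limiter: scale the whole block down if its peak exceeds the ceiling."""
    peak = max((abs(s) for s in samples), default=0.0)
    gain = ceiling / peak if peak > ceiling else 1.0
    return [s * gain for s in samples]

def mix(signals):
    """Assumed mixer: sample-wise sum of equally long signals."""
    return [sum(frame) for frame in zip(*signals)]

def render_chain(rendered_signals, ceiling=1.0):
    """First limiter per rendered signal, then mixing, then a second limiter."""
    limited = [limit(s, ceiling) for s in rendered_signals]
    return limit(mix(limited), ceiling)

# Usage: one signal clips on its own, and the mix would clip again without stage two.
print(render_chain([[0.5, 2.0], [0.5, 0.5]]))
```

The second stage matters because signals that are individually within range can still exceed the ceiling once summed.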
-
5.
Publication No.: US20240129682A1
Publication date: 2024-04-18
Application No.: US18484117
Filing date: 2023-10-10
Inventors: Yong Ju LEE, Jae-hyoun YOO, Dae Young JANG, Soo Young PARK, Young Ho JEONG, Kyeongok KANG, Tae Jin LEE
IPC classification: H04S7/00
CPC classification: H04S7/30, H04S2400/11
Abstract: A method of rendering object-based audio and an electronic device performing the method are provided. The method includes identifying a bitstream, determining a reference distance of an object sound source based on the bitstream, determining a minimum distance for applying distance-dependent attenuation, based on the reference distance, and determining a gain of object-based audio included in the bitstream based on the reference distance and the minimum distance.
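Illustrative sketch (not taken from the patent): one common way to combine a reference distance and a minimum distance is an inverse-distance gain that is clamped close to the source. The abstract gives neither the attenuation law nor the rule that derives the minimum distance, so the 1/d law, the `ratio` parameter, and both function names are assumptions.

```python
def min_distance(reference_distance, ratio=0.2):
    """Assumed rule: the minimum distance is a fixed fraction of the reference distance."""
    return ratio * reference_distance

def distance_gain(distance, reference_distance, ratio=0.2):
    """Inverse-distance gain, equal to 1.0 at the reference distance and capped near the source."""
    clamped = max(distance, min_distance(reference_distance, ratio))
    return reference_distance / clamped

# Usage: gains at the reference distance, farther away, and very close to the source.
for d in (1.0, 2.0, 0.01):
    print(d, distance_gain(d, reference_distance=1.0))
```

Clamping the distance keeps the gain finite as the listener approaches the source, which is the usual motivation for a minimum distance.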
-
6.
Publication No.: US20230274141A1
Publication date: 2023-08-31
Application No.: US18166407
Filing date: 2023-02-08
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE, YONSEI UNIVERSITY WONJU INDUSTRY-ACADEMIC COOPERATION FOUNDATION
Inventors: Jongmo SUNG, Seung Kwon BEACK, Tae Jin LEE, Woo-taek LIM, Inseon JANG, Byeongho CHO, Young Cheol PARK, Joon BYUN, Seungmin SHIN
IPC classification: G06N3/08, G10L19/038, G10L25/30, G10L19/028, G10L25/69, G10L25/60
CPC classification: G06N3/08, G10L19/038, G10L25/30, G10L19/028, G10L25/69, G10L25/60
Abstract: Provided are a method and an apparatus for designing and testing an audio codec using quantization based on white noise modeling. A neural network-based audio encoder design method includes generating a quantized latent vector and a reconstructed signal corresponding to an input signal by using a white noise modeling-based quantization process, computing a total loss for training a neural network-based audio codec based on the input signal, the reconstructed signal, and the quantized latent vector, training the neural network-based audio codec by using the total loss, and validating the trained neural network-based audio codec to select the best neural network-based audio codec.
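Illustrative sketch (not taken from the patent): a widely used way to model quantization as white noise is to add uniform noise to the latent vector during training and to round at inference time. Whether the patent does exactly this is not stated in the abstract, and the loss below (mean squared error plus an L1 penalty on the latent as a crude rate proxy) is a stand-in; all names and weights are hypothetical.

```python
import random

def quantize_train(latent, step=1.0, rng=random):
    """Training-time surrogate: additive uniform noise in [-step/2, step/2]."""
    return [z + rng.uniform(-step / 2.0, step / 2.0) for z in latent]

def quantize_infer(latent, step=1.0):
    """Inference-time quantization: round to the nearest multiple of `step`."""
    return [round(z / step) * step for z in latent]

def total_loss(x, x_hat, latent_q, rate_weight=0.01):
    """Assumed total loss: reconstruction MSE plus a weighted L1 rate proxy."""
    mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    rate = sum(abs(z) for z in latent_q) / len(latent_q)
    return mse + rate_weight * rate

# Usage: the noisy latent stays within half a step of the original values.
rng = random.Random(0)
latent = [0.2, 1.7, -0.6]
print(quantize_train(latent, rng=rng))
print(quantize_infer(latent))
```

The noise surrogate keeps the training objective differentiable with respect to the latent, which hard rounding does not.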
-
7.
Publication No.: US20230177331A1
Publication date: 2023-06-08
Application No.: US18060405
Filing date: 2022-11-30
Inventors: Young Ho JEONG, Soo Young PARK, Tae Jin LEE
IPC classification: G06N3/08
CPC classification: G06N3/08
Abstract: Disclosed are methods of training a deep learning model and predicting a class, and an electronic device for performing the methods. A method of training a deep learning model may include identifying training data labeled for each class, determining whether to augment the training data based on overall recognition performance indicating prediction accuracy of the deep learning model calculated in a previous epoch, augmenting the training data based on class-specific recognition performance indicating class-specific prediction accuracy of the deep learning model calculated in the previous epoch, predicting a class by inputting the training data or the augmented training data into the deep learning model according to the determination of whether to augment the training data, and training the deep learning model based on a labeled class and the predicted class.
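Illustrative sketch (not taken from the patent): the two-level decision in this abstract (an overall gate, then class-specific augmentation) could be organized as below. The abstract does not state the direction of either rule, so augmenting when overall accuracy is below a threshold, giving weaker classes more augmented copies, and all names and parameters are assumptions.

```python
def should_augment(overall_accuracy, threshold=0.9):
    """Assumed gate: augment only while last epoch's overall accuracy is below `threshold`."""
    return overall_accuracy < threshold

def augmentation_counts(class_accuracy, max_extra=3):
    """Assumed rule: the lower a class's accuracy, the more augmented copies it gets."""
    return {c: round((1.0 - acc) * max_extra) for c, acc in class_accuracy.items()}

def build_epoch_data(samples, overall_accuracy, class_accuracy, augment):
    """Return this epoch's (input, label) pairs, augmented if the gate allows it."""
    data = list(samples)
    if not should_augment(overall_accuracy):
        return data
    counts = augmentation_counts(class_accuracy)
    for x, label in samples:
        data.extend((augment(x), label) for _ in range(counts[label]))
    return data

# Usage: class "b" was never recognized last epoch, so it receives three extra copies.
samples = [(1.0, "a"), (2.0, "b")]
epoch = build_epoch_data(samples, 0.5, {"a": 1.0, "b": 0.0}, augment=lambda x: x * 1.1)
print(len(epoch))
```

Here `augment` is a placeholder callable; in practice it would be a domain-specific transform such as noise injection or time stretching.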
-
8.
Publication No.: US20230048402A1
Publication date: 2023-02-16
Application No.: US17884364
Filing date: 2022-08-09
Inventors: Jongmo SUNG, Seung Kwon BEACK, Tae Jin LEE, Woo-taek KIM, Inseon JANG
Abstract: Provided are an encoding method according to various example embodiments and an encoder performing the method. The encoding method includes outputting a linear prediction (LP) coefficient bitstream and a residual signal by performing a linear prediction analysis on an input signal, outputting a first latent signal obtained by encoding a periodic component of the residual signal using a first neural network module, outputting a first bitstream obtained by quantizing the first latent signal using a quantization module, outputting a second latent signal obtained by encoding an aperiodic component of the residual signal using the first neural network module, and outputting a second bitstream obtained by quantizing the second latent signal using the quantization module, wherein the aperiodic component of the residual signal is calculated based on a periodic component of the residual signal decoded from the quantized first latent signal output by de-quantizing the first bitstream.
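Illustrative sketch (not taken from the patent): only the first step of this abstract, linear prediction analysis producing coefficients and a residual, is shown here, using the textbook autocorrelation method with the Levinson-Durbin recursion. The neural network modules, the quantizer, and the periodic/aperiodic split are not modeled, and the function names are hypothetical.

```python
def autocorr(x, order):
    """Autocorrelation values r[0..order] of the signal x."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x))) for k in range(order + 1)]

def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] with x[n] ~ sum_k a[k] * x[n-k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate (e.g. silent) input: stop early
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]

def lp_residual(x, coeffs):
    """Analysis filter: residual e[n] = x[n] - prediction from past samples."""
    return [x[n] - sum(c * x[n - k - 1] for k, c in enumerate(coeffs) if n - k - 1 >= 0)
            for n in range(len(x))]

def lp_synthesize(residual, coeffs):
    """Synthesis filter: rebuild the signal from the residual and the same coefficients."""
    y = []
    for n, e in enumerate(residual):
        y.append(e + sum(c * y[n - k - 1] for k, c in enumerate(coeffs) if n - k - 1 >= 0))
    return y

# Usage: analyze a short frame, then confirm that synthesis inverts the analysis.
frame = [1.0, 0.9, 0.7, 0.4, 0.1, -0.2, -0.4, -0.5]
coeffs = levinson_durbin(autocorr(frame, 2), 2)
residual = lp_residual(frame, coeffs)
print(max(abs(a - b) for a, b in zip(frame, lp_synthesize(residual, coeffs))))
```

In the encoder described by the abstract, this residual is what the neural network modules would go on to encode.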
-
9.
Publication No.: US20220262378A1
Publication date: 2022-08-18
Application No.: US17672041
Filing date: 2022-02-15
Inventors: Woo-taek LIM, Seung Kwon BEACK, Jongmo SUNG, Tae Jin LEE, Inseon JANG
Abstract: An audio signal encoding and decoding method using a learning model, a training method of the learning model, and an encoder and decoder that perform the method are disclosed. The audio signal decoding method may include extracting a first residual signal and a first linear prediction coefficient by decoding a bitstream received from an encoder, generating a first audio signal from the first residual signal using the first linear prediction coefficient, generating a second linear prediction coefficient and a second residual signal from the first audio signal, obtaining a third linear prediction coefficient by inputting the second linear prediction coefficient into a trained learning model, and generating a second audio signal from the second residual signal using the third linear prediction coefficient.
-
10.
Publication No.: US20220238126A1
Publication date: 2022-07-28
Application No.: US17570489
Filing date: 2022-01-07
Inventors: Jongmo SUNG, Seung Kwon BEACK, Tae Jin LEE, Woo-taek LIM, Inseon JANG
IPC classification: G10L19/032, G10L19/008, G10L25/90, G10L25/30
Abstract: Methods of encoding and decoding an audio signal using a learning model, and an encoder and a decoder for performing the methods, are disclosed. A method of encoding an audio signal using a learning model may include extracting pitch information of the audio signal, determining, based on the pitch information, a dilation factor of a receptive field of a first expandable neural network block that extracts a feature map from the audio signal, generating a first feature map of the audio signal using the first expandable neural network block in which the dilation factor is determined, determining a second feature map by inputting the first feature map into a second expandable neural network block to process the first feature map, and converting the second feature map and the pitch information into a bitstream.