-
Publication No.: US20230048402A1
Publication date: 2023-02-16
Application No.: US17884364
Filing date: 2022-08-09
Inventor: Jongmo SUNG , Seung Kwon BEACK , Tae Jin LEE , Woo-taek LIM , Inseon JANG
Abstract: Provided are an encoding method according to various example embodiments and an encoder performing the method. The encoding method includes outputting a linear prediction (LP) coefficient bitstream and a residual signal by performing a linear prediction analysis on an input signal, outputting a first latent signal obtained by encoding a periodic component of the residual signal, using a first neural network module, outputting a first bitstream obtained by quantizing the first latent signal, using a quantization module, outputting a second latent signal obtained by encoding an aperiodic component of the residual signal, using the first neural network module, and outputting a second bitstream obtained by quantizing the second latent signal, using the quantization module, wherein the aperiodic component of the residual signal is calculated based on a periodic component of the residual signal decoded from the quantized first latent signal output by de-quantizing the first bitstream.
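The closed-loop structure in this abstract, where the second stage encodes whatever remains after the first stage has been quantized and decoded again, can be illustrated with a minimal sketch. It is not the patented method: the neural network modules are replaced by an identity mapping, the quantization module is assumed to be a uniform scalar quantizer, and every name below (`quantize`, `dequantize`, `encode_two_stage`, `decode_two_stage`, `periodic_estimate`, the step sizes) is hypothetical.

```python
def quantize(values, step):
    """Uniform scalar quantizer (an assumption): map each value to an integer index."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Inverse of quantize: map integer indices back to reconstruction levels."""
    return [i * step for i in indices]


def encode_two_stage(residual, periodic_estimate, coarse_step, fine_step):
    """Encode a residual as a quantized periodic part plus a quantized remainder.

    The remainder (standing in for the aperiodic component) is computed from the
    DECODED periodic part, not the original estimate, so the encoder works with
    exactly the signal the decoder will be able to rebuild.
    """
    first_indices = quantize(periodic_estimate, coarse_step)
    decoded_periodic = dequantize(first_indices, coarse_step)
    aperiodic = [r - p for r, p in zip(residual, decoded_periodic)]
    second_indices = quantize(aperiodic, fine_step)
    return first_indices, second_indices


def decode_two_stage(first_indices, second_indices, coarse_step, fine_step):
    """Rebuild the residual by summing the two de-quantized components."""
    periodic = dequantize(first_indices, coarse_step)
    aperiodic = dequantize(second_indices, fine_step)
    return [p + a for p, a in zip(periodic, aperiodic)]
```

Because the second stage sees the first stage's quantization error, the overall reconstruction error in this toy setup is bounded by half of `fine_step`, regardless of how coarse the first stage is.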
-
Publication No.: US20220262378A1
Publication date: 2022-08-18
Application No.: US17672041
Filing date: 2022-02-15
Inventor: Woo-taek LIM , Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Inseon JANG
Abstract: An audio signal encoding and decoding method using a learning model, a training method of the learning model, and an encoder and decoder that perform the method are disclosed. The audio signal decoding method may include extracting a first residual signal and a first linear prediction coefficient by decoding a bitstream received from an encoder, generating a first audio signal from the first residual signal using the first linear prediction coefficient, generating a second linear prediction coefficient and a second residual signal from the first audio signal, obtaining a third linear prediction coefficient by inputting the second linear prediction coefficient into a trained learning model, and generating a second audio signal from the second residual signal using the third linear prediction coefficient.
-
Publication No.: US20220238126A1
Publication date: 2022-07-28
Application No.: US17570489
Filing date: 2022-01-07
Inventor: Jongmo SUNG , Seung Kwon BEACK , Tae Jin LEE , Woo-taek LIM , Inseon JANG
IPC: G10L19/032 , G10L19/008 , G10L25/90 , G10L25/30
Abstract: Methods of encoding and decoding an audio signal using a learning model and an encoder and a decoder for performing the methods are disclosed. A method of encoding an audio signal using a learning model may include extracting pitch information of the audio signal, determining a dilation factor of a receptive field of a first expandable neural network block to extract a feature map from the audio signal based on the pitch information, generating a first feature map of the audio signal using the first expandable neural network block in which the dilation factor is determined, determining a second feature map by inputting the first feature map into a second expandable neural network block to process the first feature map, and converting the second feature map and the pitch information into a bitstream.
-
Publication No.: US20220157326A1
Publication date: 2022-05-19
Application No.: US17507746
Filing date: 2021-10-21
Inventor: Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Woo-taek LIM , Inseon JANG
IPC: G10L19/13 , G10L19/032 , G10L19/06
Abstract: A method of generating a residual signal performed by an encoder includes identifying an input signal including an audio sample, generating a first residual signal from the input signal using linear predictive coding (LPC), generating a second residual signal containing less information than the first residual signal by transforming the first residual signal, transforming the second residual signal into a frequency domain, and generating a third residual signal containing less information than the second residual signal from the transformed second residual signal using frequency-domain prediction (FDP) coding.
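The first step of this abstract, generating a residual signal with LPC, can be sketched as follows. This is a textbook illustration using the autocorrelation method and the Levinson-Durbin recursion, not necessarily the analysis used in the application; the function names, the predictor order, and the test signal are assumptions, and the later transform and FDP stages are not covered.

```python
import math


def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite frame (autocorrelation method)."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (coeffs, err) where the prediction is
    x_hat[n] = sum(coeffs[k] * x[n - 1 - k]) and err is the prediction error energy.
    """
    a = [0.0] * (order + 1)          # a[1..order] hold the predictor coefficients
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:               # silent frame: nothing left to predict
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def lp_residual(x, coeffs):
    """Residual e[n] = x[n] - x_hat[n], treating samples before the frame as zero."""
    residual = []
    for n in range(len(x)):
        pred = sum(c * x[n - 1 - k] for k, c in enumerate(coeffs) if n - 1 - k >= 0)
        residual.append(x[n] - pred)
    return residual


def demo_signal(length=200, omega=0.3):
    """Hypothetical test input: a pure sinusoid, which a 2nd-order predictor fits well."""
    return [math.sin(omega * n) for n in range(length)]
```

For a strongly correlated input such as `demo_signal()`, the residual returned by `lp_residual` carries much less energy than the input, which is why coding the residual is cheaper than coding the waveform directly.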
-
Publication No.: US20210090551A1
Publication date: 2021-03-25
Application No.: US17029960
Filing date: 2020-09-23
Applicant: Electronics and Telecommunications Research Institute , Industry-Academic Cooperation Foundation, Yonsei University
Inventor: Inseon JANG , Hong-Goo KANG , Chung Hyun AHN , Se-Yun UM , Sangshin OH , Tae Jin LEE
IPC: G10L13/08 , G10L25/63 , G10L13/033
Abstract: An emotional speech generating method and apparatus capable of adjusting an emotional intensity is disclosed. The emotional speech generating method includes generating emotion groups by grouping weight vectors representing a same emotion into a same emotion group, determining an internal distance between weight vectors included in a same emotion group, determining an external distance between weight vectors included in a same emotion group and weight vectors included in another emotion group, determining a representative weight vector of each of the emotion groups based on the internal distance and the external distance, generating a style embedding by applying the representative weight vector of each of the emotion groups to a style token including prosodic information for expressing an emotion, and generating an emotional speech expressing the emotion using the style embedding.
-
Publication No.: US20250006210A1
Publication date: 2025-01-02
Application No.: US18747007
Filing date: 2024-06-18
Applicant: Electronics and Telecommunications Research Institute , UIF (University Industry Foundation), Yonsei University
Inventor: Woo-taek LIM , Inseon JANG , Seung Kwon BEACK , Hong-Goo KANG , Byeong Hyeon KIM , Jihyun LEE , Hyungseob LIM
IPC: G10L19/08
Abstract: A method of encoding/decoding a speech signal and a device for performing the same are provided. The method includes outputting, based on a first input speech signal of a previous timepoint and a second input speech signal of a current timepoint, a predicted signal that predicts the second input speech signal from the first input speech signal and obtaining, based on the second input speech signal and the predicted signal, a residual signal by removing a correlation between the first input speech signal and the second input speech signal from the second input speech signal.
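The idea of removing the correlation between the previous and current signals can be illustrated with a minimal sketch. The application does not state the predictor used here: the sketch assumes the simplest possible one, a single least-squares gain applied to the previous frame, and the names `predict_current` and `prediction_residual` are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def predict_current(previous, current):
    """Predict the current frame as gain * previous frame.

    The gain is the least-squares optimum <previous, current> / <previous, previous>
    (an assumed predictor, chosen only because it is easy to verify).
    """
    energy = dot(previous, previous)
    gain = dot(previous, current) / energy if energy > 0.0 else 0.0
    return [gain * p for p in previous]


def prediction_residual(previous, current):
    """Residual = current frame minus its prediction from the previous frame."""
    predicted = predict_current(previous, current)
    return [c - p for c, p in zip(current, predicted)]
```

With this choice of gain the residual is orthogonal to the previous frame, so the linear correlation between the two frames has been removed and only the unpredictable part remains to be coded.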
-
Publication No.: US20240233738A9
Publication date: 2024-07-11
Application No.: US18358646
Filing date: 2023-07-25
Applicant: Electronics and Telecommunications Research Institute , Gwangju Institute of Science and Technology
Inventor: Inseon JANG , Seung Kwon BEACK , Tae Jin LEE , Jongmo SUNG , Woo-taek LIM , Byeongho CHO , Jongwon SHIN
IPC: G10L19/02
CPC classification number: G10L19/02
Abstract: Provided is an encoding apparatus including a memory configured to store instructions and a processor electrically connected to the memory and configured to execute the instructions, wherein the processor may be configured to perform a plurality of operations, when the instructions are executed by the processor, wherein the plurality of operations may include obtaining an input audio signal, generating an embedded audio signal by embedding signal components of a second frequency band of the input audio signal in a first frequency band of the input audio signal, generating additional information associated with the first frequency band and the second frequency band, generating an encoded audio signal by encoding the embedded audio signal, and formatting the encoded audio signal and the additional information into a bitstream.
-
Publication No.: US20240055009A1
Publication date: 2024-02-15
Application No.: US18349680
Filing date: 2023-07-10
Inventor: Byeongho CHO , Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Woo-taek LIM , Inseon JANG
IPC: G10L19/032
CPC classification number: G10L19/032
Abstract: Provided are an apparatus for encoding an audio signal and a method of operating the same. An audio signal encoding method includes obtaining quantized linear prediction (LP) coefficients by performing a linear predictive coding (LPC) analysis and quantization on an input audio signal, generating a reference signal by applying a discrete Fourier transform (DFT) to the input audio signal, obtaining LP residual coefficients from the reference signal, scaling magnitudes of the LP residual coefficients using the quantized LP coefficients and the reference signal, and quantizing phases of the LP residual coefficients and the scaled magnitudes of the LP residual coefficients.
-
Publication No.: US20230230604A1
Publication date: 2023-07-20
Application No.: US18099119
Filing date: 2023-01-19
Applicant: Electronics and Telecommunications Research Institute , Gwangju Institute of Science and Technology
Inventor: Inseon JANG , Tae Jin LEE , Seung Kwon BEACK , Jongmo SUNG , Woo-taek LIM , Byeongho CHO , Jongwon SHIN , Soojoong HWANG , Eunkyun LEE , Youngwon CHOI , Sangwook HAN
CPC classification number: G10L19/0204 , G10L25/30
Abstract: A method of encoding an audio signal and an encoder and a method of decoding an audio signal and a decoder are provided. The method of decoding an audio signal includes outputting a decoded signal by using a bitstream that encodes an audio signal, separating the decoded signal into a low-band signal and a high-band signal by using a sound source separator, upsampling the low-band signal, upsampling the high-band signal, and restoring the audio signal by synthesizing the upsampled low-band signal with the upsampled high-band signal, wherein the bitstream is generated by encoding a superimposed signal in which a signal in a high frequency band of the audio signal is superimposed on a low frequency band of the audio signal.
-
Publication No.: US20220358940A1
Publication date: 2022-11-10
Application No.: US17527351
Filing date: 2021-11-16
Applicant: Electronics and Telecommunications Research Institute , Gwangju Institute of Science and Technology
Inventor: Woo-taek LIM , Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Inseon JANG , Jong Won SHIN , Soojoong HWANG , Youngju CHEON , Sangwook HAN
Abstract: Disclosed are methods of encoding and decoding an audio signal using side information, and an encoder and a decoder for performing the methods. The method of encoding an audio signal using side information includes identifying an input signal, the input signal being an original audio signal, extracting side information from the input signal using a learning model trained to extract side information from a feature vector of the input signal, encoding the input signal, and generating a bitstream by combining the encoded input signal and the side information.