- Patent Title: Methods of encoding and decoding speech signal using neural network model recognizing sound sources, and encoding and decoding apparatuses for performing the same
- Application No.: US17326035
- Application Date: 2021-05-20
- Publication No.: US11664037B2
- Publication Date: 2023-05-30
- Inventor: Woo-taek Lim, Seung Kwon Beack, Jongmo Sung, Mi Suk Lee, Tae Jin Lee, Inseon Jang, Minje Kim, Haici Yang
- Applicant: Electronics and Telecommunications Research Institute, The Trustees of Indiana University
- Applicant Address: KR IN Daejeon
- Assignee: Electronics and Telecommunications Research Institute, The Trustees of Indiana University
- Current Assignee: Electronics and Telecommunications Research Institute, The Trustees of Indiana University
- Current Assignee Address: KR Daejeon; US IN Indianapolis
- Agency: William Park & Associates Ltd.
- Priority: KR 20210053581 2021.04.26
- Main IPC: G10L19/032
- IPC: G10L19/032; G10L21/0272

Abstract:
Methods of encoding and decoding a speech signal using a neural network model that recognizes sound sources, and encoding and decoding apparatuses for performing the methods are provided. A method of encoding a speech signal includes identifying an input signal for a plurality of sound sources; generating a latent signal by encoding the input signal; obtaining a plurality of sound source signals by separating the latent signal for each of the plurality of sound sources; determining a number of bits used for quantization of each of the plurality of sound source signals according to a type of each of the plurality of sound sources; quantizing each of the plurality of sound source signals based on the determined number of bits; and generating a bitstream by combining the plurality of quantized sound source signals.
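
The last three encoding steps in the abstract (choosing a bit depth per sound source type, quantizing each source signal, and combining the results) can be illustrated with a minimal sketch. It is not the patented method: the neural encoder and the latent-space separation are skipped, so the input is assumed to be already-separated source signals with values in [-1, 1]. A plain uniform scalar quantizer stands in for whatever quantizer the patent uses, and the names `BITS_PER_TYPE`, `quantize`, `dequantize`, and `encode`, the source types, and the bit counts are all hypothetical.

```python
from typing import Dict, List

# Hypothetical bit budget per sound source type (illustrative values only,
# not taken from the patent).
BITS_PER_TYPE: Dict[str, int] = {"speech": 8, "noise": 2}


def quantize(signal: List[float], bits: int) -> List[int]:
    """Uniform scalar quantization of values in [-1, 1] to 2**bits levels."""
    max_code = (1 << bits) - 1
    codes = []
    for x in signal:
        x = max(-1.0, min(1.0, x))  # clip to the assumed value range
        codes.append(round((x + 1.0) / 2.0 * max_code))
    return codes


def dequantize(codes: List[int], bits: int) -> List[float]:
    """Map integer codes back to approximate values in [-1, 1]."""
    max_code = (1 << bits) - 1
    return [c / max_code * 2.0 - 1.0 for c in codes]


def encode(sources: Dict[str, List[float]]) -> List[dict]:
    """Quantize each separated source with a type-dependent bit depth,
    then combine the results into one list standing in for the bitstream."""
    bitstream = []
    for source_type, signal in sources.items():
        bits = BITS_PER_TYPE[source_type]
        bitstream.append(
            {"type": source_type, "bits": bits, "codes": quantize(signal, bits)}
        )
    return bitstream


if __name__ == "__main__":
    separated = {"speech": [0.5, -0.25], "noise": [0.5, -0.25]}
    for entry in encode(separated):
        print(entry["type"], entry["bits"], entry["codes"])
```

In this sketch the source given more bits is reconstructed with a smaller error, which is the trade-off the type-dependent allocation exploits: bits are spent on the sources that matter most and saved on the rest.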