Patent search: ap:("Electronics AND Telecommunications Research Institute" OR "The Trustees of Indiana University") AND inv:"Haici Yang", page 1

1.

Granted invention patent
Methods of encoding and decoding speech signal using neural network model recognizing sound sources, and encoding and decoding apparatuses for performing the same (in force)

Publication No.: US11664037B2

Publication date: 2023-05-30

Application No.: US17326035

Filing date: 2021-05-20

Applicants: Electronics and Telecommunications Research Institute, The Trustees of Indiana University

Inventors: Woo-taek Lim, Seung Kwon Beack, Jongmo Sung, Mi Suk Lee, Tae Jin Lee, Inseon Jang, Minje Kim, Haici Yang

IPC classification: G10L19/032, G10L21/0272

CPC classification: G10L19/032, G10L21/0272

Abstract: Methods of encoding and decoding a speech signal using a neural network model that recognizes sound sources, and encoding and decoding apparatuses for performing the methods are provided. A method of encoding a speech signal includes identifying an input signal for a plurality of sound sources; generating a latent signal by encoding the input signal; obtaining a plurality of sound source signals by separating the latent signal for each of the plurality of sound sources; determining a number of bits used for quantization of each of the plurality of sound source signals according to a type of each of the plurality of sound sources; quantizing each of the plurality of sound source signals based on the determined number of bits; and generating a bitstream by combining the plurality of quantized sound source signals.
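The encoding steps listed in the abstract can be illustrated with a toy sketch. This is not the patented implementation: the neural encoder and the source separation are replaced by trivial stand-ins (clamping and contiguous slicing), the quantizer is assumed to be a plain uniform one, and the names `BITS_PER_TYPE`, `encode`, `separate`, `quantize`, and `encode_to_bitstream`, as well as the bit budgets, are hypothetical choices made only for illustration.

```python
from typing import Dict, List

# Hypothetical bit budget per source type; the abstract gives no actual values.
BITS_PER_TYPE: Dict[str, int] = {"speech": 8, "noise": 3}


def encode(signal: List[float]) -> List[float]:
    """Stand-in for the neural encoder: clamp samples to [-1, 1]."""
    return [max(-1.0, min(1.0, x)) for x in signal]


def separate(latent: List[float], n_sources: int) -> List[List[float]]:
    """Stand-in for latent separation: one contiguous slice per source.

    The last slice absorbs any remainder.
    """
    size = len(latent) // n_sources
    parts = [latent[i * size:(i + 1) * size] for i in range(n_sources - 1)]
    parts.append(latent[(n_sources - 1) * size:])
    return parts


def quantize(values: List[float], bits: int) -> List[int]:
    """Uniform quantization of values in [-1, 1] to integers in [0, 2**bits - 1]."""
    top = (1 << bits) - 1
    # int(x + 0.5) rounds half up for the non-negative scaled value.
    return [int((v + 1.0) / 2.0 * top + 0.5) for v in values]


def encode_to_bitstream(signal: List[float], source_types: List[str]) -> str:
    """Encode, separate, quantize per source type, and concatenate the codes."""
    latent = encode(signal)
    sources = separate(latent, len(source_types))
    chunks = []
    for source_type, source in zip(source_types, sources):
        bits = BITS_PER_TYPE[source_type]  # bit count depends on the source type
        chunks.extend(format(code, f"0{bits}b") for code in quantize(source, bits))
    return "".join(chunks)


bitstream = encode_to_bitstream([0.0, 1.0, -1.0, 0.5], ["speech", "noise"])
print(bitstream, len(bitstream))
```

In this sketch the two "speech" samples each take 8 bits and the two "noise" samples each take 3 bits, so the bitstream is 22 bits long. A real decoder would also need to know the source types and slice sizes, which this toy format does not transmit.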