Real-time dynamic noise reduction using convolutional networks
摘要:
A system, method and computer readable medium for dynamic noise reduction in a voice call. The system includes an encoder having a short-time Fourier transform module to determine a magnitude spectrum and a phase spectrum of an input audio signal, including speech and dynamic noise. A separator coupled to the encoder comprises a temporal convolution network (TCN) used to develop a separation mask using the magnitude spectrum as input. The TCN is trained using a frequency SNR function used to calculate loss during training. A mixer is coupled to the separator to multiply the separation mask with the magnitude spectrum to separate the speech from the dynamic noise to obtain a denoise magnitude spectrum. A decoder coupled to the mixer and the encoder includes an inverse short-time Fourier transform module to reconstruct the input audio signal without the dynamic noise using the denoise magnitude spectrum and the phase spectrum.
信息查询
0/0