Systems and methods for end-to-end architectures for voice spoofing detection
摘要:
Embodiments described herein provide for automatically detecting whether an audio signal is a spoofed audio signal or a genuine audio signal. A spoof detection system can include an audio signal transforming front end and a classification back end. Both the front end and the back end can include neural networks that can be trained using the same set of labeled audio signals. The audio signal transforming front end can include a one or more neural networks for per-channel energy normalization transformation of the audio signal, and the back end can include a convolution neural network for classification into spoofed or genuine audio signal. In some embodiments, the transforming audio signal front end can include one or more neural networks for bandpass filtering of the audio signals, and the back end can include a residual neural network for audio signal classification into spoofed or genuine audio signal.
信息查询
0/0