Multi-person speech separation method and apparatus using a generative adversarial network model
摘要:
A multi-person speech separation method is provided for a terminal. The method includes extracting a hybrid speech feature from a hybrid speech signal requiring separation, N human voices being mixed in the hybrid speech signal, N being a positive integer greater than or equal to 2; extracting a masking coefficient of the hybrid speech feature by using a generative adversarial network (GAN) model, to obtain a masking matrix corresponding to the N human voices, wherein the GAN model comprises a generative network model and an adversarial network model; and performing a speech separation on the masking matrix corresponding to the N human voices and the hybrid speech signal by using the GAN model, and outputting N separated speech signals corresponding to the N human voices.
公开/授权文献
信息查询
0/0