Voiceprint Recognition Method Based on Hybrid Modeling of Total Variability Space and Deep Learning

Invention publication

CN105575394A Voiceprint recognition method based on hybrid modeling of total variability space and deep learning (invalid, rejected)

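Patent title: Voiceprint recognition method based on hybrid modeling of total variability space and deep learning
Patent title (English): Voiceprint identification method based on global change space and deep learning hybrid modeling
Application number: CN201610000675.5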

Filing date: 2016-01-04
Publication number: CN105575394A

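Publication date: 2016-05-11
Inventors: Xu Mingxing (徐明星), Che Hao (车浩)
Applicant: 北京时代瑞朗科技有限公司
Applicant address: Room 318, Tower B, Yiquanhui Office Building, 35 Shangdi East Road, Haidian District, Beijing
Patentee: 北京时代瑞朗科技有限公司
Current patentee: 极限元(北京)智能科技股份有限公司
Current patentee address: Room 318, Tower B, Yiquanhui Office Building, 35 Shangdi East Road, Haidian District, Beijing
Main classification: G10L17/10
IPC classification: G10L17/10 ; G10L17/04 ; G10L17/02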

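Abstract:

The invention discloses a voiceprint recognition method based on hybrid modeling of the total variability space and deep learning, comprising the following steps: obtain speech-segment training data and train identity vectors (i-vectors) by total variability space modeling to obtain the TVM-IVECTOR; train with a deep neural network to obtain the NN-IVECTOR; fuse the two vectors of the same audio file to obtain a new I-VECTOR feature extractor; for the audio under test, fuse the TVM-IVECTOR and NN-IVECTOR and then extract the final I-VECTOR; after channel compensation, score it against the speaker models in the model library to obtain the recognition result. The method is more robust to interference from environmental factors such as environment mismatch, multi-channel variation, and noise, and can improve voiceprint recognition performance.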

Abstract (English):

The invention discloses a voiceprint identification method based on hybrid modeling of total variability space and deep learning, comprising the steps of: obtaining voice segment training data and employing a total variability space modeling method to train identity vectors, obtaining a TVM-IVECTOR; employing a deep neural network method to perform training, obtaining an NN-IVECTOR; fusing the two vectors of the same audio file to obtain a new I-VECTOR feature extractor; for the audio to be tested, fusing the TVM-IVECTOR and the NN-IVECTOR and then extracting a final I-VECTOR; and, after channel compensation, scoring against the speaker models in a model library to obtain an identification result. The method is more robust to environmental interference such as environment mismatch, multi-channel variation and noise, and can improve voiceprint identification performance.
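The test-time flow described in the abstract (fuse the two vectors, apply channel compensation, score against enrolled speaker models) can be sketched roughly as follows. The abstract does not say how the vectors are fused, which channel compensation is used, or how scoring is done, so everything here is an assumption for illustration: fusion is taken to be length normalization followed by concatenation, compensation is a placeholder linear projection (for example an LDA or WCCN matrix), scoring is cosine similarity, and all function names and toy values are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def length_normalize(v: Sequence[float]) -> Vector:
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def fuse(tvm_ivector: Sequence[float], nn_ivector: Sequence[float]) -> Vector:
    """Assumed fusion rule (not from the patent): normalize each vector, then concatenate."""
    return length_normalize(tvm_ivector) + length_normalize(nn_ivector)


def compensate(v: Sequence[float], projection: Sequence[Sequence[float]]) -> Vector:
    """Hypothetical channel compensation: multiply by a projection matrix given as rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in projection]


def cosine_score(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (assumed scoring rule)."""
    a_n, b_n = length_normalize(a), length_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def identify(probe: Sequence[float], speaker_models: Dict[str, Vector]) -> str:
    """Return the name of the enrolled speaker whose model scores highest."""
    return max(speaker_models, key=lambda name: cosine_score(probe, speaker_models[name]))


# Toy usage with made-up 2-dimensional vectors and an identity projection.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
models = {
    "speaker_a": compensate(fuse([1.0, 0.0], [0.9, 0.1]), identity),
    "speaker_b": compensate(fuse([0.0, 1.0], [0.1, 0.9]), identity),
}
probe = compensate(fuse([0.8, 0.2], [1.0, 0.0]), identity)
print(identify(probe, models))
```

Normalizing each vector before concatenation keeps either extractor from dominating the fused representation purely because of scale; a real system would replace the identity projection with a matrix estimated from training data.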

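IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L17/00	Speaker identification or verification
G10L17/06	.Decision making techniques; pattern matching strategies
G10L17/10	..Multimodal systems, i.e. based on the integration of multiple recognition engines or fusion of expert systems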