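Language recognition method, model training method, apparatus, and device

Invention publication

CN110853618A Language recognition method, model training method, apparatus, and device (in force)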

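Patent title: Language recognition method, model training method, apparatus, and device
Patent title (English): Language recognition method and device, model training method and device, and facility
Application number: CN201911137417.1

Filing date: 2019-11-19
Publication number: CN110853618A

Publication date: 2020-02-28
Inventors: Gao Ji (高骥), Zhang Shanshan (张姗姗), Huang Shen (黄申), Wu Haiwei (巫海维), Cai Weicheng (蔡炜城), Li Ming (李明)
Applicants: Tencent Technology (Shenzhen) Co., Ltd.; Duke Kunshan University
Applicant address: 35/F, Tencent Building, Kejizhongyi Road, Hi-tech Park, Nanshan District, Shenzhen, Guangdong
Patentees: Tencent Technology (Shenzhen) Co., Ltd.; Duke Kunshan University
Current patentees: Tencent Technology (Shenzhen) Co., Ltd.; Duke Kunshan University
Current patentee address: 35/F, Tencent Building, Kejizhongyi Road, Hi-tech Park, Nanshan District, Shenzhen, Guangdong
Agency: Shenzhen Shenjia Intellectual Property Agency (深圳市深佳知识产权代理事务所)
Agent: Wu Lei (吴磊)
Main classification: G10L15/00
IPC classifications: G10L15/00 ; G10L15/02 ; G10L15/06 ; G10L15/16 ; G10L21/0272 ; G10L21/0308 ; G10L25/18 ; G10L25/30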

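Abstract:

This application discloses a language recognition method comprising: acquiring audio data to be recognized; extracting audio frequency-domain features from that audio data; performing vocal-accompaniment separation on the audio data based on those features to obtain speech data to be recognized, where vocal-accompaniment separation means separating speech data and accompaniment data out of audio data; and performing language recognition on the speech data to obtain the language recognition result for the audio data. The application also discloses a model training method, apparatus, and device. Because only the speech data is fed into the language recognition model and the accompaniment is removed, the accompaniment interferes less with language recognition, which improves the accuracy of song language recognition.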

Abstract (English):

The invention discloses a language recognition method. The method comprises the following steps: acquiring to-be-recognized audio data; extracting audio frequency-domain features from the to-be-recognized audio data; based on the audio frequency-domain features, carrying out vocal-accompaniment separation on the to-be-recognized audio data, thus obtaining to-be-recognized voice data, wherein vocal-accompaniment separation means separating the voice data and the accompaniment data from the audio data; and carrying out language recognition on the to-be-recognized voice data, thus obtaining a language recognition result for the to-be-recognized audio data. The invention further discloses a model training method, an apparatus, and a device. Under this technical scheme, only the to-be-recognized voice data is input into the language recognition model and the accompaniment is removed, so that the interference of the accompaniment with language recognition is reduced and the accuracy of song language recognition is improved.
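The steps in the abstract (frequency-domain feature extraction, vocal-accompaniment separation, language recognition on the vocal part only) can be sketched as a small pipeline. This is an illustrative sketch under stated assumptions, not the patented implementation: the record does not specify the feature type, the separation model, or the classifier. Here the features are a naive short-time Fourier transform, the separation is assumed to be a per-bin soft mask, and `mask_fn` and `classifier` are hypothetical stand-ins for trained models.

```python
import cmath
import math
from typing import Callable, List, Sequence

Spectrogram = List[List[complex]]            # frames x frequency bins
MaskFn = Callable[[int, int, float], float]  # (frame, bin, magnitude) -> weight in [0, 1]


def stft(signal: Sequence[float], frame_len: int = 64, hop: int = 32) -> Spectrogram:
    """Frequency-domain features: a naive Hann-windowed short-time DFT."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames: Spectrogram = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append([
            sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len))
            for k in range(frame_len // 2 + 1)
        ])
    return frames


def separate_vocals(spec: Spectrogram, mask_fn: MaskFn) -> Spectrogram:
    """Assumed separation scheme: scale each bin by a mask that estimates
    how much of it belongs to the voice; the accompaniment is discarded."""
    return [
        [mask_fn(t, k, abs(value)) * value for k, value in enumerate(frame)]
        for t, frame in enumerate(spec)
    ]


def recognize_language(
    signal: Sequence[float],
    mask_fn: MaskFn,
    classifier: Callable[[Spectrogram], str],
) -> str:
    """Run the three steps in order; only the vocal estimate reaches the classifier."""
    spec = stft(signal)
    vocal_spec = separate_vocals(spec, mask_fn)
    return classifier(vocal_spec)
```

In practice `mask_fn` would be replaced by a trained separation model and `classifier` by a trained language recognition model. The point of the structure is that the classifier never sees the accompaniment, which is the effect the abstract credits for the accuracy gain.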

Publication/grant documents

CN110853618B Language recognition method, model training method, apparatus, and device. Publication/grant date: 2022-08-19

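IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)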