ADVERSARIAL SPEAKER ADAPTATION

Invention application

US20200335085A1 ADVERSARIAL SPEAKER ADAPTATION Under examination, published

Patent title: ADVERSARIAL SPEAKER ADAPTATION
Application number: US16460027

Filing date: 2019-07-02
Publication (announcement) number: US20200335085A1

Publication (announcement) date: 2020-10-22
Inventors: Zhong MENG, Jinyu LI, Yifan GONG
Applicant: Microsoft Technology Licensing, LLC
Main classification: G10L15/06
IPC classification: G10L15/06; G10L15/02; G10L15/22

Abstract:

Embodiments are associated with a speaker-independent acoustic model capable of classifying senones based on input speech frames and on first parameters of the speaker-independent acoustic model, a speaker-dependent acoustic model capable of classifying senones based on input speech frames and on second parameters of the speaker-dependent acoustic model, and a discriminator capable of receiving data from the speaker-dependent acoustic model and data from the speaker-independent acoustic model and outputting a prediction of whether received data was generated by the speaker-dependent acoustic model based on third parameters. The second parameters are initialized based on the first parameters, the second parameters are trained based on input frames of a target speaker to minimize a senone classification loss associated with the second parameters, a portion of the second parameters are trained based on the input frames of the target speaker to maximize a discrimination loss associated with the discriminator, and the third parameters are trained based on the input frames of the target speaker to minimize the discrimination loss.
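The three training objectives in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the split of the second parameters into `sd_lower` (the adversarially trained portion) and `sd_upper`, the weight `lam`, the learning rate `lr`, and the use of plain gradient steps with caller-supplied gradients are all hypothetical. The abstract does not say which portion of the second parameters is trained adversarially or how the losses are weighted.

```python
import math


def init_speaker_dependent(si_params):
    """Start the speaker-dependent (SD) parameters as a copy of the SI ones."""
    return list(si_params)


def senone_loss(posteriors, labels):
    """Mean cross-entropy of SD senone posteriors against senone labels."""
    return -sum(math.log(p[y]) for p, y in zip(posteriors, labels)) / len(labels)


def discrimination_loss(d_on_sd, d_on_si):
    """Binary cross-entropy of the discriminator.

    Each value is the discriminator's predicted probability that its input
    was generated by the SD model.
    """
    sd_term = -sum(math.log(d) for d in d_on_sd) / len(d_on_sd)
    si_term = -sum(math.log(1.0 - d) for d in d_on_si) / len(d_on_si)
    return sd_term + si_term


def adversarial_step(sd_lower, sd_upper, disc, grads, lr=0.1, lam=1.0):
    """One plain gradient update from caller-supplied gradients.

    sd_lower: the (assumed) portion of the SD parameters trained adversarially.
    sd_upper: the remaining SD parameters.
    disc:     the discriminator parameters.
    """
    # Descend the senone loss, ascend the discrimination loss.
    new_lower = [w - lr * (gs - lam * gd)
                 for w, gs, gd in zip(sd_lower, grads["senone_lower"], grads["disc_lower"])]
    # Descend the senone loss only.
    new_upper = [w - lr * g for w, g in zip(sd_upper, grads["senone_upper"])]
    # Descend the discrimination loss.
    new_disc = [w - lr * g for w, g in zip(disc, grads["disc_disc"])]
    return new_lower, new_upper, new_disc
```

In this sketch the opposite signs on the discrimination gradient are what make the training adversarial: the discriminator parameters move to lower the discrimination loss, while the selected portion of the SD parameters moves to raise it. In practice the gradients would come from an automatic differentiation framework rather than being passed in by hand.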

Published/granted documents

US11107460B2 Adversarial speaker adaptation Publication/grant date: 2021-08-31

IPC classification:

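G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/06	. Creation of reference templates; training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice (G10L15/14 takes precedence)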