Dimensionality reduction for speaker normalization and speaker and environment adaptation using eigenvoice techniques

Granted invention patent

US06343267B1 Dimensionality reduction for speaker normalization and speaker and environment adaptation using eigenvoice techniques (in force)



Patent title: Dimensionality reduction for speaker normalization and speaker and environment adaptation using eigenvoice techniques
Application number: US09148753

Filing date: 1998-09-04
Publication number: US06343267B1

Publication date: 2002-01-29
Inventors: Roland Kuhn, Patrick Nguyen, Jean-Claude Junqua
Applicants: Roland Kuhn, Patrick Nguyen, Jean-Claude Junqua
Main classification: G10L1908
IPC classification: G10L1908


Abstract:

A set of speaker dependent models or adapted models is trained upon a comparatively large number of training speakers, one model per speaker, and model parameters are extracted in a predefined order to construct a set of supervectors, one per speaker. Dimensionality reduction is then performed on the set of supervectors to generate a set of eigenvectors that define an eigenvoice space. If desired, the number of vectors may be reduced to achieve data compression. Thereafter, a new speaker provides adaptation data from which a supervector is constructed by constraining this supervector to be in the eigenvoice space based on a maximum likelihood estimation. The resulting coefficients in the eigenspace of this new speaker may then be used to construct a new set of model parameters from which an adapted model is constructed for that speaker. The adapted model may then be further adapted via MAP, MLLR, MLED or the like. The eigenvoice technique may be applied to MLLR transformation matrices or the like; Bayesian estimation performed in eigenspace uses prior knowledge about speaker space density to refine the estimate about the location of a new speaker in eigenspace.
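The pipeline described in the abstract (supervectors, dimensionality reduction, placement of a new speaker in eigenvoice space, reconstruction of model parameters) can be sketched roughly as follows. This is a minimal illustration, not the patented method: it assumes NumPy, uses principal component analysis via SVD as the dimensionality reduction, and replaces the maximum likelihood estimation with an ordinary least-squares fit, which would coincide with it only under the simplifying assumption that every model parameter is observed with equal, identity-covariance Gaussian noise. All function names and the synthetic data are hypothetical.

```python
import numpy as np


def build_eigenvoice_space(supervectors, num_eigenvoices):
    """Run PCA over training supervectors (one row per training speaker).

    Returns the mean supervector and the leading eigenvoices as rows.
    Keeping fewer rows than speakers is the optional compression step.
    """
    mean = supervectors.mean(axis=0)
    centered = supervectors - mean
    # Rows of vt are orthonormal principal directions, ordered by variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:num_eigenvoices]


def estimate_coefficients(observed, mean, eigenvoices):
    """Place a new speaker in eigenvoice space.

    ASSUMPTION: plain least squares stands in for the maximum likelihood
    estimation named in the abstract; it ignores per-parameter variances
    and how much adaptation data supports each parameter.
    """
    coeffs, *_ = np.linalg.lstsq(eigenvoices.T, observed - mean, rcond=None)
    return coeffs


def adapted_supervector(coeffs, mean, eigenvoices):
    """Rebuild model parameters from coefficients in eigenvoice space."""
    return mean + coeffs @ eigenvoices


# Synthetic data (hypothetical): 20 training speakers, 12 parameters each,
# generated from 3 underlying factors so that 3 eigenvoices suffice.
rng = np.random.default_rng(0)
basis = rng.normal(size=(3, 12))
train = rng.normal(size=(20, 3)) @ basis + 5.0

mean, eigenvoices = build_eigenvoice_space(train, num_eigenvoices=3)

new_speaker = rng.normal(size=3) @ basis + 5.0
w = estimate_coefficients(new_speaker, mean, eigenvoices)
adapted = adapted_supervector(w, mean, eigenvoices)
```

In this toy setting the new speaker is described by three coefficients rather than twelve parameters, which is the point of constraining the estimate to the eigenvoice space: far less adaptation data is needed to fix a handful of coefficients than to re-estimate every model parameter.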

