System and method for increasing recognition rates of in-vocabulary words by improving pronunciation modeling

Granted invention patent

US08892441B2 System and method for increasing recognition rates of in-vocabulary words by improving pronunciation modeling (in force)

Title: System and method for increasing recognition rates of in-vocabulary words by improving pronunciation modeling
Application number: US13311512

Filing date: 2011-12-05
Publication number: US08892441B2

Publication date: 2014-11-18
Inventors: Alistair D. Conkie, Mazin Gilbert, Andrej Ljolje
Applicants: Alistair D. Conkie, Mazin Gilbert, Andrej Ljolje
Applicant address: Atlanta, GA, US
Assignee: AT&T Intellectual Property I, L.P.
Current assignee: AT&T Intellectual Property I, L.P.
Current assignee address: Atlanta, GA, US
Main classification: G10L15/187
IPC classification: G10L15/187; G10L15/06

Abstract:

The present disclosure relates to systems, methods, and computer-readable media for generating a lexicon for use with speech recognition. The method includes overgenerating potential pronunciations based on symbolic input, identifying potential pronunciations in a speech recognition context, and storing the identified potential pronunciations in a lexicon. Overgenerating potential pronunciations can include establishing a set of conversion rules for short sequences of letters, converting portions of the symbolic input into a number of possible lexical pronunciation variants based on the set of conversion rules, modeling the possible lexical pronunciation variants in one of a weighted network and a list of phoneme lists, and iteratively retraining the set of conversion rules based on improved pronunciations. Symbolic input can include multiple examples of a same spoken word. Speech data can be labeled explicitly or implicitly and can include words as text and recorded audio.
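The overgenerate-then-select flow described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `RULES` table, the ARPAbet-style phoneme symbols, the function names, and the `observed` set, which stands in for whatever pronunciations a recognizer would identify in speech data. Candidates are modeled as a list of phoneme lists, one of the two representations the abstract mentions; the weighted network and the iterative retraining of rules are not shown.

```python
from itertools import product

# Hypothetical conversion rules: short letter sequence -> possible phoneme sequences.
# These entries are illustrative only and are not taken from the patent.
RULES = {
    "c": [["K"], ["S"]],
    "a": [["AE"], ["EY"]],
    "t": [["T"]],
    "ch": [["CH"], ["K"]],
}


def segmentations(word, max_len=2):
    """Yield every way to split `word` into chunks that have a conversion rule."""
    if not word:
        yield []
        return
    for n in range(1, min(max_len, len(word)) + 1):
        head = word[:n]
        if head in RULES:
            for rest in segmentations(word[n:], max_len):
                yield [head] + rest


def overgenerate(word):
    """Return candidate pronunciations for `word` as a list of phoneme lists."""
    candidates = []
    for chunks in segmentations(word):
        # Combine every rule alternative for every chunk of this segmentation.
        for choice in product(*(RULES[c] for c in chunks)):
            phones = [p for part in choice for p in part]
            if phones not in candidates:
                candidates.append(phones)
    return candidates


def build_lexicon(word, observed):
    """Keep only candidates present in `observed`, a stand-in for the
    pronunciations a recognizer identified in speech data."""
    kept = [p for p in overgenerate(word) if tuple(p) in observed]
    return {word: kept}


if __name__ == "__main__":
    print(overgenerate("cat"))
    print(build_lexicon("cat", {("K", "AE", "T")}))
```

Generating far more variants than are plausible keeps the rule set simple; the selection step is what is meant to discard variants that never match real speech.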

Publication/grant documents

US20120078617A1 System and Method for Increasing Recognition Rates of In-Vocabulary Words By Improving Pronunciation Modeling (publication/grant date: 2012-03-29)

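IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/08	.Speech classification or search
G10L15/18	..using natural language modelling
G10L15/183	...using context dependencies, e.g. language models
G10L15/187	....Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams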