System and method for discriminative pronunciation modeling for voice search

Invention Grant

US08296141B2 System and method for discriminative pronunciation modeling for voice search (in force)

Patent Title: System and method for discriminative pronunciation modeling for voice search
Application No.: US12274025

Application Date: 2008-11-19
Publication No.: US08296141B2

Publication Date: 2012-10-23
Inventor: Mazin Gilbert, Alistair D. Conkie, Andrej Ljolje
Applicant: Mazin Gilbert, Alistair D. Conkie, Andrej Ljolje
Applicant Address: US GA Atlanta
Assignee: AT&T Intellectual Property I, L.P.
Current Assignee: AT&T Intellectual Property I, L.P.
Current Assignee Address: US GA Atlanta
Main IPC: G10L15/04
IPC: G10L15/04

Abstract:

Disclosed herein are systems, computer-implemented methods, and computer-readable media for speech recognition. The method includes: receiving speech utterances; assigning a pronunciation weight to each unit of speech in the speech utterances, each respective pronunciation weight being normalized at a unit-of-speech level to sum to 1; for each received speech utterance, optimizing the pronunciation weight by (1) identifying word and phone alignments and corresponding likelihood scores, and (2) discriminatively adapting the pronunciation weight to minimize classification errors; and recognizing additional received speech utterances using the optimized pronunciation weights. A unit of speech can be a sentence, a word, a context-dependent phone, a context-independent phone, or a syllable. The method can further include discriminatively adapting pronunciation weights based on an objective function. The objective function can be maximum mutual information (MMI), maximum likelihood (MLE) training, minimum classification error (MCE) training, or other functions known to those of skill in the art. The speech utterances can be names, and they can be received as part of a multimodal search or input. The step of discriminatively adapting pronunciation weights can further include stochastically modeling pronunciations.
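
The weight normalization and discriminative adaptation described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: it assumes an MCE-style sigmoid loss, a softmax parameterization that keeps each word's pronunciation weights summing to 1, and a fixed score for the best competing word. The function names, the phone strings, and all numeric scores are hypothetical.

```python
import math


def softmax(logits):
    """Map per-pronunciation logits to weights that sum to 1 for one word."""
    m = max(logits.values())
    exps = {p: math.exp(v - m) for p, v in logits.items()}
    z = sum(exps.values())
    return {p: e / z for p, e in exps.items()}


def mce_update(logits, acoustic, competitor_score, lr=1.0, gamma=1.0):
    """One MCE-style gradient step on the correct word's pronunciation logits.

    logits           -- hypothetical logit per pronunciation of the correct word
    acoustic         -- assumed alignment log-likelihood per pronunciation
    competitor_score -- assumed log score of the best competing word (held fixed)
    """
    w = softmax(logits)
    # Joint log score of each pronunciation: acoustic score + log weight.
    joint = {p: acoustic[p] + math.log(w[p]) for p in logits}
    m = max(joint.values())
    # Discriminant of the correct word: log-sum over its pronunciations.
    g_correct = m + math.log(sum(math.exp(v - m) for v in joint.values()))
    # Posterior of each pronunciation given the utterance.
    post = {p: math.exp(joint[p] - g_correct) for p in logits}
    # Misclassification measure and smoothed (sigmoid) classification loss.
    d = competitor_score - g_correct
    loss = 1.0 / (1.0 + math.exp(-gamma * d))
    # Gradient descent on the loss; d(g_correct)/d(logit_p) = post[p] - w[p].
    scale = lr * gamma * loss * (1.0 - loss)
    new_logits = {p: logits[p] + scale * (post[p] - w[p]) for p in logits}
    return new_logits, loss


# Hypothetical word with two pronunciation variants, initially equally weighted.
logits = {"t ah m ey t ow": 0.0, "t ah m aa t ow": 0.0}
acoustic = {"t ah m ey t ow": -12.0, "t ah m aa t ow": -9.0}
new_logits, loss = mce_update(logits, acoustic, competitor_score=-9.5)
weights = softmax(new_logits)
print(weights, loss)
```

In this sketch the variant that better explains the utterance gains weight after the step, and the softmax keeps the word's weights normalized without a separate renormalization pass. A fuller treatment would also update the competing word's weights and iterate over many utterances.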

Published/granted documents

US20100125457A1 System and method for discriminative pronunciation modeling for voice search. Publication/grant date: 2010-05-20

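IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/04	. Segmentation; word boundary detection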