-
Publication No.: US06029132A
Publication date: 2000-02-22
Application No.: US70300
Filing date: 1998-04-30
Applicant: Roland Kuhn, Jean-Claude Junqua
Inventor: Roland Kuhn, Jean-Claude Junqua
CPC classification: G10L13/08
Abstract: A two-stage pronunciation generator utilizes mixed decision trees that include a network of yes-no questions about letter, syntax, context, and dialect in a spelled word sequence. A second stage utilizes decision trees that include a network of yes-no questions about adjacent phonemes in the phoneme sequence corresponding to the spelled word sequence. Leaf nodes of the mixed decision trees provide information about which phonetic transcriptions are most probable. Using the mixed trees, scores are developed for each of a plurality of possible pronunciations, and these scores can be used to select the best pronunciation as well as to rank pronunciations in order of probability. The pronunciations generated by the system can be used in speech synthesis and speech recognition applications as well as lexicography applications.
-
82.
Publication No.: US5892813A
Publication date: 1999-04-06
Application No.: US723913
Filing date: 1996-09-30
IPC classification: G10L15/00, G10L13/00, G10L15/18, G10L15/28, H04M1/00, H04M1/247, H04M1/253, H04M1/27, H04M3/42
Abstract: The multimodal telephone prompts the user using both a visual display and synthesized voice. It receives user input via keypad and programmable soft keys associated with the display, and also through user-spoken commands. The voice module includes a two-stage speech recognizer that models speech in terms of high similarity values. A dialog manager associated with the voice module maintains the visual and verbal systems in synchronism with one another. The dialog manager administers a state machine that records the dialog context. The dialog context is used to ensure that the appropriate visual prompts are displayed, showing what commands are possible at any given point in the dialog. The speech recognizer also uses the dialog context to select the recognized word candidate that is appropriate to the current context.
-