Method and apparatus for time-synchronized translation and synthesis of natural-language speech
    1.
    Granted patent
    Status: Expired

    Publication number: US06556972B1

    Publication date: 2003-04-29

    Application number: US09526986

    Filing date: 2000-03-16

    IPC classification: G10L21/00

    Abstract: A multi-lingual time-synchronized translation system and method provide automatic time-synchronized spoken translations of spoken phrases. The system includes a phrase-spotting mechanism, optionally a language understanding mechanism, a translation mechanism, a speech output mechanism, and an event-measuring mechanism. The phrase-spotting mechanism identifies a spoken phrase from a restricted domain of phrases. The language understanding mechanism, if present, maps the identified phrase onto a small set of formal phrases. The translation mechanism maps the formal phrase onto a well-formed phrase in one or more target languages. The event-measuring mechanism measures the duration of various key events in the source phrase. An event duration could be, for example, the overall duration of the input phrase, the duration of the phrase with interword silences omitted, or some other relevant durational feature. The speech output mechanism produces high-quality output speech, using the output of the event-measuring mechanism for time synchronization. The present invention recognizes that quality improvements can be achieved by restricting the task domain under consideration.
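
The two durational features named in the abstract can be illustrated with a minimal sketch. The `WordSpan` type, the function names, and the idea of deriving a playback-rate factor for a pre-recorded target phrase are assumptions made for illustration; the patent text here does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordSpan:
    """Start and end time, in seconds, of one word in the source phrase (hypothetical type)."""
    start: float
    end: float


def overall_duration(spans):
    """Duration from the onset of the first word to the offset of the last word."""
    return spans[-1].end - spans[0].start


def speech_duration(spans):
    """Duration with interword silences omitted: the sum of the word lengths."""
    return sum(s.end - s.start for s in spans)


def playback_rate(target_duration, source_duration):
    """Rate factor at which a target phrase must be played to last as long as the source."""
    if source_duration <= 0:
        raise ValueError("source duration must be positive")
    return target_duration / source_duration


# Three half-second words separated by quarter-second silences.
spans = [WordSpan(0.0, 0.5), WordSpan(0.75, 1.25), WordSpan(1.5, 2.0)]
total = overall_duration(spans)
voiced = speech_duration(spans)
rate = playback_rate(3.0, total)  # a 3-second target phrase must play 1.5x faster
```

Which of the two durations drives the synchronization is a design choice: the overall duration keeps the translated phrase aligned with the speaker's start and end, while the silence-free duration tracks the speaking rate.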

    Method and apparatus for translating natural-language speech using multiple output phrases
    2.
    Granted patent
    Status: In force

    Publication number: US06859778B1

    Publication date: 2005-02-22

    Application number: US09526985

    Filing date: 2000-03-16

    Abstract: A multi-lingual translation system provides multiple output sentences for a given word or phrase. Each output sentence for a given word or phrase reflects, for example, a different emotional emphasis, dialect, accent, loudness or rate of speech. A given output sentence could be selected automatically, or manually as desired, to create a desired effect. For example, the same output sentence for a given word or phrase can be recorded three times, to selectively reflect excitement, sadness or fear. The multi-lingual translation system includes a phrase-spotting mechanism, a translation mechanism, a speech output mechanism and, optionally, a language understanding mechanism or an event-measuring mechanism or both. The phrase-spotting mechanism identifies a spoken phrase from a restricted domain of phrases. The language understanding mechanism, if present, maps the identified phrase onto a small set of formal phrases. The translation mechanism maps the formal phrase onto a well-formed phrase in one or more target languages. The speech output mechanism produces high-quality output speech. The speech output may be time-synchronized to the spoken phrase using the output of the event-measuring mechanism.
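
Selecting one of several recorded output sentences can be sketched as a table lookup. The table layout, the phrase identifier, the style names, and the file names below are all hypothetical; the abstract only states that several variants exist and that one is chosen automatically or manually.

```python
# Hypothetical variant table: (phrase id, target language) -> style -> recording.
VARIANTS = {
    ("goal_scored", "es"): {
        "neutral": "goal_scored_es_neutral.wav",
        "excited": "goal_scored_es_excited.wav",
        "sad": "goal_scored_es_sad.wav",
    },
}


def select_output(phrase_id, language, style=None, default="neutral"):
    """Return the recording for the requested style, or the default style if it is missing."""
    options = VARIANTS[(phrase_id, language)]
    if style in options:
        return options[style]
    return options[default]


manual_choice = select_output("goal_scored", "es", style="excited")
fallback_choice = select_output("goal_scored", "es", style="fearful")
```

An automatic selector would compute `style` from the input (for example from loudness or speaking rate) instead of taking it as an argument.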

    Non-leaf node penalty score assignment system and method for improving acoustic fast match speed in large vocabulary systems
    3.
    Granted patent
    Status: In force

    Publication number: US06275801B1

    Publication date: 2001-08-14

    Application number: US09184870

    Filing date: 1998-11-03

    IPC classification: G10L15/14

    CPC classification: G10L15/08

    Abstract: A method for fast match processing comprises two stages: a pre-processing stage and an on-line stage. The pre-processing stage comprises the steps of computing an a-priori probability of occurrence for each word from an acoustic vocabulary, and deriving a penalty score for each word from said acoustic vocabulary based on each word's a-priori probability of occurrence in an input text. The on-line stage operates on an input text stream and comprises the steps of computing a path score for each word from said input text, combining the computed path score with the derived penalty score to form a combined score, and testing the combined score against a threshold to determine top-ranking candidate words.
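
The two stages can be sketched as follows. This is an illustration under stated assumptions, not the patented scoring: scores are taken to be log-domain values where higher is better, the penalty is taken to be the negative log of an add-one-smoothed unigram probability, and the function names are hypothetical.

```python
import math
from collections import Counter


def derive_penalties(corpus_words, vocabulary):
    """Pre-processing stage: penalty = -log(prior), with add-one smoothing (an assumption)."""
    counts = Counter(corpus_words)
    total = sum(counts[w] for w in vocabulary) + len(vocabulary)
    return {w: -math.log((counts[w] + 1) / total) for w in vocabulary}


def top_candidates(path_scores, penalties, threshold):
    """On-line stage: combine path score and penalty, keep words at or above the threshold."""
    combined = {w: score - penalties[w] for w, score in path_scores.items()}
    kept = [w for w, c in combined.items() if c >= threshold]
    return sorted(kept, key=lambda w: combined[w], reverse=True)


vocabulary = ["the", "cat", "zebra"]
penalties = derive_penalties(["the", "the", "the", "cat"], vocabulary)

# Equal path scores: only the penalties separate the candidates.
path_scores = {"the": -1.0, "cat": -1.0, "zebra": -1.0}
candidates = top_candidates(path_scores, penalties, threshold=-2.5)
```

Because the penalties are computed once, the on-line stage adds only one subtraction and one comparison per word, and rare words such as "zebra" drop out early unless their path score is strong.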

    Semantic language modeling and confidence measurement
    4.
    Patent application
    Status: In force

    Publication number: US20050055209A1

    Publication date: 2005-03-10

    Application number: US10655838

    Filing date: 2003-09-05

    IPC classification: G10L15/18 G10L15/28 G10L15/00

    CPC classification: G10L15/1815

    Abstract: A system and method for speech recognition include generating a set of likely hypotheses in recognizing speech, rescoring the likely hypotheses using semantic content by employing semantic structured language models, and scoring parse trees with the semantic structured language models to identify a best sentence according to the sentence's parse tree, thereby clarifying the recognized speech.
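
Rescoring a list of hypotheses can be sketched as below. The toy `semantic_score` function stands in for a semantic structured language model, which is far richer than a word lookup; the interpolation weight, the scores, and the city list are assumptions made for illustration.

```python
KNOWN_CITIES = {"boston", "denver"}  # hypothetical domain knowledge


def semantic_score(sentence):
    """Toy stand-in for a semantic model: penalize sentences without a known city."""
    has_city = any(word in KNOWN_CITIES for word in sentence.split())
    return 0.0 if has_city else -4.0


def rescore(hypotheses, scorer, weight=0.5):
    """Add the weighted semantic score to each first-pass log score and return the best pair."""
    rescored = [(sent, score + weight * scorer(sent)) for sent, score in hypotheses]
    return max(rescored, key=lambda pair: pair[1])


# (sentence, first-pass log score); the first hypothesis wins before rescoring.
hypotheses = [
    ("i want to fly to austin", -10.5),
    ("i want to fly to boston", -11.0),
]
best_sentence, best_score = rescore(hypotheses, semantic_score)
```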

    Method, apparatus and computer program providing a multi-speaker database for concatenative text-to-speech synthesis
    5.
    Patent application
    Status: In force

    Publication number: US20060229876A1

    Publication date: 2006-10-12

    Application number: US11101223

    Filing date: 2005-04-07

    IPC classification: G10L13/00

    CPC classification: G10L13/07 G10L2021/0135

    Abstract: A method, apparatus and computer program product are provided to generate an audible speech word that corresponds to text. The method includes providing a text word and, in response to the text word, processing pre-recorded speech segments that are derived from a plurality of speakers to selectively concatenate speech segments based on at least one cost function to form audio data for generating an audible speech word that corresponds to the text word. A data structure is also provided for use in a concatenative text-to-speech system that includes a plurality of speech segments derived from a plurality of speakers, where each speech segment includes an associated attribute vector comprising at least one attribute vector element that identifies the speaker from which the speech segment was derived.
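
A minimal sketch of cost-based selection over a multi-speaker inventory follows. The `Segment` fields, the pitch-based join cost, and the speaker-switch penalty are assumptions for illustration; the abstract only requires at least one cost function and a speaker-identifying attribute on each segment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    unit: str      # phonetic unit label
    speaker: str   # attribute identifying the speaker the segment came from
    pitch: float   # illustrative acoustic attribute, in Hz


def join_cost(a, b, speaker_switch_penalty=10.0):
    """Cost of concatenating a then b: pitch mismatch plus a penalty for changing speaker."""
    cost = abs(a.pitch - b.pitch)
    if a.speaker != b.speaker:
        cost += speaker_switch_penalty
    return cost


def select_segments(units, inventory):
    """Dynamic programming over candidate segments, minimizing the total join cost."""
    layers = [[s for s in inventory if s.unit == u] for u in units]
    if any(not layer for layer in layers):
        raise ValueError("inventory lacks a segment for some unit")
    best = {seg: (0.0, [seg]) for seg in layers[0]}  # segment -> (cost, path)
    for layer in layers[1:]:
        new_best = {}
        for cur in layer:
            prev, (cost, path) = min(
                best.items(), key=lambda kv: kv[1][0] + join_cost(kv[0], cur)
            )
            new_best[cur] = (cost + join_cost(prev, cur), path + [cur])
        best = new_best
    return min(best.values(), key=lambda v: v[0])[1]


inventory = [
    Segment("h", "A", 100.0), Segment("h", "B", 120.0),
    Segment("ai", "A", 125.0), Segment("ai", "B", 121.0),
]
chosen = select_segments(["h", "ai"], inventory)
```

With a finite switch penalty the search may still mix speakers when one speaker lacks a good candidate, which is the point of pooling recordings from several speakers.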

    Method and apparatus for fast semi-automatic semantic annotation
    8.
    Patent application
    Status: In force

    Publication number: US20060074634A1

    Publication date: 2006-04-06

    Application number: US10959523

    Filing date: 2004-10-06

    IPC classification: G06F17/27

    CPC classification: G06F17/271 G06F17/2755

    Abstract: A method, apparatus and computer instructions are provided for fast semi-automatic semantic annotation. Given a limited annotated corpus, the present invention assigns a tag and a label to each word of the next limited annotated corpus using a parser engine, a similarity engine, and an SVM engine. A rover then combines the parse trees from the three engines and annotates the next chunk of limited annotated corpus with confidence, such that the effort required for human annotation is reduced.
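
The combination step can be sketched as a per-word majority vote. This simplification votes on flat tag sequences rather than full parse trees, and it uses the share of agreeing engines as the confidence; both choices, and the function name, are assumptions.

```python
from collections import Counter


def rover_vote(parser_tags, similarity_tags, svm_tags):
    """Return (tag, confidence) per word, where confidence is the share of agreeing engines."""
    if not (len(parser_tags) == len(similarity_tags) == len(svm_tags)):
        raise ValueError("all engines must tag the same number of words")
    combined = []
    for votes in zip(parser_tags, similarity_tags, svm_tags):
        # On a three-way tie, most_common keeps the first-seen tag (the parser's).
        tag, count = Counter(votes).most_common(1)[0]
        combined.append((tag, count / len(votes)))
    return combined


parser_tags = ["O", "city", "O"]
similarity_tags = ["O", "city", "date"]
svm_tags = ["O", "time", "O"]
annotated = rover_vote(parser_tags, similarity_tags, svm_tags)

# Words below full agreement could be routed to a human annotator.
needs_review = [i for i, (_, conf) in enumerate(annotated) if conf < 1.0]
```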

    Enhanced likelihood computation using regression in a speech recognition system
    9.
    Granted patent
    Status: Expired

    Publication number: US06493667B1

    Publication date: 2002-12-10

    Application number: US09368669

    Filing date: 1999-08-05

    IPC classification: G10L15/14

    CPC classification: G10L15/144 G10L2015/085

    Abstract: In order to achieve low error rates in a speech recognition system, for example, in a system employing rank-based decoding, we discriminate the most confusable incorrect leaves from the correct leaf by lowering their ranks. That is, we increase the likelihood of the correct leaf of a frame, while decreasing the likelihoods of the confusable leaves. In order to do this, we use the auxiliary information from the prediction of the neighboring frames to augment the likelihood computation of the current frame. We then use the residual errors in the predictions of neighboring frames to discriminate between the correct (best) and incorrect leaves of a given frame. We present a new methodology that incorporates prediction error likelihoods into the overall likelihood computation to improve the rank position of the correct leaf.
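
The idea of adding a prediction-error likelihood to a leaf's frame likelihood can be sketched with scalar features. Everything specific here is an assumption for illustration: one-dimensional Gaussians, a per-leaf linear regression that predicts the next frame from the current one, and an additive log-domain combination with a tunable weight.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def augmented_scores(frame, next_frame, leaves, weight=1.0):
    """Frame log-likelihood plus the weighted log-likelihood of the regression residual."""
    scores = {}
    for name, leaf in leaves.items():
        frame_ll = gaussian_log_pdf(frame, leaf["mean"], leaf["var"])
        predicted = leaf["slope"] * frame + leaf["intercept"]
        residual_ll = gaussian_log_pdf(next_frame - predicted, 0.0, leaf["residual_var"])
        scores[name] = frame_ll + weight * residual_ll
    return scores


def rank(scores):
    """Leaf names ordered from most to least likely."""
    return sorted(scores, key=scores.get, reverse=True)


# Leaf "b" fits the current frame slightly better, but predicts the next frame badly.
leaves = {
    "a": {"mean": 0.0, "var": 1.0, "slope": 1.0, "intercept": 0.0, "residual_var": 1.0},
    "b": {"mean": 0.1, "var": 1.0, "slope": -1.0, "intercept": 0.0, "residual_var": 1.0},
}
frame_only_rank = rank(augmented_scores(0.1, 0.1, leaves, weight=0.0))
augmented_rank = rank(augmented_scores(0.1, 0.1, leaves, weight=1.0))
```

In this toy case the residual term moves leaf "a" from second to first place, which mirrors the abstract's goal of improving the rank position of the correct leaf.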
