Patent search: ap:("Mark Epstein" OR "Hakan Erdogan" OR "Yuqing Gao" OR "Michael Picheny" OR "Ruhi Sarikaya") AND inv:"Michael Picheny" (page 1)

1.

Patent application
Semantic language modeling and confidence measurement (Status: in force)

Publication number: US20050055209A1

Publication date: 2005-03-10

Application number: US10655838

Filing date: 2003-09-05

Applicants: Mark Epstein, Hakan Erdogan, Yuqing Gao, Michael Picheny, Ruhi Sarikaya

Inventors: Mark Epstein, Hakan Erdogan, Yuqing Gao, Michael Picheny, Ruhi Sarikaya

IPC classification: G10L15/18, G10L15/28, G10L15/00

CPC classification: G10L15/1815

Abstract: A system and method for speech recognition includes generating a set of likely hypotheses in recognizing speech, rescoring the likely hypotheses by using semantic content by employing semantic structured language models, and scoring parse trees to identify a best sentence according to the sentence's parse tree by employing the semantic structured language models to clarify the recognized speech.
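
The rescoring step described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the function names, the linear score combination, the `weight` parameter, and the toy scorer standing in for a semantic structured language model are all assumptions made for illustration.

```python
def rescore_nbest(hypotheses, semantic_score, weight=0.5):
    """Pick the best sentence from an N-best list after rescoring.

    hypotheses: list of (sentence, first_pass_log_score) pairs.
    semantic_score: callable mapping a sentence to a log score; a
        hypothetical stand-in for a semantic structured language model.
    weight: assumed interpolation weight for the semantic score.
    """
    best = max(hypotheses, key=lambda h: h[1] + weight * semantic_score(h[0]))
    return best[0]


def toy_semantic_score(sentence):
    # Toy scorer (assumption): penalize sentences lacking the concept "flight".
    return 0.0 if "flight" in sentence.split() else -5.0


nbest = [
    ("book a fright to boston", -10.0),  # better first-pass score
    ("book a flight to boston", -10.5),  # better semantic score
]
print(rescore_nbest(nbest, toy_semantic_score))
```

Here the second hypothesis wins after rescoring even though its first-pass score is lower, which is the effect the abstract attributes to using semantic content.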

2.

Patent application
Method and apparatus for fast semi-automatic semantic annotation (Status: in force)

Publication number: US20060074634A1

Publication date: 2006-04-06

Application number: US10959523

Filing date: 2004-10-06

Applicants: Yuqing Gao, Michael Picheny, Ruhi Sarikaya

Inventors: Yuqing Gao, Michael Picheny, Ruhi Sarikaya

IPC classification: G06F17/27

CPC classification: G06F17/271, G06F17/2755

Abstract: A method, apparatus and computer instructions are provided for fast semi-automatic semantic annotation. Given a limited annotated corpus, the present invention assigns a tag and a label to each word of the next limited annotated corpus using a parser engine, a similarity engine, and a SVM engine. A rover then combines the parse trees from the three engines and annotates the next chunk of limited annotated corpus with confidence, such that the effort required for human annotation is reduced.

3.

Granted patent
Enhanced likelihood computation using regression in a speech recognition system (Status: not in force)

Publication number: US06493667B1

Publication date: 2002-12-10

Application number: US09368669

Filing date: 1999-08-05

Applicants: Peter V. de Souza, Yuqing Gao, Michael Picheny, Bhuvana Ramabhadran

Inventors: Peter V. de Souza, Yuqing Gao, Michael Picheny, Bhuvana Ramabhadran

IPC classification: G10L15/14

CPC classification: G10L15/144, G10L2015/085

Abstract: In order to achieve low error rates in a speech recognition system, for example, in a system employing rank-based decoding, we discriminate the most confusable incorrect leaves from the correct leaf by lowering their ranks. That is, we increase the likelihood of the correct leaf of a frame, while decreasing the likelihoods of the confusable leaves. In order to do this, we use the auxiliary information from the prediction of the neighboring frames to augment the likelihood computation of the current frame. We then use the residual errors in the predictions of neighboring frames to discriminate between the correct (best) and incorrect leaves of a given frame. We present a new methodology that incorporates prediction error likelihoods into the overall likelihood computation to improve the rank position of the correct leaf.

4.

Granted patent
Method and apparatus for time-synchronized translation and synthesis of natural-language speech (Status: not in force)

Publication number: US06556972B1

Publication date: 2003-04-29

Application number: US09526986

Filing date: 2000-03-16

Applicants: Raimo Bakis, Mark Edward Epstein, William Stuart Meisel, Miroslav Novak, Michael Picheny, Ridley M. Whitaker

Inventors: Raimo Bakis, Mark Edward Epstein, William Stuart Meisel, Miroslav Novak, Michael Picheny, Ridley M. Whitaker

IPC classification: G10L21/00

CPC classification: G06F17/289, G10L15/18, G10L2015/088

Abstract: A multi-lingual time-synchronized translation system and method provide automatic time-synchronized spoken translations of spoken phrases. The multi-lingual time-synchronized translation system includes a phrase-spotting mechanism, optionally, a language understanding mechanism, a translation mechanism, a speech output mechanism and an event measuring mechanism. The phrase-spotting mechanism identifies a spoken phrase from a restricted domain of phrases. The language understanding mechanism, if present, maps the identified phrase onto a small set of formal phrases. The translation mechanism maps the formal phrase onto a well-formed phrase in one or more target languages. The speech output mechanism produces high-quality output speech using the output of the event measuring mechanism for time synchronization. The event-measuring mechanism measures the duration of various key events in the source phrase. Event duration could be, for example, the overall duration of the input phrase, the duration of the phrase with interword silences omitted, or some other relevant durational features. The present invention recognizes that quality improvements can be achieved by restricting the task domain under consideration.

5.

Patent application
Method, apparatus and computer program providing a multi-speaker database for concatenative text-to-speech synthesis (Status: in force)

Publication number: US20060229876A1

Publication date: 2006-10-12

Application number: US11101223

Filing date: 2005-04-07

Applicants: Andrew Aaron, Ellen Eide, Wael Hamza, Michael Picheny, Charles Rutherfoord, Zhi Shuang, Maria Smith

Inventors: Andrew Aaron, Ellen Eide, Wael Hamza, Michael Picheny, Charles Rutherfoord, Zhi Shuang, Maria Smith

IPC classification: G10L13/00

CPC classification: G10L13/07, G10L2021/0135

Abstract: A method, apparatus and a computer program product to generate an audible speech word that corresponds to text. The method includes providing a text word and, in response to the text word, processing pre-recorded speech segments that are derived from a plurality of speakers to selectively concatenate together speech segments based on at least one cost function to form audio data for generating an audible speech word that corresponds to the text word. A data structure is also provided for use in a concatenative text-to-speech system that includes a plurality of speech segments derived from a plurality of speakers, where each speech segment includes an associated attribute vector each of which is comprised of at least one attribute vector element that identifies the speaker from which the speech segment was derived.

6.

Patent application
Translating Between Spoken and Written Language (Status: in force)

Publication number: US20120290299A1

Publication date: 2012-11-15

Application number: US13107001

Filing date: 2011-05-13

Applicants: Sara H. Basson, Rick Hamilton, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael Picheny, Bhuvana Ramabhadran, Tara N. Sainath

Inventors: Sara H. Basson, Rick Hamilton, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael Picheny, Bhuvana Ramabhadran, Tara N. Sainath

IPC classification: G10L15/26

CPC classification: G06F17/28, G06F17/2785, G06F17/289, G10L15/063, G10L15/26

Abstract: Techniques for converting spoken speech into written speech are provided. The techniques include transcribing input speech via speech recognition, mapping each spoken utterance from input speech into a corresponding formal utterance, and mapping each formal utterance into a stylistically formatted written utterance.

7.

Patent application
Speech recognition utilizing multitude of speech features (Status: not in force)

Publication number: US20050119885A1

Publication date: 2005-06-02

Application number: US10724536

Filing date: 2003-11-28

Applicants: Scott Axelrod, Sreeram Balakrishnan, Stanley Chen, Yuging Gao, Ramesh Gopinath, Hong-Kwang Kuo, Benoit Maison, David Nahamoo, Michael Picheny, George Saon, Geoffrey Zweig

Inventors: Scott Axelrod, Sreeram Balakrishnan, Stanley Chen, Yuging Gao, Ramesh Gopinath, Hong-Kwang Kuo, Benoit Maison, David Nahamoo, Michael Picheny, George Saon, Geoffrey Zweig

IPC classification: G10L15/10, G10L15/00, G10L15/02, G10L15/06, G10L15/14

CPC classification: G10L15/063, G10L15/02, G10L15/14, G10L2015/085

Abstract: In a speech recognition system, the combination of a log-linear model with a multitude of speech features is provided to recognize unknown speech utterances. The speech recognition system models the posterior probability of linguistic units relevant to speech recognition using a log-linear model. The posterior model captures the probability of the linguistic unit given the observed speech features and the parameters of the posterior model. The posterior model may be determined using the probability of the word sequence hypotheses given a multitude of speech features. Log-linear models are used with features derived from sparse or incomplete data. The speech features that are utilized may include asynchronous, overlapping, and statistically non-independent speech features. Not all features used in training need to appear in testing/recognition.
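
For orientation, the generic textbook form of a log-linear posterior model is shown below. The notation is an assumption and is not taken from the patent: $w$ is a linguistic unit, $o$ the observed speech features, $f_i$ the feature functions, and $\lambda_i$ the model parameters.

```latex
P(w \mid o; \lambda) =
  \frac{\exp\left(\sum_i \lambda_i f_i(w, o)\right)}
       {\sum_{w'} \exp\left(\sum_i \lambda_i f_i(w', o)\right)}
```

Because a feature $f_i$ that does not fire contributes zero to the sum, a model of this form can still be evaluated when some training-time features are absent at recognition time, which is consistent with the abstract's final sentence.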

8.

Granted patent
Non-leaf node penalty score assignment system and method for improving acoustic fast match speed in large vocabulary systems (Status: in force)

Publication number: US06275801B1

Publication date: 2001-08-14

Application number: US09184870

Filing date: 1998-11-03

Applicants: Miroslav Novak, Michael Picheny

Inventors: Miroslav Novak, Michael Picheny

IPC classification: G10L15/14

CPC classification: G10L15/08

Abstract: A method for fast match processing, comprising two stages, a pre-processing stage and an on-line stage. The pre-processing stage comprises the steps of computing an a-priori probability of occurrence for each word from an acoustic vocabulary; deriving a penalty score for each word from said acoustic vocabulary based on each word's a-priori probability of occurrence in an input text. The on-line stage operates on an input text stream, comprising the steps of computing a path score for each word from said input text; combining the computed path score with the derived penalty score to form a combined score and testing the combined score against a threshold to determine top ranking candidate words.
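
The two stages named in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented procedure: the add-one smoothing, the negative-log penalty, subtracting the penalty from a log-domain path score, the fixed threshold, and all function names are hypothetical choices.

```python
import math
from collections import Counter


def derive_penalties(training_words, vocabulary):
    """Pre-processing stage: a-priori probability -> penalty score per word."""
    counts = Counter(w for w in training_words if w in vocabulary)
    total = sum(counts.values()) + len(vocabulary)  # add-one smoothing (assumption)
    # Rare words get a larger penalty (assumed form: negative log prior).
    return {w: -math.log((counts[w] + 1) / total) for w in vocabulary}


def fast_match(path_scores, penalties, threshold):
    """On-line stage: combine scores, then keep candidates above the threshold."""
    combined = {w: score - penalties[w] for w, score in path_scores.items()}
    kept = [w for w, c in combined.items() if c >= threshold]
    return sorted(kept, key=lambda w: combined[w], reverse=True)


vocabulary = {"the", "cat", "zebra"}
penalties = derive_penalties(["the", "the", "the", "cat"], vocabulary)
# Hypothetical log-domain path scores for the current segment.
path_scores = {"the": -2.0, "cat": -1.5, "zebra": -1.4}
print(fast_match(path_scores, penalties, threshold=-3.0))
```

In this toy run "zebra" has the best raw path score but is pruned, because its low a-priori probability gives it the largest penalty.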

9.

Granted patent
Text processing using natural language understanding (Status: in force)

Publication number: US08856004B2

Publication date: 2014-10-07

Application number: US13107001

Filing date: 2011-05-13

Applicants: Sara H. Basson, Rick Hamilton, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael Picheny, Bhuvana Ramabhadran, Tara N. Sainath

Inventors: Sara H. Basson, Rick Hamilton, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael Picheny, Bhuvana Ramabhadran, Tara N. Sainath

IPC classification: G10L15/26

CPC classification: G06F17/28, G06F17/2785, G06F17/289, G10L15/063, G10L15/26

Abstract: Techniques for converting spoken speech into written speech are provided. The techniques include transcribing input speech via speech recognition, mapping each spoken utterance from input speech into a corresponding formal utterance, and mapping each formal utterance into a stylistically formatted written utterance.

10.

Patent application
Methods and apparatus for adapting output speech in accordance with context of communication (Status: in force)

Publication number: US20060229873A1

Publication date: 2006-10-12

Application number: US11092057

Filing date: 2005-03-29

Applicants: Ellen Eide, Wael Hamza, Michael Picheny

Inventors: Ellen Eide, Wael Hamza, Michael Picheny

IPC classification: G10L13/08

CPC classification: G10L13/027, G10L15/22

Abstract: A technique for producing speech output in an automatic dialog system is provided. Communication is received from a user at the automatic dialog system. A context of the communication from the user is detected in a context detector of the automatic dialog system. A message is provided to the user from a text-to-speech system of the automatic dialog system in communication with the context detector, wherein the message is provided in accordance with the detected context of the communication.

