Methods and systems of routing utterances based on confidence estimates
    1.
    Granted invention patent
    Methods and systems of routing utterances based on confidence estimates (in force)

    Publication No.: US07003456B2

    Publication date: 2006-02-21

    Application No.: US09878173

    Filing date: 2001-06-12

    IPC classification: G10L15/26

    CPC classification: G10L15/32 G10L15/10

    Abstract: A computer-based method of routing a message to a system includes receiving a message, and processing the message using large-vocabulary continuous speech recognition to generate a string of text corresponding to the message. The method includes generating a confidence estimate of the string of text corresponding to the message and comparing the confidence estimate to a predetermined threshold. If the confidence estimate satisfies the predetermined threshold, the string of text is forwarded to the system. If the confidence estimate does not satisfy the predetermined threshold, information relating to the message is forwarded to a transcriptionist. The message may include one or more utterances. Each utterance in the message may be separately or jointly processed. In this way, a confidence estimate may be generated and evaluated for each utterance or for the whole message. Information relating to each utterance may be separately or jointly forwarded based on the results of the generation and evaluation.
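
The routing rule in this abstract can be illustrated with a minimal sketch. The names `Recognized`, `destination`, and `route_message` are hypothetical, the confidence values are assumed to come from some recognizer that is not shown, "satisfies the threshold" is assumed to mean greater than or equal, and the whole-message estimate is assumed to be the lowest per-utterance confidence. The abstract does not specify any of these details.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Recognized:
    text: str          # text string produced by the recognizer
    confidence: float  # confidence estimate, assumed to lie in [0, 1]


def destination(confidence: float, threshold: float) -> str:
    # Assumption: "satisfies the threshold" means confidence >= threshold.
    return "system" if confidence >= threshold else "transcriptionist"


def route_message(utterances: List[Recognized], threshold: float,
                  jointly: bool = False) -> List[Tuple[str, str]]:
    """Pair each utterance's text with the destination it is forwarded to."""
    if not utterances:
        return []
    if jointly:
        # Assumption: the whole-message estimate is the weakest utterance's.
        overall = min(u.confidence for u in utterances)
        dest = destination(overall, threshold)
        return [(u.text, dest) for u in utterances]
    return [(u.text, destination(u.confidence, threshold)) for u in utterances]
```

With separate processing, a confident utterance goes straight to the system while a doubtful one in the same message goes to the transcriptionist. With joint processing, one doubtful utterance sends the whole message for transcription.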

    Sequential, nonparametric speech recognition and speaker identification
    2.
    Granted invention patent
    Sequential, nonparametric speech recognition and speaker identification (expired)

    Publication No.: US6029124A

    Publication date: 2000-02-22

    Application No.: US50428

    Filing date: 1998-03-31

    IPC classification: G10L15/08 G10L17/00 G10L9/00

    Abstract: A speech sample is evaluated using a computer. Training data that include samples of speech are received and stored along with identification of speech elements to which portions of the training data are related. A speech sample is received and speech recognition is performed on the speech sample to produce recognition results. Finally, the recognition results are evaluated in view of the training data and the identification of the speech elements to which the portions of the training data are related. The technique may be used to perform tasks such as speech recognition, speaker identification, and language identification.

    Method of producing alternate utterance hypotheses using auxiliary information on close competitors
    3.
    Granted invention patent
    Method of producing alternate utterance hypotheses using auxiliary information on close competitors (in force)

    Publication No.: US07676367B2

    Publication date: 2010-03-09

    Application No.: US10783518

    Filing date: 2004-02-20

    IPC classification: G10L15/04 G10L15/00

    Abstract: A method of constructing a list of alternate transcripts from a recognized transcript includes generating a list of close call records, matching partial sub-histories from the recognized transcript with one of the history pairs stored in each of the records, and substituting the other of the history pairs for the partial sub-history of the recognized transcript. A close call record is generated each time a pair of partial hypotheses attempts to seed a common word. Each close call record includes history information and scoring information associated with a particular pair of partial hypotheses seeding a common word. Alternate transcripts are constructed by substituting close call histories for partial histories of the recognized transcripts, and also by substituting close call histories for partial histories of other alternate transcripts.
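
The substitution step can be sketched roughly as follows. This is a simplified illustration, not the patented procedure: close call records are reduced to pairs of non-empty word sequences, the scoring information each record carries is ignored, and the names `substitute` and `alternates` are hypothetical.

```python
from collections import deque
from typing import Iterator, List, Sequence, Tuple

Words = Tuple[str, ...]


def substitute(transcript: Words, old: Words, new: Words) -> Iterator[Words]:
    """Yield copies of transcript with one occurrence of old replaced by new."""
    n = len(old)
    for i in range(len(transcript) - n + 1):
        if transcript[i:i + n] == old:
            yield transcript[:i] + new + transcript[i + n:]


def alternates(recognized: Words,
               close_calls: Sequence[Tuple[Words, Words]],
               max_alternates: int = 10) -> List[Words]:
    """Build alternates from the recognized transcript and from earlier alternates."""
    seen = {recognized}
    queue = deque([recognized])
    found: List[Words] = []
    while queue and len(found) < max_alternates:
        current = queue.popleft()
        for first, second in close_calls:
            # Either history of the pair may match; the other one is substituted.
            for old, new in ((first, second), (second, first)):
                for alt in substitute(current, old, new):
                    if alt not in seen:
                        seen.add(alt)
                        found.append(alt)
                        queue.append(alt)
    return found[:max_alternates]
```

Because new alternates are queued and processed in turn, a transcript that combines two different close calls is reached by substituting into an earlier alternate, which mirrors the last sentence of the abstract.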

    Lexical tree pre-filtering in speech recognition
    4.
    Granted invention patent
    Lexical tree pre-filtering in speech recognition (expired)

    Publication No.: US5822730A

    Publication date: 1998-10-13

    Application No.: US701393

    Filing date: 1996-08-22

    IPC classification: G10L15/08 G10L15/18 G10L5/06

    Abstract: A speech recognition technique uses lexical tree pre-filtering to obtain lists of words for use in performing speech recognition. The lexical tree pre-filtering includes representing a vocabulary of words using a lexical tree and identifying a first subset of the vocabulary that may correspond to speech spoken beginning at a first time by propagating through the lexical tree information about the speech spoken beginning at the first time. A second subset of the vocabulary that may correspond to speech spoken beginning at a second time is identified by propagating through the lexical tree information about the speech spoken beginning at the second time. Words included in the speech are recognized by comparing speech spoken beginning at the first time with words from the first subset of the vocabulary and speech spoken beginning at the second time with words from the second subset of the vocabulary. The state of the lexical tree is not reset between identifying the first and second subsets.
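
As a rough illustration of the data structure only, a lexical tree can be sketched as a prefix tree over phoneme sequences, so that words sharing an initial sound share nodes. The sketch below assumes exact phoneme labels instead of acoustic scores and does not model the propagation across start times or the no-reset behaviour described above. `Node`, `build_tree`, and `candidates` are hypothetical names, and homophones would need a list of words per node.

```python
from typing import Dict, List, Optional


class Node:
    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.word: Optional[str] = None  # set when a word ends at this node


def build_tree(lexicon: Dict[str, List[str]]) -> Node:
    """Insert each word's phoneme sequence; shared prefixes share nodes."""
    root = Node()
    for word, phonemes in lexicon.items():
        node = root
        for p in phonemes:
            node = node.children.setdefault(p, Node())
        node.word = word
    return root


def candidates(root: Node, observed: List[str]) -> List[str]:
    """Return the words whose pronunciation starts with the observed phonemes."""
    node = root
    for p in observed:
        if p not in node.children:
            return []
        node = node.children[p]
    words, stack = [], [node]
    while stack:  # collect every word in the subtree below the matched prefix
        current = stack.pop()
        if current.word is not None:
            words.append(current.word)
        stack.extend(current.children.values())
    return sorted(words)
```

The returned list plays the role of the vocabulary subset that is handed to the more expensive word-level comparison.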

    Multilingual speech recognition
    6.
    Granted invention patent
    Multilingual speech recognition (in force)

    Publication No.: US08065144B1

    Publication date: 2011-11-22

    Application No.: US12699172

    Filing date: 2010-02-03

    IPC classification: G10L15/06 G10L15/28

    CPC classification: G10L15/005

    Abstract: A method for speech recognition. The method uses a single pronunciation estimator to train acoustic phoneme models and recognize utterances from multiple languages. The method includes accepting text spellings of training words in a plurality of sets of training words, each set corresponding to a different one of a plurality of languages. The method also includes, for each of the sets of training words in the plurality, receiving pronunciations for the training words in the set, the pronunciations being characteristic of native speakers of the language of the set, the pronunciations also being in terms of subword units at least some of which are common to two or more of the languages. The method also includes training a single pronunciation estimator using data comprising the text spellings and the pronunciations of the training words.

    Systems and methods for word recognition
    7.
    Granted invention patent
    Systems and methods for word recognition (expired)

    Publication No.: US5680511A

    Publication date: 1997-10-21

    Application No.: US477287

    Filing date: 1995-06-07

    IPC classification: G10L15/18 G10L9/00

    CPC classification: G10L15/1815

    Abstract: In one aspect, the invention provides word recognition systems that operate to recognize an unrecognized or ambiguous word that occurs within a passage of words. The system can offer several words as choice words for inserting into the passage to replace the unrecognized word. The system can select the best choice word by using the choice word to extract, from a reference source, sample passages of text that relate to the choice word. For example, the system can select the dictionary passage that defines the choice word. The system then compares the selected passage to the current passage, and generates a score that indicates the likelihood that the choice word would occur within that passage of text. The system can select the choice word with the best score to substitute into the passage. The passage of words being analyzed can be any word sequence including an utterance, a portion of handwritten text, a portion of typewritten text or other such sequence of words, numbers and characters. Alternative embodiments of the present invention are disclosed which function to retrieve documents from a library as a function of context.
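
One very simple way to picture the comparison step is to count the words that the current passage shares with each choice word's reference passage. The overlap count below is an assumed stand-in for the likelihood score, which the abstract does not define, and `overlap_score` and `best_choice` are hypothetical names.

```python
from typing import Dict


def overlap_score(passage: str, reference: str) -> int:
    """Count distinct words shared by the passage and a reference passage."""
    return len(set(passage.lower().split()) & set(reference.lower().split()))


def best_choice(passage: str, references: Dict[str, str]) -> str:
    """Pick the choice word whose reference passage best matches the passage.

    `references` maps each choice word to a passage about it, for example
    its dictionary definition. Ties go to the first choice word listed.
    """
    return max(references, key=lambda word: overlap_score(passage, references[word]))
```

For a passage about a boat drifting toward a river, a definition of "bank" that mentions a river scores higher than a definition of "tank" that does not, so "bank" is substituted.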

    Training and using pronunciation guessers in speech recognition
    8.
    Granted invention patent
    Training and using pronunciation guessers in speech recognition (in force)

    Publication No.: US07467087B1

    Publication date: 2008-12-16

    Application No.: US10684135

    Filing date: 2003-10-10

    IPC classification: G10L13/00 G10L15/00 G10L15/06

    Abstract: The error rate of a pronunciation guesser that guesses the phonetic spelling of words used in speech recognition is improved by causing its training to weigh letter-to-phoneme mappings used as data in such training as a function of the frequency of the words in which such mappings occur. Preferably, the ratio of the weight to word frequency increases as word frequency decreases. Acoustic phoneme models for use in speech recognition with phonetic spellings generated by a pronunciation guesser that makes errors are trained against word models whose phonetic spellings have been generated by a pronunciation guesser that makes similar errors. As a result, the acoustic models represent blends of phoneme sounds that reflect the spelling errors made by the pronunciation guessers. Speech recognition enabled systems are made by storing in them both a pronunciation guesser and a corresponding set of such blended acoustic models.
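
The preferred weighting property can be made concrete with one family of functions that has it. The abstract does not name a specific function. The power law below, with exponent `alpha` strictly between 0 and 1, is only an assumed example, and `mapping_weight` is a hypothetical name.

```python
def mapping_weight(word_frequency: float, alpha: float = 0.5) -> float:
    """Weight for letter-to-phoneme mappings from a word of the given frequency.

    With weight = frequency ** alpha and 0 < alpha < 1, the ratio
    weight / frequency equals frequency ** (alpha - 1), which grows as the
    frequency shrinks. Frequent words still receive larger weights in absolute
    terms, but rare words count for more than their raw frequency suggests.
    """
    if word_frequency <= 0:
        raise ValueError("word_frequency must be positive")
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return word_frequency ** alpha
```

With the default exponent, a word seen 100 times gets weight 10 (ratio 0.1), while a word seen 4 times gets weight 2 (ratio 0.5).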

    Large-vocabulary continuous speech prefiltering and processing system
    9.
    Granted invention patent
    Large-vocabulary continuous speech prefiltering and processing system (expired)

    Publication No.: US5202952A

    Publication date: 1993-04-13

    Application No.: US542520

    Filing date: 1990-06-22

    IPC classification: G10L15/08 G10L15/10 G10L15/28

    CPC classification: G10L15/10

    Abstract: A continuous speech prefiltering system for use in continuous speech recognition computer systems. The speech to be recognized is converted from utterances to frame data sets, which frame data sets are smoothed to generate a smooth frame model over a predetermined number of frames. A resident vocabulary is stored within the computer as clusters of word models which are acoustically similar over a succession of frame periods. A cluster score is generated by the system, which score includes the likelihood of the smooth frames evaluated using a probability model for the cluster against which the smooth frame model is being compared. Cluster sets having cluster scores below a predetermined acoustic threshold are removed from further consideration. The remaining cluster sets are unpacked for determination of a word score for each unpacked word. These word scores are used to identify those words which are above a second predetermined threshold to define a word list which is sent to a recognizer for a more lengthy word match. A controller enables the system to initialize times corresponding to the frame start time for each frame data set, defining a sliding window.
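
The two-threshold pruning described here can be sketched as follows, assuming the cluster and word scores have already been computed elsewhere and that higher scores are better. Frame smoothing, the probability models, and the sliding window are not shown, and `prefilter` and its parameters are hypothetical names.

```python
from typing import Dict, List


def prefilter(clusters: Dict[str, List[str]],
              cluster_scores: Dict[str, float],
              word_scores: Dict[str, float],
              cluster_threshold: float,
              word_threshold: float) -> List[str]:
    """Return the word list passed on to the detailed recognizer."""
    word_list: List[str] = []
    for name, words in clusters.items():
        # Stage 1: drop whole clusters scoring below the acoustic threshold.
        if cluster_scores[name] < cluster_threshold:
            continue
        # Stage 2: unpack the surviving cluster and keep words above the
        # second threshold.
        for word in words:
            if word_scores[word] > word_threshold:
                word_list.append(word)
    return word_list
```

Scoring a few clusters first and unpacking only the survivors keeps the number of per-word evaluations small, which is the point of prefiltering a large vocabulary.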

    Pronunciation discovery for spoken words
    10.
    Granted invention patent
    Pronunciation discovery for spoken words (in force)

    Publication No.: US08577681B2

    Publication date: 2013-11-05

    Application No.: US10939942

    Filing date: 2004-09-13

    IPC classification: G10L15/06 G10L15/00

    Abstract: A method of generating an alternative pronunciation for a word or phrase, given an initial pronunciation and a spoken example of the word or phrase, includes providing the initial pronunciation of the word or phrase, and generating the alternative pronunciation by searching a neighborhood of pronunciations about the initial pronunciation via a constrained hypothesis, wherein the neighborhood includes pronunciations that differ from the initial pronunciation by at most one phoneme. The method further includes selecting a highest scoring pronunciation within the neighborhood of pronunciations.
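
A minimal sketch of the neighborhood search follows. It assumes "differ by at most one phoneme" covers a single substitution, insertion, or deletion, which the abstract does not spell out. Scoring against the spoken example is left to a caller-supplied `score` function, and `neighborhood` and `best_pronunciation` are hypothetical names.

```python
from typing import Callable, Iterable, Sequence, Set, Tuple

Pron = Tuple[str, ...]


def neighborhood(initial: Sequence[str], phonemes: Iterable[str]) -> Set[Pron]:
    """All pronunciations within one phoneme edit of the initial one."""
    base = tuple(initial)
    inventory = tuple(phonemes)
    result = {base}                                      # zero edits
    for i in range(len(base)):
        result.add(base[:i] + base[i + 1:])              # one deletion
        for p in inventory:
            result.add(base[:i] + (p,) + base[i + 1:])   # one substitution
    for i in range(len(base) + 1):
        for p in inventory:
            result.add(base[:i] + (p,) + base[i:])       # one insertion
    result.discard(())                                   # skip the empty pronunciation
    return result


def best_pronunciation(initial: Sequence[str], phonemes: Iterable[str],
                       score: Callable[[Pron], float]) -> Pron:
    """Select the highest-scoring pronunciation in the neighborhood."""
    return max(neighborhood(initial, phonemes), key=score)
```

In practice `score` would measure how well a candidate matches the spoken example. Restricting the search to one edit keeps the candidate set small, on the order of the pronunciation length times the size of the phoneme inventory.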
