Smart training and smart scoring in SD speech recognition system with user defined vocabulary
    71.
    Granted invention patent
    Status: Active

    Publication number: US06535850B1

    Publication date: 2003-03-18

    Application number: US09522448

    Filing date: 2000-03-09

    Applicant: Aruna Bayya

    Inventor: Aruna Bayya

    CPC classification number: G10L15/07 G10L2015/0635

    Abstract: In a speech training and recognition system, the current invention detects and warns the user about the similar sounding entries to vocabulary and permits entry of such confusingly similar terms which are marked along with the stored similar terms to identify the similar words. In addition, the states in similar words are weighted to apply more emphasis to the differences between similar words than the similarities of such words. Another aspect of the current invention is to use modified scoring algorithm to improve the recognition performance in the case where confusing entries were made to the vocabulary despite the warning. Yet another aspect of the current invention is to detect and warn the user about potential problems with new entries such as short words and two or more word entries with long silence periods in between words. Finally, the current invention also includes alerting the user about the dissimilarity of the multiple tokens of the same vocabulary item in the case of multiple-token training.
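
A minimal sketch of the "warn about similar-sounding entries" step described in this abstract. The patent works on trained speech models; this example swaps in plain edit distance over the spelled entries as a stand-in similarity measure. The names `edit_distance`, `find_confusable`, and the `max_ratio` threshold are hypothetical and do not come from the patent.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (stand-in for acoustic distance)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def find_confusable(new_entry, vocabulary, max_ratio=0.34):
    """Return stored entries whose normalized distance to new_entry is small.

    max_ratio is an assumed threshold, not a value from the patent.
    """
    hits = []
    for word in vocabulary:
        longest = max(len(new_entry), len(word), 1)
        if edit_distance(new_entry, word) / longest <= max_ratio:
            hits.append(word)
    return hits


# The entry is still accepted; the caller only warns and marks the similar pair.
similar = find_confusable("call bob", ["call rob", "play music"])
if similar:
    print("Warning: new entry sounds similar to", similar)
```

In a system of the kind described, the flagged pairs would then be marked so that scoring can emphasize the parts where they differ.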

    Automated centralized updating of speech recognition systems
    73.
    Granted invention patent
    Status: Active

    Publication number: US06456975B1

    Publication date: 2002-09-24

    Application number: US09482738

    Filing date: 2000-01-13

    CPC classification number: G10L15/30 G10L2015/0631 G10L2015/0635

    Abstract: In one embodiment, a speech recognition program at a client receives data that is unrecognized, such as an unrecognized word, an unrecognized pronunciation of a known word, an unrecognized dialect of a known word, and/or a substantially new word frequency usage. The client transmits the data to a provider, which processes the data into known data, and transmits the known data back to a number of clients, including the client that initially sent the unrecognized data. In one embodiment, the unrecognized data is sent from the client to the provider via a third party, to anonymize the data.

    Smart correction of dictated speech
    74.
    Granted invention patent
    Status: Active

    Publication number: US06418410B1

    Publication date: 2002-07-09

    Application number: US09406661

    Filing date: 1999-09-27

    CPC classification number: G10L15/183 G10L2015/0635

    Abstract: In a speech recognition system, a method and system for updating a language model during a correction session can include automatically comparing dictated text to replacement text, determining if the replacement text is on an alternative word list if the comparison is close enough to indicate that the replacement text represents correction of a mis-recognition error rather than an edit, and updating the language model without user interaction if the replacement text is on the alternative word list. If the replacement text is not on the alternative word list, a comparison is made between dictated word digital information and replacement word digital information, and the language model is updated if the digital comparison is close enough to indicate that the replacement text represents correction of a mis-recognition error rather than an edit.
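
The decision flow in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented method: text closeness is approximated with `difflib.SequenceMatcher`, the acoustic ("digital information") comparison is reduced to a similarity number supplied by the caller, and both thresholds are hypothetical. The sketch also assumes the text comparison gates both branches, which the abstract does not state explicitly.

```python
from difflib import SequenceMatcher


def should_update_language_model(dictated, replacement, alternatives,
                                 acoustic_similarity,
                                 text_threshold=0.5, acoustic_threshold=0.8):
    """Return True if the replacement looks like a misrecognition fix, not an edit.

    acoustic_similarity is assumed to be a value in [0, 1] computed elsewhere
    from the dictated and replacement word audio data.
    """
    text_ratio = SequenceMatcher(None, dictated, replacement).ratio()
    if text_ratio < text_threshold:
        return False  # too different: treat as an ordinary edit
    if replacement in alternatives:
        return True   # on the alternative word list: update without asking the user
    # Not on the list: fall back to the acoustic comparison.
    return acoustic_similarity >= acoustic_threshold


print(should_update_language_model("there", "their", {"their", "they're"}, 0.0))
```

Returning a plain boolean keeps the language-model update itself out of the sketch; a real system would apply the update where this returns `True`.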

    Process for the multilingual use of a hidden markov sound model in a speech recognition system
    75.
    Granted invention patent
    Status: Active

    Publication number: US06212500B1

    Publication date: 2001-04-03

    Application number: US09254775

    Filing date: 1999-03-09

    Inventor: Joachim Köhler

    Abstract: In a method for determining the similarities of sounds across different languages, hidden Markov modelling of multilingual phonemes is employed wherein language-specific as well as language-independent properties are identified by combining of the probability densities for different hidden Markov sound models in various languages.

    Method for reducing database requirements for speech recognition systems
    76.
    Granted invention patent
    Status: Expired

    Publication number: US5845246A

    Publication date: 1998-12-01

    Application number: US396018

    Filing date: 1995-02-28

    Inventor: Thomas B. Schalk

    CPC classification number: G10L15/065 G10L2015/0635

    Abstract: The present invention comprises a method for reducing the database requirements necessary for use in speaker independent recognition systems. The method involves digital processing of a plurality of recorded utterances from a first database of digitally recorded spoken utterances. The previously recorded utterances are digitally processed to create a second database of modified utterances and then the first and second databases are combined to form an expanded database from which recognition vocabulary tables may be generated.
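
A minimal sketch of the database-expansion idea: derive a second set of modified utterances from the recorded ones and merge the two. The abstract does not say which digital processing is applied, so the gain change and small additive noise used here are assumptions, as are the names `modify_utterance` and `expand_database`.

```python
import random


def modify_utterance(samples, gain=0.8, noise_level=0.01, seed=0):
    """Return a modified copy of one utterance (assumed processing: gain + noise)."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [gain * s + rng.uniform(-noise_level, noise_level) for s in samples]


def expand_database(first_db):
    """Combine the original utterances with their modified versions."""
    second_db = {name + "_mod": modify_utterance(samples)
                 for name, samples in first_db.items()}
    return {**first_db, **second_db}


first_db = {"yes_spk1": [0.0, 0.5, -0.5, 0.25], "no_spk1": [0.1, -0.2, 0.3]}
expanded = expand_database(first_db)
print(sorted(expanded))
```

The expanded dictionary stands in for the combined database from which recognition vocabulary tables would be generated.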

    Real-time reconfigurable adaptive speech recognition command and control apparatus and method
    77.
    Granted invention patent
    Status: Expired

    Publication number: US5774841A

    Publication date: 1998-06-30

    Application number: US536302

    Filing date: 1995-09-20

    CPC classification number: G10L15/22 G10L15/06 G10L2015/0631 G10L2015/0635

    Abstract: An adaptive speech recognition and control system and method for controlling various mechanisms and systems in response to spoken instructions and in which spoken commands are effective to direct the system into appropriate memory nodes, and to respective appropriate memory templates corresponding to the voiced command. Spoken commands from any of a group of operators for which the system is trained may be identified, and voice templates are updated as required in response to changes in pronunciation and voice characteristics over time of any of the operators for which the system is trained. Provisions are made for both near-real-time retraining of the system with respect to individual terms which are determined not to be positively identified, and for an overall system training and updating process in which recognition of each command and vocabulary term is checked, and in which the memory templates are retrained if necessary for respective commands or vocabulary terms with respect to an operator currently using the system. In one embodiment, the system includes input circuitry connected to a microphone and including signal processing and control sections for sensing the level of vocabulary recognition over a given period and, if recognition performance falls below a given level, processing audio-derived signals for enhancing recognition performance of the system.

    Apparatus and method for normalizing and categorizing linear prediction code vectors using Bayesian categorization technique
    78.
    Granted invention patent
    Status: Expired

    Publication number: US5704004A

    Publication date: 1997-12-30

    Application number: US786551

    Filing date: 1997-01-21

    CPC classification number: G10L15/02 G10L2015/0635 G10L25/12 G10L25/24

    Abstract: The present invention discloses a pattern matching system applicable for syllable recognition which includes a dictionary means for storing a plurality of standard patterns each representing a standard syllable by at least a syllable feature. The pattern matching system further includes a converting means for converting an input pattern representing an unknown syllable into a categorizing pattern for representing the unknown syllable in the syllable features used for representing the standard syllables. The pattern matching system further includes a Bayesian categorizing means for matching the standard pattern representing the standard syllable and the categorizing pattern representing the unknown syllable for computing a Bayesian mis-categorization risk for each of the standard syllables, the Bayesian categorization means further including a comparing and identification means for selecting a standard syllable which has the least mis-categorization risk as an identified syllable for the input unknown syllable.
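
The selection rule in this abstract (pick the standard syllable with the least mis-categorization risk) matches the textbook minimum-Bayes-risk decision. The sketch below shows that generic rule only; it assumes posterior probabilities for each standard syllable are already available and defaults to a zero-one loss. How the patent derives its risk from linear prediction code vectors is not reproduced here, and `min_risk_syllable` is a hypothetical name.

```python
def min_risk_syllable(posteriors, loss=None):
    """Return (best_label, risks) under the minimum expected-loss rule.

    posteriors: dict mapping syllable -> P(syllable | input pattern).
    loss: optional nested dict loss[decision][true]; defaults to zero-one loss.
    """
    def risk(decision):
        total = 0.0
        for true_label, p in posteriors.items():
            if loss is not None:
                cost = loss[decision][true_label]
            else:
                cost = 0.0 if decision == true_label else 1.0
            total += cost * p
        return total

    risks = {label: risk(label) for label in posteriors}
    best = min(risks, key=risks.get)
    return best, risks


best, risks = min_risk_syllable({"ba": 0.2, "da": 0.7, "ga": 0.1})
print(best, risks)
```

With zero-one loss the risk of each decision is one minus its posterior, so the rule reduces to choosing the most probable syllable; a non-uniform `loss` table changes that.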

    Speech recognition system allows new vocabulary words to be added without requiring spoken samples of the words
    79.
    Granted invention patent
    Status: Expired

    Publication number: US5623578A

    Publication date: 1997-04-22

    Application number: US144961

    Filing date: 1993-10-28

    CPC classification number: G10L15/063 G10L2015/0635 G10L2015/0638

    Abstract: A speech recognition method implemented in a computer system recognizes words without requiring prior creation of models for such words based on spoken entries. A key word is entered in nonspoken form and a string of phonemes are defined by the speech recognizer to represent the new key word. A response signal is generated from each phoneme in the new key word model. Such response signals are utilized to define a multidimensional validity field for the new key word. Upon receipt of a spoken word from a user, a string of phonemes is assigned to represent the spoken word. A response signal from each phoneme in the model used to represent the spoken word is contrasted with the validity fields previously defined for the corresponding key word. A determination is made as to whether the spoken word is valid or not based on whether the response signals representing the spoken word lie within the validity fields.

    Automatic recognition of a consistent message using multiple complimentary sources of information
    80.
    Granted invention patent
    Status: Expired

    Publication number: US5502774A

    Publication date: 1996-03-26

    Application number: US300232

    Filing date: 1994-09-06

    CPC classification number: G06K9/6293 G10L15/24 G10L15/10 G10L2015/0635

    Abstract: A general approach is provided for the combined use of several sources of information in the automatic recognition of a consistent message. For each message unit (e.g., word) the total likelihood score is assumed to be the weighted sum of the likelihood scores resulting from the separate evaluation of each information source. Emphasis is placed on the estimation of weighing factors used in forming this total likelihood. This method can be applied, for example, to the decoding of a consistent message using both handwriting and speech recognition. The present invention includes three procedures which provide the optimal weighing coefficients.
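
A minimal sketch of the weighted-sum scoring described in this abstract: each candidate word's total score is the weighted sum of the scores from each information source. The weights and log-likelihood values below are made up for illustration; the three weight-estimation procedures mentioned in the abstract are not reproduced, and the function names are hypothetical.

```python
def combine_scores(scores_by_source, weights):
    """Weighted sum of per-source likelihood scores for each shared candidate."""
    candidate_sets = [set(scores) for scores in scores_by_source.values()]
    candidates = set.intersection(*candidate_sets)
    return {
        c: sum(weights[src] * scores_by_source[src][c] for src in scores_by_source)
        for c in candidates
    }


def best_candidate(scores_by_source, weights):
    """Return the candidate with the highest combined score."""
    totals = combine_scores(scores_by_source, weights)
    return max(totals, key=totals.get)


# Assumed log-likelihood scores from two recognizers for the same message unit.
scores = {
    "speech": {"ship": -2.0, "sheep": -1.5},
    "handwriting": {"ship": -0.5, "sheep": -3.0},
}
weights = {"speech": 0.5, "handwriting": 0.5}
print(best_candidate(scores, weights))
```

Here speech alone would prefer "sheep", but the handwriting evidence shifts the combined decision to "ship".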
