Method and apparatus for reducing false rejection in a speech
recognition system
    1.
    Granted invention patent (status: expired)

    Publication No.: US5845245A

    Publication date: 1998-12-01

    Application No.: US753605

    Filing date: 1996-11-27

    CPC classification number: G10L15/065 G10L15/18 G10L2015/0633

    Abstract: The invention relates to a method and apparatus for grouping orthographies in a speech recognition dictionary to reduce false rejection. In a typical speech recognition system, the process of speech recognition consists of scanning the vocabulary database by using a fast match algorithm to find the top N orthography groups. In a second pass, the orthographies in the selected groups are re-scored using more precise likelihoods. The top orthographies in the top two groups are then processed by a rejection algorithm to determine whether they are sufficiently distinct from one another. If they are, the top-choice candidate is accepted; otherwise the entire procedure is terminated. The novel method comprises the steps of grouping confusable orthographies together to reduce the possibility of false rejection.
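
    The two-pass flow described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the function name `recognize`, the score dictionaries, the score-margin rejection test, and the example names are all assumptions, and scores are treated as log-likelihoods where higher is better.

```python
from typing import Dict, List, Optional


def recognize(groups: Dict[str, List[str]],
              fast_score: Dict[str, float],
              precise_score: Dict[str, float],
              n_best: int = 2,
              margin: float = 1.0) -> Optional[str]:
    """Return the accepted orthography, or None when the result is rejected."""
    # Pass 1 (fast match): rank groups by their best fast score, keep the top N.
    ranked = sorted(groups,
                    key=lambda g: max(fast_score[o] for o in groups[g]),
                    reverse=True)[:n_best]
    # Pass 2 (rescoring): pick the best orthography of each surviving group
    # using the more precise scores, then order those winners.
    winners = [max(groups[g], key=lambda o: precise_score[o]) for g in ranked]
    winners.sort(key=lambda o: precise_score[o], reverse=True)
    if len(winners) < 2:
        return winners[0] if winners else None
    top, runner_up = winners[0], winners[1]
    # Rejection (assumed here to be a simple margin test): accept only when
    # the winners of the two top groups are sufficiently far apart.
    if precise_score[top] - precise_score[runner_up] >= margin:
        return top
    return None


# Hypothetical scores: "Smith" and "Smyth" are acoustically confusable.
fast = {"Smith": -11.0, "Smyth": -11.5, "Jones": -13.0}
precise = {"Smith": -10.0, "Smyth": -10.2, "Jones": -14.0}

ungrouped = {"g1": ["Smith"], "g2": ["Smyth"], "g3": ["Jones"]}
grouped = {"g1": ["Smith", "Smyth"], "g2": ["Jones"]}

rejected = recognize(ungrouped, fast, precise)   # near-tie across groups
accepted = recognize(grouped, fast, precise)     # near-tie hidden inside one group
```

    With the confusable pair in separate groups, the two group winners are almost tied and the utterance is rejected; once they share a group, the runner-up comes from a clearly different group and the top choice is accepted.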

    Search optimization system and method for continuous speech recognition
    2.
    Granted invention patent (status: expired)

    Publication No.: US06397179B2

    Publication date: 2002-05-28

    Application No.: US09185529

    Filing date: 1998-11-04

    CPC classification number: G10L15/1815 G10L2015/085

    Abstract: A system and method for continuous speech recognition (CSR) is optimized to reduce processing time for connected word grammars bounded by semantically null words. The savings, which reduce processing time both during the forward and the backward passes of the search, as well as during rescoring, are achieved by performing only the minimal amount of computation required to produce an exact N-best list of semantically meaningful words (N-best list of salient words). This departs from the standard Spoken Language System modeling, in which any notion of meaning is handled by the Natural Language Understanding (NLU) component. By expanding the task of the recognizer component from a simple acoustic match to allow semantic information to be fed to the recognizer, significant processing time savings are achieved, making it possible to run an increased number of speech recognition channels in parallel for improved performance, which may enhance users' perception of value and quality of service.
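
    To illustrate what an "N-best list of salient words" means, the sketch below collapses full hypotheses that differ only in semantically null words. It is a post-hoc filter written for illustration; the abstract describes obtaining the same kind of list inside the search itself, which this sketch does not attempt. The null-word set, the function name `salient_n_best`, and the scores are assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical set of semantically null (filler) words.
NULL_WORDS = {"uh", "please", "the", "i", "want"}


def salient_n_best(hypotheses: List[Tuple[List[str], float]],
                   n: int) -> List[Tuple[Tuple[str, ...], float]]:
    """Collapse hypotheses that share the same salient words; keep the best score."""
    best: Dict[Tuple[str, ...], float] = {}
    for words, score in hypotheses:
        # The key is the hypothesis with all null words stripped out.
        key = tuple(w for w in words if w not in NULL_WORDS)
        if key not in best or score > best[key]:
            best[key] = score
    # Higher score is better (e.g. a log-likelihood).
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)[:n]


hyps = [
    (["uh", "boston", "please"], -5.0),
    (["boston"], -5.5),
    (["the", "austin"], -6.0),
    (["austin", "please"], -7.0),
]
top2 = salient_n_best(hyps, 2)
```

    Four surface hypotheses reduce to two salient entries, which is the redundancy the abstract says the recognizer can avoid computing.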

    Speech recognition rejection method using generalized additive models
    3.
    Granted invention patent (status: expired)

    Publication No.: US6006182A

    Publication date: 1999-12-21

    Application No.: US934892

    Filing date: 1997-09-22

    CPC classification number: G10L15/142

    Abstract: Systems and methods consistent with the present invention determine whether to accept one of a plurality of intermediate recognition results output by a speech recognition system as a final recognition result. The system first combines a plurality of speech rejection features into a feature function in which weights are assigned to each rejection feature in accordance with a recognition accuracy of each rejection feature. Feature values are then calculated for each of the rejection features using the plurality of intermediate recognition results. The system next computes the feature function according to the calculated feature values to determine a rejection decision value. Finally, one of the plurality of intermediate recognition results is accepted as the final recognition result according to the rejection decision value.
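
    A weighted, additive rejection decision of the kind the abstract outlines might look like the sketch below. The two features, their weights, the `tanh` shaping function, and the threshold are all assumed for illustration and are not taken from the patent; in practice the weights would be derived from each feature's measured recognition accuracy.

```python
import math
from typing import Callable, Dict, List

# Hypothetical rejection features, each computed from the intermediate
# recognition scores sorted best-first (higher score = better).
FEATURES: Dict[str, Callable[[List[float]], float]] = {
    "score_gap": lambda s: s[0] - s[1],   # distance between the top two results
    "top_score": lambda s: s[0],          # absolute score of the best result
}

# Assumed weights; a more reliable feature receives a larger weight.
WEIGHTS: Dict[str, float] = {"score_gap": 0.8, "top_score": 0.2}


def decision_value(scores: List[float]) -> float:
    """Additive model: a weighted sum of per-feature smooth functions."""
    ordered = sorted(scores, reverse=True)
    return sum(WEIGHTS[name] * math.tanh(feature(ordered))
               for name, feature in FEATURES.items())


def accept(scores: List[float], threshold: float = 0.5) -> bool:
    """Accept the top intermediate result when the decision value is high enough."""
    return decision_value(scores) >= threshold


clear_winner = accept([-2.0, -9.0, -9.5])   # large gap between the top two
near_tie = accept([-2.0, -2.1, -9.0])       # top two almost indistinguishable
```

    Each feature contributes independently to the decision value, so a feature can be added, removed, or reweighted without touching the others.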

    Method and apparatus for training a multilingual speech model set
    4.
    Granted invention patent (status: expired)

    Publication No.: US06912499B1

    Publication date: 2005-06-28

    Application No.: US09386282

    Filing date: 1999-08-31

    CPC classification number: G10L15/063 G10L15/187

    Abstract: The invention relates to a method and apparatus for training a multilingual speech model set. The multilingual speech model set generated is suitable for use by a speech recognition system for recognizing spoken utterances for at least two different languages. The invention allows using a single speech recognition unit with a single speech model set to perform speech recognition on utterances from two or more languages. The method and apparatus make use of a group of acoustic sub-word units comprising a first subgroup of acoustic sub-word units associated with a first language and a second subgroup of acoustic sub-word units associated with a second language, where the first subgroup and the second subgroup share at least one common acoustic sub-word unit. The method and apparatus also make use of a plurality of letter-to-acoustic-sub-word-unit rule sets, each rule set being associated with a different language. A set of untrained speech models is trained on the basis of a training set comprising speech tokens and their associated labels, in combination with the group of acoustic sub-word units and the plurality of letter-to-acoustic-sub-word-unit rule sets. The invention also provides a computer readable storage medium comprising a program element for implementing the method for training a multilingual speech model set.
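
    The idea of shared sub-word units pooling training data across languages can be sketched as follows. The rule tables are toy, one-letter-to-one-unit mappings invented for illustration (real letter-to-sound rules are context dependent), and the unit symbols, function names, and training tuples are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical per-language letter-to-unit rule sets. The units "s" and "t"
# are shared between the two languages; the vowel units are language specific.
RULES: Dict[str, Dict[str, str]] = {
    "en": {"s": "s", "e": "E", "a": "{", "t": "t"},
    "fr": {"s": "s", "e": "@", "a": "a", "t": "t"},
}


def units(lang: str) -> Set[str]:
    """Sub-word unit inventory reachable through one language's rule set."""
    return set(RULES[lang].values())


def transcribe(label: str, lang: str) -> List[str]:
    """Apply the rule set of the label's language, letter by letter."""
    return [RULES[lang][ch] for ch in label.lower()]


def collect_training_data(
        training_set: List[Tuple[str, str, str]]) -> Dict[str, List[str]]:
    """Map each unit to the speech tokens whose transcription contains it."""
    pool: Dict[str, List[str]] = defaultdict(list)
    for token_id, label, lang in training_set:
        for unit in set(transcribe(label, lang)):
            pool[unit].append(token_id)
    return pool


shared = units("en") & units("fr")
pool = collect_training_data([("utt1", "sat", "en"), ("utt2", "tes", "fr")])
```

    The model for a shared unit such as "s" receives tokens from both languages, while a language-specific unit is trained only on its own language's tokens.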

    Search and rescoring method for a speech recognition system
    5.
    Granted invention patent (status: expired)

    Publication No.: US06253178B1

    Publication date: 2001-06-26

    Application No.: US08934736

    Filing date: 1997-09-22

    CPC classification number: G10L15/08 G10L15/142 G10L2015/085

    Abstract: Speech recognition systems and methods consistent with the present invention process input speech signals organized into a series of frames. The input speech signal is decimated to select K frames out of every L frames of the input speech signal according to a decimation rate K/L. A first set of model distances is then calculated for each of the K selected frames of the input speech signal, and a Hidden Markov Model (HMM) topology of a first set of models is reduced according to the decimation rate K/L. The system then selects a reduced set of model distances from the computed first set of model distances according to the reduced HMM topology and selects a first plurality of candidate choices for recognition according to the reduced set of model distances. A second set of model distances is computed, using a second set of models, for a second plurality of candidate choices, wherein the second plurality of candidate choices correspond to at least a subset of the first plurality of candidate choices. The second plurality of candidate choices are rescored using the second set of model distances, and a recognition result is selected from the second plurality of candidate choices according to the rescored second plurality of candidate choices.
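
    The frame decimation step (keeping K frames out of every L) can be sketched as below. The abstract does not say which K frames of each block are kept, so the even-spacing rule used here is an assumption, as is the function name `decimate`. The HMM topology reduction and the rescoring pass are not shown.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def decimate(frames: Sequence[T], k: int, l: int) -> List[T]:
    """Keep k frames out of every block of l consecutive frames."""
    if not 0 < k <= l:
        raise ValueError("decimation rate K/L requires 0 < K <= L")
    kept: List[T] = []
    for i, frame in enumerate(frames):
        pos = i % l
        # Keep the frame whenever floor((pos + 1) * k / l) advances; this
        # spreads exactly k picks roughly evenly over each block of l frames.
        if (pos + 1) * k // l > pos * k // l:
            kept.append(frame)
    return kept


two_of_three = decimate(list(range(9)), 2, 3)
one_of_two = decimate(list(range(6)), 1, 2)
```

    Model distances would then be computed only for the kept frames, which is where the first-pass savings come from.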
