Expanding an effective vocabulary of a speech recognition system
    Patent type: Granted invention patent
    Legal status: In force

    Publication number: US07120582B1

    Publication date: 2006-10-10

    Application number: US09390370

    Filing date: 1999-09-07

    IPC classification: G10L15/00 G10L15/06

    Abstract: The invention provides techniques for creating and using fragmented word models to increase the effective size of an active vocabulary of a speech recognition system. The active vocabulary represents all words and word fragments that the speech recognition system is able to recognize. Each word may be represented by a combination of acoustic models. As such, the active vocabulary represents the combinations of acoustic models that the speech recognition system may compare to a user's speech to identify acoustic models that best match the user's speech. The effective size of the active vocabulary may be increased by dividing words into constituent components or fragments (for example, prefixes, suffixes, separators, infixes, and roots) and including each component as a separate entry in the active vocabulary. Thus, for example, a list of words and their plural forms (for example, “book, books, cook, cooks, hook, hooks, look and looks”) may be represented in the active vocabulary using the words (for example, “book, cook, hook and look”) and an entry representing the suffix that makes the words plural (for example, “+s”, where the “+” preceding the “s” indicates that “+s” is a suffix). For a large list of words, and ignoring the entry associated with the suffix, this technique may reduce the number of vocabulary entries needed to represent the list of words considerably.
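The plural-suffix example in the abstract can be illustrated with a minimal sketch that works on plain strings only. It is not the patent's method, which operates on acoustic models; the function names `fragment_vocabulary` and `effective_vocabulary` are hypothetical, and the only fragment type handled is a single suffix written with the "+" prefix notation used in the abstract.

```python
def fragment_vocabulary(words, suffix="s"):
    """Replace each word of the form root+suffix, where root is also listed,
    with one shared "+suffix" entry. Hypothetical helper; assumes a
    non-empty suffix."""
    word_set = set(words)
    entries = set()
    uses_suffix = False
    for w in word_set:
        root = w[: -len(suffix)]
        if w.endswith(suffix) and root in word_set:
            # The root is already an entry, so the full word is covered
            # by root + "+suffix" and needs no entry of its own.
            uses_suffix = True
        else:
            entries.add(w)
    if uses_suffix:
        entries.add("+" + suffix)
    return sorted(entries)


def effective_vocabulary(entries):
    """List every word the entries can spell: each root alone, plus each
    root joined with each suffix entry."""
    roots = [e for e in entries if not e.startswith("+")]
    suffixes = [e[1:] for e in entries if e.startswith("+")]
    return sorted(set(roots) | {r + s for r in roots for s in suffixes})


words = ["book", "books", "cook", "cooks", "hook", "hooks", "look", "looks"]
vocab = fragment_vocabulary(words)
print(vocab)                        # ['+s', 'book', 'cook', 'hook', 'look']
print(effective_vocabulary(vocab))  # the original eight words, sorted
```

Here five stored entries cover eight words, and the saving approaches one half as the list grows, since the single "+s" entry is shared by every root. Note that joining every root with every suffix can also produce words that were never in the original list, which this sketch does not guard against.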
