System and methods for acoustic and language modeling for automatic speech recognition with large vocabularies
    1.
    Granted invention patent
    Status: in force

    Publication (announcement) No.: US07801727B2

    Publication (announcement) date: 2010-09-21

    Application No.: US11064643

    Filing date: 2005-02-24

    IPC classification: G10L15/04

    Abstract: A method is disclosed for generating a language component vocabulary VC for a speech recognition system having a language vocabulary V of a plurality of word forms. The method includes: partitioning the language vocabulary V into subsets of word forms based on frequencies of occurrence of the respective word forms; and, in at least one of the subsets, splitting word forms having frequencies less than a threshold to thereby generate word form components. Also disclosed is a method for use in speech recognition including: splitting an acoustic vocabulary comprising baseforms into baseform components and storing the baseform components; and performing sound-to-spelling mapping on the baseform components so as to generate a baseform-components-to-word-parts table for use in subsequent decoding of speech. A method for decoding a speech utterance using language model components and acoustic components includes the steps of: generating from the utterance a stack of baseform component paths; concatenating baseform components in a path to generate concatenated baseforms when the concatenated baseform components correspond to a baseform found in an acoustic vocabulary; mapping the concatenated baseforms into words; computing language model (LM) scores associated with the words using a language model; and performing further decoding of the utterance based thereupon.
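
The decoding steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the greedy left-to-right merge, the bigram score table, the fallback penalty `unk`, and all function and variable names are assumptions made for illustration. Baseform components are represented as tuples of phone symbols.

```python
def concatenate_components(path, acoustic_vocab):
    """Greedily merge adjacent components whose concatenation is a known baseform.

    Assumption: a simple left-to-right merge; the abstract does not say how
    candidate concatenations are chosen.
    """
    merged = []
    for comp in path:
        if merged and merged[-1] + comp in acoustic_vocab:
            merged[-1] = merged[-1] + comp
        else:
            merged.append(comp)
    return merged


def lm_score(words, bigram_logprob, unk=-10.0):
    """Sum bigram log-probabilities; unseen bigrams get a fixed penalty (assumption)."""
    score = 0.0
    prev = "<s>"
    for word in words:
        score += bigram_logprob.get((prev, word), unk)
        prev = word
    return score


def rescore_stack(stack, baseform_to_word, bigram_logprob):
    """Map every path in the stack to words and rank the paths by LM score."""
    scored = []
    for path in stack:
        baseforms = concatenate_components(path, baseform_to_word.keys())
        words = [baseform_to_word.get(b, "<unk>") for b in baseforms]
        scored.append((lm_score(words, bigram_logprob), words))
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Hypothetical toy data: two baseforms and two competing component paths.
baseform_to_word = {
    ("d", "o", "m", "a", "m", "i"): "domami",
    ("d", "o", "m"): "dom",
}
bigram_logprob = {("<s>", "domami"): -1.0, ("domami", "dom"): -2.0}
stack = [
    [("d", "o", "m"), ("d", "o", "m")],
    [("d", "o", "m", "a"), ("m", "i"), ("d", "o", "m")],
]
ranked = rescore_stack(stack, baseform_to_word, bigram_logprob)
```

In this toy run the second path wins because its first two components concatenate into a baseform that maps to a word the language model has seen.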

    System and methods for acoustic and language modeling for automatic speech recognition with large vocabularies
    2.
    Granted invention patent
    Status: in force

    Publication (announcement) No.: US06928404B1

    Publication (announcement) date: 2005-08-09

    Application No.: US09271469

    Filing date: 1999-03-17

    Abstract: Systems and methods are provided for generating a language component vocabulary VC for a speech recognition system having a language vocabulary V of a plurality of word forms. One such method includes: partitioning the language vocabulary V into subsets of word forms based on frequencies of occurrence of the respective word forms; in at least one of the subsets, splitting word forms having frequencies less than a threshold to thereby generate word form components; and generating a language component vocabulary VC including word forms and word form components. The resulting language component vocabulary is used to generate a language model that can be efficiently implemented for real-time automatic speech recognition applications for languages with large vocabularies.
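
A minimal Python sketch of the partition-and-split idea follows. The abstract does not specify how subsets are cut or how a word form is split, so several choices here are assumptions: equal-size frequency bands, splitting only in the lowest-frequency band, and a hypothetical splitter that cuts off a fixed-length ending marked with `+`.

```python
def partition_by_frequency(freqs, n_subsets):
    """Rank word forms by descending frequency and cut the ranking into bands."""
    ranked = sorted(freqs, key=lambda w: (-freqs[w], w))
    if not ranked:
        return []
    size = -(-len(ranked) // n_subsets)  # ceiling division
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]


def split_word_form(word, ending_len=2):
    """Hypothetical splitter: stem plus an ending marked with '+'."""
    if len(word) <= ending_len:
        return [word]
    return [word[:-ending_len], "+" + word[-ending_len:]]


def build_component_vocabulary(freqs, n_subsets=3, threshold=5):
    """Keep frequent word forms whole; split rare ones in the last band."""
    subsets = partition_by_frequency(freqs, n_subsets)
    vocab = set()
    for subset in subsets[:-1]:
        vocab.update(subset)
    for word in (subsets[-1] if subsets else []):
        if freqs[word] < threshold:
            vocab.update(split_word_form(word))
        else:
            vocab.add(word)
    return vocab


# Hypothetical counts for a few inflected forms.
freqs = {"the": 100, "dom": 50, "doma": 20, "domu": 10, "domami": 2, "domach": 1}
vocab = build_component_vocabulary(freqs)
```

With these counts the two rare forms are replaced by the shared stem `doma` and the endings `+mi` and `+ch`, so the component vocabulary stays the same size while covering more surface forms as the corpus grows.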

    Apparatus and method for forming a filtered inflected language model for automatic speech recognition
    3.
    Granted invention patent
    Status: lapsed

    Publication (announcement) No.: US6073091A

    Publication (announcement) date: 2000-06-06

    Application No.: US906812

    Filing date: 1997-08-06

    CPC classification: G10L15/197

    Abstract: A method of forming a language model for a language having a selected vocabulary of word forms comprises: (a) mapping the word forms into integer vectors in accordance with frequencies of word form occurrence; (b) partitioning the integer vectors into subsets, the subsets respectively having ranges of frequencies of word form occurrence associated therewith, the subsets being arranged in a descending order of frequency ranges; (c) respectively assigning maps to the subsets; (d) filtering a textual corpus using the maps assigned to the subsets in order to generate indexed integers; (e) determining n-gram statistics for the indexed integers; and (f) estimating n-gram language model probabilities from the n-gram statistics to form the language model.
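
Steps (a) through (f) can be sketched in simplified form in Python. This is a simplification, not the claimed method: it uses one frequency-ranked map instead of per-subset maps, reserves index 0 for word forms outside the map, and estimates bigram probabilities by plain maximum likelihood. All names are hypothetical.

```python
from collections import Counter


def build_index(tokens, max_vocab):
    """Map the most frequent word forms to integers 1..max_vocab by frequency rank."""
    freqs = Counter(tokens)
    ranked = sorted(freqs, key=lambda w: (-freqs[w], w))
    return {word: rank + 1 for rank, word in enumerate(ranked[:max_vocab])}


def filter_corpus(tokens, index):
    """Replace each token by its integer index; 0 marks tokens outside the map."""
    return [index.get(word, 0) for word in tokens]


def bigram_mle(ids):
    """Maximum-likelihood bigram probabilities P(next | prev) over integer ids."""
    context_counts = Counter(ids[:-1])
    bigram_counts = Counter(zip(ids, ids[1:]))
    return {
        pair: count / context_counts[pair[0]]
        for pair, count in bigram_counts.items()
    }


# Hypothetical toy corpus.
tokens = "a b a b a c".split()
index = build_index(tokens, max_vocab=2)
ids = filter_corpus(tokens, index)
probs = bigram_mle(ids)
```

Working on integers rather than strings keeps the n-gram tables compact, which is presumably the motivation for the mapping step; a production model would also need smoothing, which this sketch omits.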

    Methods and apparatus for generating dialog state conditioned language models
    5.
    Granted invention patent
    Status: in force

    Publication (announcement) No.: US07542901B2

    Publication (announcement) date: 2009-06-02

    Application No.: US11509390

    Filing date: 2006-08-24

    IPC classification: G10L15/06 G10L15/18

    Abstract: Techniques are provided for generating improved language modeling. Such improved modeling is achieved by conditioning a language model on a state of a dialog for which the language model is employed. For example, the techniques of the invention may improve modeling of language for use in a speech recognizer of an automatic natural-language-based dialog system. Improved usability of the dialog system arises from better recognition of a user's utterances by a speech recognizer, associated with the dialog system, using the dialog-state-conditioned language models. By way of example, the state of the dialog may be quantified as: (i) the internal state of the natural language understanding part of the dialog system; or (ii) words in the prompt that the dialog system played to the user.
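
One way to picture a language model conditioned on dialog state is the Python sketch below. The abstract does not say how the conditioning is done, so everything here is an assumption: a unigram model, one count table per state, and linear interpolation with a state-independent model using a hypothetical weight `interpolation`.

```python
from collections import Counter, defaultdict


class StateConditionedUnigramLM:
    """Unigram LM with per-dialog-state counts, interpolated with global counts."""

    def __init__(self, interpolation=0.7):
        self.interpolation = interpolation  # weight of the state-specific model
        self.state_counts = defaultdict(Counter)
        self.global_counts = Counter()

    def train(self, samples):
        """samples: iterable of (dialog_state, list_of_words) pairs."""
        for state, words in samples:
            self.state_counts[state].update(words)
            self.global_counts.update(words)

    def prob(self, word, state):
        """P(word | state); falls back to the global model for unseen states."""
        global_total = sum(self.global_counts.values())
        p_global = self.global_counts[word] / global_total if global_total else 0.0
        counts = self.state_counts.get(state)
        if not counts:
            return p_global
        p_state = counts[word] / sum(counts.values())
        return self.interpolation * p_state + (1 - self.interpolation) * p_global


# Hypothetical dialog states and user replies.
lm = StateConditionedUnigramLM()
lm.train([
    ("ask_date", ["tomorrow", "monday"]),
    ("ask_city", ["boston", "denver"]),
])
p_in_state = lm.prob("boston", "ask_city")
p_other_state = lm.prob("boston", "ask_date")
```

Here "boston" is far more probable after a city prompt than after a date prompt, which is the effect the abstract attributes to state conditioning.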

    Methods and Apparatus for Generating Dialog State Conditioned Language Models
    6.
    Invention patent application
    Status: in force

    Publication (announcement) No.: US20080215329A1

    Publication (announcement) date: 2008-09-04

    Application No.: US12057646

    Filing date: 2008-03-28

    IPC classification: G10L15/28

    Abstract: Techniques are provided for generating improved language modeling. Such improved modeling is achieved by conditioning a language model on a state of a dialog for which the language model is employed. For example, the techniques of the invention may improve modeling of language for use in a speech recognizer of an automatic natural-language-based dialog system. Improved usability of the dialog system arises from better recognition of a user's utterances by a speech recognizer, associated with the dialog system, using the dialog-state-conditioned language models. By way of example, the state of the dialog may be quantified as: (i) the internal state of the natural language understanding part of the dialog system; or (ii) words in the prompt that the dialog system played to the user.

    Methods and apparatus for generating dialog state conditioned language models
    9.
    Granted invention patent
    Status: in force

    Publication (announcement) No.: US07853449B2

    Publication (announcement) date: 2010-12-14

    Application No.: US12057646

    Filing date: 2008-03-28

    IPC classification: G10L15/06 G10L15/10 G10L15/18

    Abstract: Techniques are provided for generating improved language modeling. Such improved modeling is achieved by conditioning a language model on a state of a dialog for which the language model is employed. For example, the techniques of the invention may improve modeling of language for use in a speech recognizer of an automatic natural-language-based dialog system. Improved usability of the dialog system arises from better recognition of a user's utterances by a speech recognizer, associated with the dialog system, using the dialog-state-conditioned language models. By way of example, the state of the dialog may be quantified as: (i) the internal state of the natural language understanding part of the dialog system; or (ii) words in the prompt that the dialog system played to the user.

    Apparatus and methods for identifying homophones among words in a speech recognition system
    10.
    Granted invention patent
    Status: in force

    Publication (announcement) No.: US06269335B1

    Publication (announcement) date: 2001-07-31

    Application No.: US09134261

    Filing date: 1998-08-14

    IPC classification: G10L21/00

    CPC classification: G10L15/22

    Abstract: A method of identifying homophones of a word uttered by a user from at least a portion of existing words of a vocabulary of a speech recognition engine comprises the steps of: a user uttering the word; decoding the uttered word; computing respective measures between the decoded word and at least a portion of the other existing vocabulary words, the respective measures being indicative of acoustic similarity between the word and the at least a portion of other existing words; if at least one measure is within a threshold range, indicating, to the user, results associated with the at least one measure, the results preferably including the decoded word and the other existing vocabulary word associated with the at least one measure; and the user preferably making a selection depending on the word the user intended to utter.
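
The compare-and-threshold step can be sketched in Python as follows. The abstract does not define the acoustic similarity measure, so this sketch substitutes a stand-in: length-normalized edit distance between phone sequences taken from a hypothetical pronunciation lexicon. The function names and the threshold values are likewise assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def find_homophone_candidates(decoded, lexicon, threshold=0.0):
    """Return (word, distance) pairs whose normalized distance is within threshold.

    Assumption: distance 0.0 means identical pronunciation; larger thresholds
    also admit near-homophones.
    """
    target = lexicon[decoded]
    candidates = []
    for word, phones in lexicon.items():
        if word == decoded:
            continue
        distance = edit_distance(target, phones) / max(len(target), len(phones), 1)
        if distance <= threshold:
            candidates.append((word, distance))
    return sorted(candidates, key=lambda item: (item[1], item[0]))


# Hypothetical pronunciation lexicon (ARPAbet-style phone symbols).
lexicon = {
    "to": ("T", "UW"),
    "too": ("T", "UW"),
    "two": ("T", "UW"),
    "do": ("D", "UW"),
    "tomato": ("T", "AH", "M", "EY", "T", "OW"),
}
exact = find_homophone_candidates("to", lexicon)
near = find_homophone_candidates("to", lexicon, threshold=0.5)
```

The returned candidates correspond to the results that would be shown to the user, who then selects the intended word.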
