Indexing and searching speech with text meta-data
    1.
    Type: Patent application
    Legal status: In force

    Publication number: US20070106509A1

    Publication date: 2007-05-10

    Application number: US11269872

    Filing date: 2005-11-08

    IPC classification: G10L15/00

    Abstract: An index for searching spoken documents having speech data and text meta-data is created by obtaining probabilities of occurrence of words and positional information of the words of the speech data and combining it with at least positional information of the words in the text meta-data. A single index can be created because the speech data and the text meta-data are treated the same and considered only different categories.
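
    The single-index idea in this abstract can be illustrated with a minimal sketch. It assumes each recognized word carries a position and a probability of occurrence, and that text meta-data words are certain (probability 1.0). The names `build_index`, `text_hits`, and `search`, the category labels, and the sample values are hypothetical and do not come from the patent.

```python
from collections import defaultdict


def text_hits(text):
    """Turn meta-data text into (word, position, probability) hits.

    Assumption: text is certain, so every word gets probability 1.0.
    """
    return [(word, pos, 1.0) for pos, word in enumerate(text.split())]


def build_index(documents):
    """Build one inverted index over all categories of every document.

    documents: {doc_id: {category: [(word, position, probability), ...]}}
    Returns {word: [(doc_id, category, position, probability), ...]}.
    """
    index = defaultdict(list)
    for doc_id, categories in documents.items():
        for category, hits in categories.items():
            for word, position, prob in hits:
                index[word.lower()].append((doc_id, category, position, prob))
    return index


def search(index, word, category=None):
    """Return postings for a word, optionally restricted to one category."""
    postings = index.get(word.lower(), [])
    return [p for p in postings if category is None or p[1] == category]


# Hypothetical spoken document: recognizer hypotheses plus a text title.
docs = {
    "doc1": {
        "speech": [("speech", 0, 0.9), ("beach", 0, 0.1), ("index", 1, 0.8)],
        "title": text_hits("Speech indexing talk"),
    }
}
index = build_index(docs)
```

    Because speech and text postings share one layout, a query such as `search(index, "speech")` returns hits from both categories, while `search(index, "speech", category="title")` restricts the lookup to the meta-data.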


    Speech index pruning
    2.
    Type: Patent application
    Legal status: In force

    Publication number: US20070106512A1

    Publication date: 2007-05-10

    Application number: US11270673

    Filing date: 2005-11-09

    IPC classification: G10L13/08

    Abstract: A speech segment is indexed by identifying at least two alternative word sequences for the speech segment. For each word in the alternative sequences, information is placed in an entry for the word in the index. Speech units are eliminated from entries in the index based on a comparison of a probability that the word appears in the speech segment and a threshold value.
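
    The pruning step can be sketched as a simple filter over index entries. This is an illustration only: it assumes a posting is removed when its probability falls below the threshold, and the posting layout `(segment_id, position, probability)`, the function name `prune_index`, and the sample values are hypothetical.

```python
def prune_index(index, threshold):
    """Drop postings whose word probability is below the threshold.

    index: {word: [(segment_id, position, probability), ...]}
    Words left with no postings are removed from the result entirely.
    """
    pruned = {}
    for word, postings in index.items():
        kept = [p for p in postings if p[2] >= threshold]
        if kept:
            pruned[word] = kept
    return pruned


# Hypothetical index built from alternative word sequences of two segments.
index = {
    "speech": [("seg1", 0, 0.9)],
    "beach": [("seg1", 0, 0.1)],
    "index": [("seg1", 1, 0.55), ("seg2", 3, 0.02)],
}
pruned = prune_index(index, threshold=0.2)
```

    With a threshold of 0.2, the low-probability alternative "beach" and the 0.02 posting for "index" are dropped, which shrinks the index at the cost of possibly missing rare true hits.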


    Discriminative training of language models for text and speech classification
    3.
    Type: Granted patent
    Legal status: In force

    Publication number: US08306818B2

    Publication date: 2012-11-06

    Application number: US12103035

    Filing date: 2008-04-15

    IPC classification: G10L15/00, G06F17/27

    Abstract: Methods are disclosed for estimating language models such that the conditional likelihood of a class given a word string, which is very well correlated with classification accuracy, is maximized. The methods comprise tuning statistical language model parameters jointly for all classes such that a classifier discriminates between the correct class and the incorrect ones for a given training sentence or utterance. Specific embodiments of the present invention pertain to implementation of the rational function growth transform in the context of a discriminative training technique for n-gram classifiers.
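
    The quantity being maximized, the conditional likelihood of the class given the word string, can be made concrete with a small sketch. It only evaluates the objective and does not implement the rational function growth transform. As a simplifying assumption, per-class unigram models stand in for the n-gram models, and the class names, probabilities, and function names below are hypothetical.

```python
import math


def log_posterior(words, log_prior, log_prob):
    """Return {cls: log P(cls | words)} via Bayes' rule over all classes.

    log_prior: {cls: log P(cls)}
    log_prob:  {cls: {word: log P(word | cls)}}  (unigram stand-in)
    """
    scores = {
        cls: log_prior[cls] + sum(log_prob[cls][w] for w in words)
        for cls in log_prior
    }
    # Log-sum-exp normalization over classes for numerical stability.
    top = max(scores.values())
    log_z = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    return {cls: s - log_z for cls, s in scores.items()}


def conditional_log_likelihood(data, log_prior, log_prob):
    """Sum of log P(correct class | sentence) over labeled sentences."""
    return sum(
        log_posterior(words, log_prior, log_prob)[label]
        for words, label in data
    )


# Hypothetical two-class setup with hand-picked probabilities.
log_prior = {"flight": math.log(0.5), "hotel": math.log(0.5)}
log_prob = {
    "flight": {"book": math.log(0.4), "flight": math.log(0.5), "room": math.log(0.1)},
    "hotel": {"book": math.log(0.4), "flight": math.log(0.1), "room": math.log(0.5)},
}
data = [(["book", "flight"], "flight"), (["book", "room"], "hotel")]

posterior = log_posterior(["book", "flight"], log_prior, log_prob)
objective = conditional_log_likelihood(data, log_prior, log_prob)
```

    The objective depends on the parameters of every class at once through the normalizing sum, which is why the abstract describes tuning the models jointly rather than fitting each class model separately.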


    Discriminative training of language models for text and speech classification
    4.
    Type: Granted patent
    Legal status: In force

    Publication number: US07379867B2

    Publication date: 2008-05-27

    Application number: US10453349

    Filing date: 2003-06-03

    IPC classification: G06F17/27, G10L15/00

    Abstract: Methods are disclosed for estimating language models such that the conditional likelihood of a class given a word string, which is very well correlated with classification accuracy, is maximized. The methods comprise tuning statistical language model parameters jointly for all classes such that a classifier discriminates between the correct class and the incorrect ones for a given training sentence or utterance. Specific embodiments of the present invention pertain to implementation of the rational function growth transform in the context of a discriminative training technique for n-gram classifiers.


    Conditional maximum likelihood estimation of naïve bayes probability models
    9.
    Type: Granted patent
    Legal status: In force

    Publication number: US07624006B2

    Publication date: 2009-11-24

    Application number: US10941399

    Filing date: 2004-09-15

    IPC classification: G06F17/27, G06F17/20, G06F17/30

    Abstract: A statistical classifier is constructed by estimating Naïve Bayes classifiers such that the conditional likelihood of class given word sequence is maximized. The classifier is constructed using a rational function growth transform implemented for Naïve Bayes classifiers. The estimation method tunes the model parameters jointly for all classes such that the classifier discriminates between the correct class and the incorrect ones for a given training sentence or utterance. Optional parameter smoothing and/or convergence speedup can be used to improve model performance. The classifier can be integrated into a speech utterance classification system or other natural language processing system.


    DISCRIMINATIVE TRAINING OF LANGUAGE MODELS FOR TEXT AND SPEECH CLASSIFICATION
    10.
    Type: Patent application
    Legal status: In force

    Publication number: US20080215311A1

    Publication date: 2008-09-04

    Application number: US12103035

    Filing date: 2008-04-15

    IPC classification: G06F17/27

    Abstract: Methods are disclosed for estimating language models such that the conditional likelihood of a class given a word string, which is very well correlated with classification accuracy, is maximized. The methods comprise tuning statistical language model parameters jointly for all classes such that a classifier discriminates between the correct class and the incorrect ones for a given training sentence or utterance. Specific embodiments of the present invention pertain to implementation of the rational function growth transform in the context of a discriminative training technique for n-gram classifiers.
