Patent search: ap:("Ciprian Chelba" OR "Thorsten Brants") AND inv:"Ciprian Chelba", page 1

1.

Granted invention patent
Representing n-gram language models for compact storage and fast retrieval (in force)

Publication No.: US08175878B1

Publication date: 2012-05-08

Application No.: US12968108

Filing date: 2010-12-14

Applicant: Ciprian Chelba, Thorsten Brants

Inventor: Ciprian Chelba, Thorsten Brants

IPC classification: G10L15/18, G10L15/06, G06F17/27

CPC classification: G06F17/2715, G06K9/723, G06K2209/01, G10L15/197

Abstract: Systems, methods, and apparatuses, including computer program products, are provided for representing language models. In some implementations, a computer-implemented method is provided. The method includes generating a compact language model including receiving a collection of n-grams from the corpus, each n-gram of the collection having a corresponding first probability of occurring in the corpus and generating a trie representing the collection of n-grams. The method also includes using the language model to identify a second probability of a particular string of words occurring.

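The abstract above describes storing n-grams and their probabilities in a trie and then using the model to score a string of words. The following is a minimal sketch of that general idea only, not the compact encoding claimed in the patent. The names `TrieNode`, `build_trie`, `lookup`, and `string_log_prob` are hypothetical, the toy probabilities are invented for illustration, and falling back to a shorter n-gram without a backoff weight is a simplifying assumption.

```python
import math


class TrieNode:
    """One trie node; `prob` is set only if an n-gram ends at this node."""

    def __init__(self):
        self.children = {}
        self.prob = None


def build_trie(ngram_probs):
    """Insert every n-gram (a tuple of words) with its probability."""
    root = TrieNode()
    for ngram, prob in ngram_probs.items():
        node = root
        for word in ngram:
            node = node.children.setdefault(word, TrieNode())
        node.prob = prob
    return root


def lookup(root, ngram):
    """Return the stored probability of `ngram`, or None if it is absent."""
    node = root
    for word in ngram:
        node = node.children.get(word)
        if node is None:
            return None
    return node.prob


def string_log_prob(root, words, order=2):
    """Chain-rule log probability, trying the longest stored n-gram first.

    Assumption: a missing n-gram falls back to a shorter one with no
    backoff penalty, which a real language model would apply.
    """
    total = 0.0
    for i in range(len(words)):
        prob = None
        for start in range(max(0, i - order + 1), i + 1):
            prob = lookup(root, tuple(words[start:i + 1]))
            if prob is not None:
                break
        if prob is None:
            return float("-inf")  # unknown word
        total += math.log(prob)
    return total


# Toy model: unigram probabilities and conditional bigram probabilities.
NGRAMS = {
    ("the",): 0.5,
    ("cat",): 0.25,
    ("sat",): 0.25,
    ("the", "cat"): 0.5,
    ("cat", "sat"): 0.5,
}
ROOT = build_trie(NGRAMS)
```

With this toy model, `string_log_prob(ROOT, ["the", "cat", "sat"])` multiplies P(the), P(cat | the), and P(sat | cat), while a word absent from the trie yields negative infinity.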

2.

Granted invention patent
Representing n-gram language models for compact storage and fast retrieval (in force)

Publication No.: US07877258B1

Publication date: 2011-01-25

Application No.: US11693613

Filing date: 2007-03-29

Applicant: Ciprian Chelba, Thorsten Brants

Inventor: Ciprian Chelba, Thorsten Brants

IPC classification: G10L15/18, G10L15/06, G06F17/27

CPC classification: G06F17/2715, G06K9/723, G06K2209/01, G10L15/197

Abstract: Systems, methods, and apparatuses, including computer program products, are provided for representing language models. In some implementations, a computer-implemented method is provided. The method includes generating a compact language model including receiving a collection of n-grams from the corpus, each n-gram of the collection having a corresponding first probability of occurring in the corpus and generating a trie representing the collection of n-grams. The method also includes using the language model to identify a second probability of a particular string of words occurring.

3.

Granted invention patent
Discriminative training of language models for text and speech classification (in force)

Publication No.: US08306818B2

Publication date: 2012-11-06

Application No.: US12103035

Filing date: 2008-04-15

Applicant: Ciprian Chelba, Alejandro Acero, Milind Mahajan

Inventor: Ciprian Chelba, Alejandro Acero, Milind Mahajan

IPC classification: G10L15/00, G06F17/27

CPC classification: G06F17/2715, G06F17/2818, G10L15/183, G10L15/197

Abstract: Methods are disclosed for estimating language models such that the conditional likelihood of a class given a word string, which is very well correlated with classification accuracy, is maximized. The methods comprise tuning statistical language model parameters jointly for all classes such that a classifier discriminates between the correct class and the incorrect ones for a given training sentence or utterance. Specific embodiments of the present invention pertain to implementation of the rational function growth transform in the context of a discriminative training technique for n-gram classifiers.

4.

Granted invention patent
Generic spelling mnemonics (lapsed)

Publication No.: US07765102B2

Publication date: 2010-07-27

Application No.: US12171309

Filing date: 2008-07-11

Applicant: David Mowatt, Robert Chambers, Ciprian Chelba, Qiang Wu

Inventor: David Mowatt, Robert Chambers, Ciprian Chelba, Qiang Wu

IPC classification: G10L15/00

CPC classification: G10L15/183

Abstract: A system and method for creating a mnemonics Language Model for use with a speech recognition software application, wherein the method includes generating an n-gram Language Model containing a predefined large body of characters, wherein the n-gram Language Model includes at least one character from the predefined large body of characters, constructing a new Language Model (LM) token for each of the at least one character, extracting pronunciations for each of the at least one character responsive to a predefined pronunciation dictionary to obtain a character pronunciation representation, creating at least one alternative pronunciation for each of the at least one character responsive to the character pronunciation representation to create an alternative pronunciation dictionary and compiling the n-gram Language Model for use with the speech recognition software application, wherein compiling the Language Model is responsive to the new Language Model token and the alternative pronunciation dictionary.

5.

Granted invention patent
Discriminative training of language models for text and speech classification (in force)

Publication No.: US07379867B2

Publication date: 2008-05-27

Application No.: US10453349

Filing date: 2003-06-03

Applicant: Ciprian Chelba, Alejandro Acero, Milind Mahajan

Inventor: Ciprian Chelba, Alejandro Acero, Milind Mahajan

IPC classification: G06F17/27, G10L15/00

CPC classification: G06F17/2715, G06F17/2818, G10L15/183, G10L15/197

Abstract: Methods are disclosed for estimating language models such that the conditional likelihood of a class given a word string, which is very well correlated with classification accuracy, is maximized. The methods comprise tuning statistical language model parameters jointly for all classes such that a classifier discriminates between the correct class and the incorrect ones for a given training sentence or utterance. Specific embodiments of the present invention pertain to implementation of the rational function growth transform in the context of a discriminative training technique for n-gram classifiers.

6.

Invention patent application
Representation of a deleted interpolation N-gram language model in ARPA standard format (in force)

Publication No.: US20050216265A1

Publication date: 2005-09-29

Application No.: US10810254

Filing date: 2004-03-26

Applicant: Ciprian Chelba, Milind Mahajan, Alejandro Acero

Inventor: Ciprian Chelba, Milind Mahajan, Alejandro Acero

IPC classification: G06F17/27, G06F17/28, G10L15/00, G10L15/18

CPC classification: G06F17/277, G10L15/197

Abstract: A method and apparatus are provided for storing parameters of a deleted interpolation language model as parameters of a backoff language model. In particular, the parameters of the deleted interpolation language model are stored in the standard ARPA format. Under one embodiment, the deleted interpolation language model parameters are formed using fractional counts.

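To make the idea in this abstract concrete, here is a minimal bigram sketch of one way interpolation parameters can be rewritten as backoff parameters and printed in ARPA-style text. It is an illustration under stated assumptions, not the method claimed in the application: the interpolated model is taken to be P(w | h) = (1 - lam[h]) * f(w | h) + lam[h] * P(w), so an unseen bigram reduces to lam[h] * P(w) and lam[h] can serve as the backoff weight of h. All function names and numbers are hypothetical, and fractional counts are not modeled.

```python
import math


def interpolation_to_backoff(unigram_p, bigram_rel_freq, lam):
    """Rewrite an interpolated bigram model as backoff parameters.

    Seen bigrams store the full interpolated probability; the weight
    lam[h] on the unigram term becomes the backoff weight of context h.
    """
    bigram_p = {
        (h, w): (1.0 - lam[h]) * f + lam[h] * unigram_p[w]
        for (h, w), f in bigram_rel_freq.items()
    }
    backoff = dict(lam)
    return bigram_p, backoff


def backoff_prob(h, w, unigram_p, bigram_p, backoff):
    """Standard backoff lookup: explicit bigram, else weight * unigram."""
    if (h, w) in bigram_p:
        return bigram_p[(h, w)]
    return backoff.get(h, 1.0) * unigram_p[w]


def to_arpa(unigram_p, bigram_p, backoff):
    """Serialize to ARPA-style text (log10 probabilities, tab-separated)."""
    lines = ["\\data\\", f"ngram 1={len(unigram_p)}",
             f"ngram 2={len(bigram_p)}", "", "\\1-grams:"]
    for w in sorted(unigram_p):
        line = f"{math.log10(unigram_p[w]):.4f}\t{w}"
        if w in backoff:  # backoff weight is the optional third column
            line += f"\t{math.log10(backoff[w]):.4f}"
        lines.append(line)
    lines += ["", "\\2-grams:"]
    for h, w in sorted(bigram_p):
        lines.append(f"{math.log10(bigram_p[(h, w)]):.4f}\t{h} {w}")
    lines += ["", "\\end\\"]
    return "\n".join(lines)


# Toy parameters, invented for illustration.
UNIGRAM_P = {"a": 0.5, "b": 0.3, "c": 0.2}
BIGRAM_REL_FREQ = {("a", "b"): 0.75, ("a", "c"): 0.25}
LAM = {"a": 0.4}

BIGRAM_P, BACKOFF = interpolation_to_backoff(UNIGRAM_P, BIGRAM_REL_FREQ, LAM)
ARPA_TEXT = to_arpa(UNIGRAM_P, BIGRAM_P, BACKOFF)
```

Under this assumption the backoff form gives the same probabilities as the interpolated form for both seen and unseen bigrams, so the distribution over the next word after `"a"` still sums to one.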

7.

Granted invention patent
System for using statistical classifiers for spoken language understanding (in force)

Publication No.: US08335683B2

Publication date: 2012-12-18

Application No.: US10350199

Filing date: 2003-01-23

Applicant: Alejandro Acero, Ciprian Chelba, Ye-Yi Wang, Leon Wong, Brendan Frey

Inventor: Alejandro Acero, Ciprian Chelba, Ye-Yi Wang, Leon Wong, Brendan Frey

IPC classification: G06F17/27

CPC classification: G06F17/2715

Abstract: The present invention involves using one or more statistical classifiers in order to perform task classification on natural language inputs. In another embodiment, the statistical classifiers can be used in conjunction with a rule-based classifier to perform task classification.

8.

Granted invention patent
Conditional maximum likelihood estimation of naïve bayes probability models (in force)

Publication No.: US07624006B2

Publication date: 2009-11-24

Application No.: US10941399

Filing date: 2004-09-15

Applicant: Ciprian Chelba, Alejandro Acero

Inventor: Ciprian Chelba, Alejandro Acero

IPC classification: G06F17/27, G06F17/20, G06F17/30

CPC classification: G10L15/1822, G06N7/005, Y10S707/99936

Abstract: A statistical classifier is constructed by estimating Naïve Bayes classifiers such that the conditional likelihood of class given word sequence is maximized. The classifier is constructed using a rational function growth transform implemented for Naïve Bayes classifiers. The estimation method tunes the model parameters jointly for all classes such that the classifier discriminates between the correct class and the incorrect ones for a given training sentence or utterance. Optional parameter smoothing and/or convergence speedup can be used to improve model performance. The classifier can be integrated into a speech utterance classification system or other natural language processing system.

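The quantity this abstract says is maximized, the conditional likelihood of the class given the word sequence, can be written down for a toy Naïve Bayes model as follows. This sketch only evaluates that objective; it does not implement the rational function growth transform or any other training procedure from the patent. The function names, class labels, and probabilities are hypothetical.

```python
import math


def class_posteriors(words, prior, word_given_class):
    """P(class | words) under Naive Bayes, normalized across all classes."""
    scores = {
        c: math.log(prior[c])
        + sum(math.log(word_given_class[c][w]) for w in words)
        for c in prior
    }
    # Log-sum-exp keeps the normalization numerically stable.
    top = max(scores.values())
    log_z = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    return {c: math.exp(s - log_z) for c, s in scores.items()}


def conditional_log_likelihood(data, prior, word_given_class):
    """Sum of log P(correct class | words) over labeled training examples."""
    return sum(
        math.log(class_posteriors(words, prior, word_given_class)[label])
        for words, label in data
    )


# Toy two-class model, invented for illustration.
PRIOR = {"flight": 0.5, "hotel": 0.5}
WORD_GIVEN_CLASS = {
    "flight": {"book": 0.5, "plane": 0.4, "room": 0.1},
    "hotel": {"book": 0.5, "plane": 0.1, "room": 0.4},
}
DATA = [(["book", "plane"], "flight"), (["book", "room"], "hotel")]

POSTERIOR = class_posteriors(["book", "plane"], PRIOR, WORD_GIVEN_CLASS)
CLL = conditional_log_likelihood(DATA, PRIOR, WORD_GIVEN_CLASS)
```

Because the posterior is normalized over all classes, raising the score of the correct class relative to the incorrect ones raises this objective, which is the sense in which the parameters are tuned jointly for all classes.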

9.

Invention patent application
DISCRIMINATIVE TRAINING OF LANGUAGE MODELS FOR TEXT AND SPEECH CLASSIFICATION (in force)

Publication No.: US20080215311A1

Publication date: 2008-09-04

Application No.: US12103035

Filing date: 2008-04-15

Applicant: Ciprian Chelba, Alejandro Acero, Milind Mahajan

Inventor: Ciprian Chelba, Alejandro Acero, Milind Mahajan

IPC classification: G06F17/27

CPC classification: G06F17/2715, G06F17/2818, G10L15/183, G10L15/197

Abstract: Methods are disclosed for estimating language models such that the conditional likelihood of a class given a word string, which is very well correlated with classification accuracy, is maximized. The methods comprise tuning statistical language model parameters jointly for all classes such that a classifier discriminates between the correct class and the incorrect ones for a given training sentence or utterance. Specific embodiments of the present invention pertain to implementation of the rational function growth transform in the context of a discriminative training technique for n-gram classifiers.

10.

Invention patent application
Generic spelling mnemonics (lapsed)

Publication No.: US20060111907A1

Publication date: 2006-05-25

Application No.: US10996732

Filing date: 2004-11-24

Applicant: David Mowatt, Robert Chambers, Ciprian Chelba, Qiang Wu

Inventor: David Mowatt, Robert Chambers, Ciprian Chelba, Qiang Wu

IPC classification: G10L15/18

CPC classification: G10L15/183

Abstract: A system and method for creating a mnemonics Language Model for use with a speech recognition software application, wherein the method includes generating an n-gram Language Model containing a predefined large body of characters, wherein the n-gram Language Model includes at least one character from the predefined large body of characters, constructing a new language Model (LM) token for each of the at least one character, extracting pronunciations for each of the at least one character responsive to a predefined pronunciation dictionary to obtain a character pronunciation representation, creating at least one alternative pronunciation for each of the at least one character responsive to the character pronunciation representation to create an alternative pronunciation dictionary and compiling the n-gram Language Model for use with the speech recognition software application, wherein compiling the Language Model is responsive to the new Language Model token and the alternative pronunciation dictionary.
