Patent search: ap:("Xiaodong He" OR "Jianfeng Gao" OR "Jennifer Gillenwater") AND inv:"Jianfeng Gao" (page 8)

71.

Invention application
Collocation translation from monolingual and available bilingual corpora (status: under examination, published)

Publication No.: US20060282255A1

Publication date: 2006-12-14

Application No.: US11152540

Filing date: 2005-06-14

Applicants: Yajuan Lu, Jianfeng Gao, Ming Zhou, John Chen, Mu Li

Inventors: Yajuan Lu, Jianfeng Gao, Ming Zhou, John Chen, Mu Li

IPC classification: G06F17/28

CPC classification: G06F17/2827

Abstract: A system and method of extracting collocation translations are presented. The methods include constructing a collocation translation model using monolingual source and target language corpora as well as a bilingual corpus, if available. The collocation translation model employs an expectation maximization algorithm with respect to contextual words surrounding collocations. The collocation translation model can be used later to extract a collocation translation dictionary. Optional filters based on context redundancy and/or a bi-directional translation constraint can be used to ensure that only highly reliable collocation translations are included in the dictionary. The constructed collocation translation model and the extracted collocation translation dictionary can be used later for further natural language processing, such as sentence translation.

72.

Invention application
Discriminative training for language modeling (status: in force)

Publication No.: US20060277033A1

Publication date: 2006-12-07

Application No.: US11142432

Filing date: 2005-06-01

Applicants: Jianfeng Gao, Hisami Suzuki

Inventors: Jianfeng Gao, Hisami Suzuki

IPC classification: G06F17/21

CPC classification: G10L15/063, G10L15/197

Abstract: A method of training language model parameters trains discriminative model parameters in the language model based on a performance measure having discrete values.

73.

Invention application
Method and Apparatus for Generating and Managing a Language Model Data Structure (status: lapsed)

Publication No.: US20060184341A1

Publication date: 2006-08-17

Application No.: US11276292

Filing date: 2006-02-22

Applicants: Shuo Di, Kai-Fu Lee, Lee-Feng Chien, Zheng Chen, Jianfeng Gao

Inventors: Shuo Di, Kai-Fu Lee, Lee-Feng Chien, Zheng Chen, Jianfeng Gao

IPC classification: G06F17/10

CPC classification: G06F17/27, G10L15/285

Abstract: A method is presented comprising assigning each of a plurality of segments comprising a received corpus to a node in a data structure denoting dependencies between nodes, and calculating a transitional probability between each of the nodes in the data structure.

74.

Invention application
Method and apparatus for distribution-based language model adaptation (status: in force)

Publication No.: US20060009965A1

Publication date: 2006-01-12

Application No.: US11225543

Filing date: 2005-09-13

Applicants: Jianfeng Gao, Mingjing Li

Inventors: Jianfeng Gao, Mingjing Li

IPC classification: G06F17/27

CPC classification: G06F17/2715, G10L15/065, G10L15/1815

Abstract: A method and apparatus are provided for adapting a language model to a task-specific domain. Under the method and apparatus, the relative frequency of n-grams in a small training set (i.e., task-specific training data set) and the relative frequency of n-grams in a large training set (i.e., out-of-domain training data set) are used to weight a distribution count of n-grams in the large training set. The weighted distributions are then used to form a modified language model by identifying probabilities for n-grams from the weighted distributions.
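
Illustrative sketch (not part of the patent record): the abstract above does not give the weighting formula, so the following Python assumes one plausible reading, in which each large-set count is scaled by the ratio of the small-set to large-set relative frequency. The names `adapt_counts`, `beta`, and `floor` are hypothetical, and the final normalization is over all n-grams rather than per history, which is a simplification.

```python
from collections import Counter


def relative_frequencies(counts):
    """Turn raw n-gram counts into relative frequencies."""
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()}


def adapt_counts(small_counts, large_counts, beta=0.5, floor=1e-6):
    """Reweight large-set counts by the in-domain / out-of-domain frequency ratio.

    beta and floor are hypothetical knobs, not taken from the abstract:
    beta damps the ratio, floor stands in for n-grams unseen in the small set.
    """
    rel_small = relative_frequencies(small_counts)
    rel_large = relative_frequencies(large_counts)
    weighted = {}
    for gram, count in large_counts.items():
        ratio = rel_small.get(gram, floor) / rel_large[gram]
        weighted[gram] = count * ratio ** beta
    return weighted


def to_probabilities(weighted):
    """Normalize weighted counts over all n-grams (joint, not per-history)."""
    total = sum(weighted.values())
    return {gram: w / total for gram, w in weighted.items()}


# Toy data: the small (task-specific) set favors "book flight".
small = Counter({("book", "flight"): 3, ("the", "flight"): 1})
large = Counter({("book", "flight"): 1, ("the", "flight"): 1, ("the", "book"): 8})
adapted = to_probabilities(adapt_counts(small, large))
```

In this toy example the bigram that dominates the small set ends up with the largest adapted probability even though it is rare in the large set.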

75.

Invention grant
System and iterative method for lexicon, segmentation and language model joint optimization (status: in force)

Publication No.: US06904402B1

Publication date: 2005-06-07

Application No.: US09609202

Filing date: 2000-06-30

Applicants: Hai-Feng Wang, Chang-Ning Huang, Kai-Fu Lee, Shuo Di, Jianfeng Gao, Dong-Feng Cai, Lee-Feng Chien

Inventors: Hai-Feng Wang, Chang-Ning Huang, Kai-Fu Lee, Shuo Di, Jianfeng Gao, Dong-Feng Cai, Lee-Feng Chien

IPC classification: G06F17/28, G06F17/27, G10L15/06, G10L15/18, G06F17/21, G06F17/20, G10L15/00

CPC classification: G06F17/274, G10L15/197

Abstract: A method for optimizing a language model is presented comprising developing an initial language model from a lexicon and segmentation derived from a received corpus using a maximum match technique, and iteratively refining the initial language model by dynamically updating the lexicon and re-segmenting the corpus according to statistical principles until a threshold of predictive capability is achieved.
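
Illustrative sketch (not part of the patent record): the "maximum match technique" named in the abstract is commonly understood as greedy longest-match segmentation against a lexicon. The Python below shows the forward variant only, as an assumption about what the initial segmentation step might look like; the function name `max_match` and the single-character fallback are illustrative choices, and the iterative refinement described in the abstract is not covered.

```python
def max_match(text, lexicon, max_len=None):
    """Greedy forward maximum matching: always take the longest lexicon entry."""
    if max_len is None:
        max_len = max((len(word) for word in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


segments = max_match("thecatsat", {"the", "cat", "cats", "sat", "at"})
```

Here the greedy rule yields `["the", "cats", "at"]` rather than "the cat sat", which illustrates why an initial maximum-match segmentation may benefit from later re-segmentation.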

76.

Invention grant
Search query and document-related data translation (status: in force)

Publication No.: US09501759B2

Publication date: 2016-11-22

Application No.: US13328924

Filing date: 2011-12-16

Applicants: Jianfeng Gao, Xuedong Huang, Mei Li, Zhenghao Wang, Christopher John Brockett, William B. Dolan

Inventors: Jianfeng Gao, Xuedong Huang, Mei Li, Zhenghao Wang, Christopher John Brockett, William B. Dolan

IPC classification: G06F17/30, G06F7/00, G06Q10/10, G06Q30/02

CPC classification: G06Q10/10, G06F17/30687, G06Q30/0241

Abstract: The subject disclosure is directed towards developing a translation model for mapping search query terms to document-related data. By processing user logs comprising search histories into word-aligned query-document pairs, the translation model may be trained using data, such as probabilities, corresponding to the word-aligned query-document pairs. After incorporating the translation model into model data for a search engine, the translation model may be used as features for producing relevance scores for current search queries and ranking documents/advertisements according to relevance.
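
Illustrative sketch (not part of the patent record): the abstract does not specify how probabilities are derived from the word-aligned pairs, so the Python below assumes simple relative-frequency estimates of P(query word | document word) and an averaged log-probability feature. The function names, the `epsilon` smoothing constant, and the toy pairs are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_translation_table(aligned_pairs):
    """Estimate P(query_word | doc_word) by relative frequency over aligned pairs."""
    pairs = list(aligned_pairs)
    pair_counts = Counter(pairs)
    doc_totals = Counter(doc_word for _, doc_word in pairs)
    table = defaultdict(dict)
    for (query_word, doc_word), count in pair_counts.items():
        table[doc_word][query_word] = count / doc_totals[doc_word]
    return table


def translation_feature(query, doc_words, table, epsilon=1e-6):
    """Log-probability style feature; epsilon is an assumed smoothing constant."""
    if not doc_words:
        return len(query) * math.log(epsilon)
    score = 0.0
    for query_word in query:
        # Average the translation probability over the document-side words.
        p = sum(table.get(d, {}).get(query_word, 0.0) for d in doc_words) / len(doc_words)
        score += math.log(p + epsilon)
    return score


# Toy word-aligned (query word, document word) pairs.
pairs = [("cheap", "discount"), ("cheap", "discount"),
         ("deal", "discount"), ("flights", "airfare")]
table = train_translation_table(pairs)
related = translation_feature(["cheap", "flights"], ["discount", "airfare"], table)
unrelated = translation_feature(["cheap", "flights"], ["weather", "report"], table)
```

A score such as `related` could then serve as one input feature to a ranking function, alongside whatever other relevance signals the search engine uses.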

77.

Invention grant
Universal text input (status: in force)

Publication No.: US08738356B2

Publication date: 2014-05-27

Application No.: US13110484

Filing date: 2011-05-18

Applicants: Hisami Suzuki, Vikram Dendi, Christopher Brian Quirk, Pallavi Choudhury, Jianfeng Gao, Achraf Chalabi

Inventors: Hisami Suzuki, Vikram Dendi, Christopher Brian Quirk, Pallavi Choudhury, Jianfeng Gao, Achraf Chalabi

IPC classification: G06F17/28

CPC classification: G06F17/27

Abstract: The universal text input technique described herein addresses the difficulties of typing text in various languages and scripts, and offers a unified solution, which combines character conversion, next word prediction, spelling correction and automatic script switching to make it extremely simple to type any language from any device. The technique provides a rich and seamless input experience in any language through a universal IME (input method editor). It allows a user to type in any script for any language using a regular QWERTY keyboard via phonetic input and at the same time allows for auto-completion and spelling correction of words and phrases while typing. The technique also provides a modeless input that automatically turns on and off an input mode that changes between different types of script.

78.

Invention application
Search Query and Document-Related Data Translation (status: in force)

Publication No.: US20130103493A1

Publication date: 2013-04-25

Application No.: US13328924

Filing date: 2011-12-16

Applicants: Jianfeng Gao, Xuedong Huang, Mei Li, Zhenghao Wang, Christopher John Brockett, William B. Dolan

Inventors: Jianfeng Gao, Xuedong Huang, Mei Li, Zhenghao Wang, Christopher John Brockett, William B. Dolan

IPC classification: G06Q30/02, G06F17/30

CPC classification: G06Q10/10, G06F17/30687, G06Q30/0241

Abstract: The subject disclosure is directed towards developing a translation model for mapping search query terms to document-related data. By processing user logs comprising search histories into word-aligned query-document pairs, the translation model may be trained using data, such as probabilities, corresponding to the word-aligned query-document pairs. After incorporating the translation model into model data for a search engine, the translation model may be used as features for producing relevance scores for current search queries and ranking documents/advertisements according to relevance.

79.

Invention grant
Web-based proofing and usage guidance (status: in force)

Publication No.: US07991609B2

Publication date: 2011-08-02

Application No.: US11713073

Filing date: 2007-02-28

Applicants: Chris Brockett, William Dolan, Michael Gamon, Jianfeng Gao, Lucy Vanderwende, Hsiao-Wen Hon, Ming Zhou, Gary Kacmarcik, Alexandre Klementiev

Inventors: Chris Brockett, William Dolan, Michael Gamon, Jianfeng Gao, Lucy Vanderwende, Hsiao-Wen Hon, Ming Zhou, Gary Kacmarcik, Alexandre Klementiev

IPC classification: G06F17/27

CPC classification: G06F17/274, G06F17/273

Abstract: A system is disclosed for checking grammar and usage using a flexible portfolio of different mechanisms, and automatically providing a variety of different examples of standard usage, selected from analogous Web content. The system can be used for checking the grammar and usage in any application that involves natural language text, such as word processing, email, and presentation applications. The grammar and usage can be evaluated using several complementary evaluation modules, which may include one based on a trained classifier, one based on regular expressions, and one based on comparative searches of the Web or a local corpus. The evaluation modules can provide a set of suggested alternative segments with corrected grammar and usage. A follow-up, screened Web search based on the alternative segments, in context, may provide several different in-context examples of proper grammar and usage that the user can consider and select from.

80.

Invention grant
Ranker selection for statistical natural language processing (status: in force)

Publication No.: US07844555B2

Publication date: 2010-11-30

Application No.: US11938811

Filing date: 2007-11-13

Applicants: Jianfeng Gao, Galen Andrew, Mark Johnson, Kristina Toutanova

Inventors: Jianfeng Gao, Galen Andrew, Mark Johnson, Kristina Toutanova

IPC classification: G06F15/18, G06E1/00, G06E3/00, G06G7/00, G06N3/02

CPC classification: G06F17/2715

Abstract: Systems and methods for selecting a ranker for statistical natural language processing are provided. One disclosed system includes a computer program configured to be executed on a computing device, the computer program comprising a data store including reference performance data for a plurality of candidate rankers, the reference performance data being calculated based on a processing of test data by each of the plurality of candidate rankers. The system may further include a ranker selector configured to receive a statistical natural language processing task and a performance target, and determine a selected ranker from the plurality of candidate rankers based on the statistical natural language processing task, the performance target, and the reference performance data.
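
Illustrative sketch (not part of the patent record): the abstract describes a selector that takes a task, a performance target, and stored reference performance data. The Python below is one minimal way such a selector might be arranged; the record fields (`accuracy`, `train_seconds`), the tie-breaking rule (cheapest eligible ranker), and the ranker names are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ReferencePerformance:
    """One row of reference data; the fields are hypothetical."""
    ranker: str
    task: str
    accuracy: float
    train_seconds: float


def select_ranker(task: str, min_accuracy: float,
                  reference_data: Iterable[ReferencePerformance]) -> Optional[str]:
    """Return the cheapest ranker that meets the accuracy target for the task."""
    eligible = [row for row in reference_data
                if row.task == task and row.accuracy >= min_accuracy]
    if not eligible:
        return None  # no candidate satisfies the performance target
    return min(eligible, key=lambda row: row.train_seconds).ranker


# Placeholder reference data, not measured results.
REFERENCE_DATA = [
    ReferencePerformance("ranker_a", "parsing", 0.88, 120.0),
    ReferencePerformance("ranker_b", "parsing", 0.91, 900.0),
    ReferencePerformance("ranker_a", "segmentation", 0.95, 60.0),
]
```

With this placeholder data, a parsing target of 0.85 selects `ranker_a`, a target of 0.90 selects `ranker_b`, and a target of 0.99 returns `None`.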
