Patent search: ap:("Jianfeng Gao" OR "Kai-Fu Lee" OR "Mingjing Li" OR "Hai-Feng Wang" OR "Dong-Feng Cai" OR "Lee-Feng Chien") AND inv:"Mingjing Li" (page 1)

1.

Granted invention patent
System and method for joint optimization of language model performance and size (in force)

Publication No.: US07275029B1

Publication date: 2007-09-25

Application No.: US09607786

Filing date: 2000-06-30

Applicants: Jianfeng Gao, Kai-Fu Lee, Mingjing Li, Hai-Feng Wang, Dong-Feng Cai, Lee-Feng Chien

Inventors: Jianfeng Gao, Kai-Fu Lee, Mingjing Li, Hai-Feng Wang, Dong-Feng Cai, Lee-Feng Chien

IPC classification: G06F17/27

CPC classification: G06F17/2735, G06F17/274, G06F17/2818

Abstract: A method for the joint optimization of language model performance and size is presented, comprising: developing a language model from a tuning set of information; segmenting at least a subset of a received textual corpus and calculating a perplexity value for each segment; and refining the language model with one or more segments of the received corpus based, at least in part, on the calculated perplexity value for the one or more segments.
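
The steps in this abstract can be illustrated with a minimal Python sketch. It assumes a unigram model with add-one smoothing and a fixed perplexity threshold; neither choice comes from the patent, and the abstract does not say whether low- or high-perplexity segments are preferred. The names `train_unigram`, `perplexity`, and `threshold` are hypothetical.

```python
import math
from collections import Counter

def train_unigram(tokens):
    """Return a function giving add-one smoothed unigram probabilities."""
    counts = Counter(tokens)
    total = len(tokens)
    vocab = len(counts) + 1  # assumption: one extra slot for all unseen words
    return lambda w: (counts[w] + 1) / (total + vocab)

def perplexity(prob, segment):
    """Perplexity = exp of the average negative log-probability per token."""
    log_sum = sum(math.log(prob(w)) for w in segment)
    return math.exp(-log_sum / len(segment))

# Toy data, invented for illustration only.
tuning_set = "the cat sat on the mat".split()
segments = [
    "the cat sat".split(),
    "quantum flux capacitor".split(),
]

prob = train_unigram(tuning_set)
scores = [perplexity(prob, seg) for seg in segments]

# Hypothetical selection rule: keep segments the current model finds
# least surprising. The patent abstract does not specify this rule.
threshold = 10.0
selected = [seg for seg, ppl in zip(segments, scores) if ppl < threshold]
```

With this toy data the first segment scores about 5.2 and the second about 12, so only the first segment passes the threshold and would be used to refine the model.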

2.

Granted invention patent
Method and apparatus for adapting a class entity dictionary used with language models (in force)

Publication No.: US07124080B2

Publication date: 2006-10-17

Application No.: US10008432

Filing date: 2001-11-13

Applicants: Zheng Chen, Jianfeng Gao, Mingjing Li, Feng Zhang

Inventors: Zheng Chen, Jianfeng Gao, Mingjing Li, Feng Zhang

IPC classification: G10L15/06, G10L15/00

CPC classification: G06F17/2715, G06F17/2775

Abstract: A method and apparatus are provided for augmenting a language model with a class entity dictionary based on corrections made by a user. Under the method and apparatus, a user corrects an output that is based in part on the language model by replacing an output segment with a correct segment. The correct segment is added to a class of segments in the class entity dictionary and a probability of the correct segment given the class is estimated based on an n-gram probability associated with the output segment and an n-gram probability associated with the class. This estimated probability is then used to generate further outputs.

3.

Granted invention patent
Method and apparatus for distribution-based language model adaptation (in force)

Publication No.: US07254529B2

Publication date: 2007-08-07

Application No.: US11225543

Filing date: 2005-09-13

Applicants: Jianfeng Gao, Mingjing Li

Inventors: Jianfeng Gao, Mingjing Li

IPC classification: G06F17/27, G06F17/28, G10L15/00

CPC classification: G06F17/2715, G10L15/065, G10L15/1815

Abstract: A method and apparatus are provided for adapting a language model to a task-specific domain. Under the method and apparatus, the relative frequency of n-grams in a small training set (i.e., task-specific training data set) and the relative frequency of n-grams in a large training set (i.e., out-of-domain training data set) are used to weight a distribution count of n-grams in the large training set. The weighted distributions are then used to form a modified language model by identifying probabilities for n-grams from the weighted distributions.
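
A rough Python sketch of the weighting idea follows. The abstract does not give the weighting formula, so everything specific here is an assumption: the sketch uses unigrams instead of general n-grams, plain counts in place of the patent's "distribution count", and a hypothetical weight equal to the small-set to large-set frequency ratio raised to a tempering exponent `alpha`, with a `floor` for n-grams absent from the small set.

```python
from collections import Counter

def relative_freq(tokens):
    """Relative frequency of each token in a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def adapt(small_tokens, large_tokens, alpha=0.5, floor=1e-3):
    """Reweight large-set counts toward the small (task-specific) set.

    alpha and floor are hypothetical knobs, not taken from the patent.
    """
    f_small = relative_freq(small_tokens)
    f_large = relative_freq(large_tokens)
    large_counts = Counter(large_tokens)
    weighted = {}
    for w, c in large_counts.items():
        ratio = max(f_small.get(w, 0.0), floor) / f_large[w]
        weighted[w] = c * ratio ** alpha
    total = sum(weighted.values())
    # Probabilities are read off the weighted counts by normalizing.
    return {w: v / total for w, v in weighted.items()}

# Toy data, invented for illustration only.
small = "book flight book hotel".split()
large = "book the the the flight of the hotel news news".split()

baseline = relative_freq(large)
adapted = adapt(small, large)
```

In this toy run the in-domain word "book" gains probability mass relative to the large-set estimate, while "the", which never occurs in the small set, loses it.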

4.

Granted invention patent
Language input system for mobile devices (in force)

Publication No.: US07277732B2

Publication date: 2007-10-02

Application No.: US09843358

Filing date: 2001-04-24

Applicants: Zheng Chen, Mingjing Li, Feng Zhang, Rui Yang, Jianfeng Gao

Inventors: Zheng Chen, Mingjing Li, Feng Zhang, Rui Yang, Jianfeng Gao

IPC classification: H04B1/38

CPC classification: G06F3/0236, G06F3/018, G06F3/0237, H04M1/72519, H04M2250/58, H04M2250/70

Abstract: A language system facilitates entry of an input string into a mobile device using discrete keys on a keypad, such as a 10-key keypad. The numeric keys have associated letters of an alphabet. The key input is representative of one or more Chinese phonetic characters. Based on this input string, the language system derives the most likely corresponding Chinese language characters intended by the user. The language system uses multiple different search engines and language models to aid in deriving the most probable Chinese language characters. When the language system recognizes possible Chinese language characters, the mobile device displays the possible Chinese language characters for user selection of the possible Chinese language characters and/or further input of one or more Chinese phonetic characters. In this manner, the language system adopts a modeless entry methodology that eliminates conventional mode switching between input and selection operations.
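
The key-to-character lookup described in this abstract can be sketched in Python as follows. The letter layout is the common 10-key phone layout; the lexicon, its scores, and the function names are hypothetical. The patent's multiple search engines and language models are replaced by a single table lookup with made-up scores.

```python
# Common 10-key phone layout: each digit stands for several letters.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}

# Hypothetical toy lexicon: pinyin syllable -> (character, made-up score).
LEXICON = {
    "hao": ("好", 0.4),
    "han": ("汉", 0.3),
    "gao": ("高", 0.2),
    "gan": ("干", 0.1),
    "ni": ("你", 0.5),
}

def to_digits(syllable):
    """Digit sequence a user would press to type this pinyin syllable."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in syllable)

def candidates(digits):
    """Characters whose pinyin matches the key sequence, best score first."""
    matches = [
        (score, char, syl)
        for syl, (char, score) in LEXICON.items()
        if to_digits(syl) == digits
    ]
    return [(char, syl) for score, char, syl in sorted(matches, reverse=True)]

ranked = candidates("426")
```

The key sequence 426 is ambiguous between "hao", "han", "gao", and "gan", so the sketch returns all four candidates in score order, which is the list a device would display for selection or further input.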

5.

Granted invention patent
Method and apparatus for distribution-based language model adaptation (lapsed)

Publication No.: US07043422B2

Publication date: 2006-05-09

Application No.: US09945930

Filing date: 2001-09-04

Applicants: Jianfeng Gao, Mingjing Li

Inventors: Jianfeng Gao, Mingjing Li

IPC classification: G06F17/27

CPC classification: G06F17/2715, G10L15/065, G10L15/1815

Abstract: A method and apparatus are provided for adapting a language model to a task-specific domain. Under the method and apparatus, the relative frequency of n-grams in a small training set (i.e., task-specific training data set) and the relative frequency of n-grams in a large training set (i.e., out-of-domain training data set) are used to weight a distribution count of n-grams in the large training set. The weighted distributions are then used to form a modified language model by identifying probabilities for n-grams from the weighted distributions.

6.

Invention patent application
Method and apparatus for distribution-based language model adaptation (in force)

Publication No.: US20060009965A1

Publication date: 2006-01-12

Application No.: US11225543

Filing date: 2005-09-13

Applicants: Jianfeng Gao, Mingjing Li

Inventors: Jianfeng Gao, Mingjing Li

IPC classification: G06F17/27

CPC classification: G06F17/2715, G10L15/065, G10L15/1815

Abstract: A method and apparatus are provided for adapting a language model to a task-specific domain. Under the method and apparatus, the relative frequency of n-grams in a small training set (i.e., task-specific training data set) and the relative frequency of n-grams in a large training set (i.e., out-of-domain training data set) are used to weight a distribution count of n-grams in the large training set. The weighted distributions are then used to form a modified language model by identifying probabilities for n-grams from the weighted distributions.

7.

Granted invention patent
Statistical approach to large-scale image annotation (in force)

Publication No.: US08594468B2

Publication date: 2013-11-26

Application No.: US13406804

Filing date: 2012-02-28

Applicants: Mingjing Li, Xiaoguang Rui

Inventors: Mingjing Li, Xiaoguang Rui

IPC classification: G06K9/60

CPC classification: G06K9/00684, G06K2209/27

Abstract: Statistical approaches to large-scale image annotation are described. Generally, the annotation technique includes compiling visual features and textual information from a number of images, hashing the images' visual features, and clustering the images based on their hash values. An example system builds statistical language models from the clustered images and annotates the image by applying one of the statistical language models.
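
The pipeline in this abstract (hash visual features, cluster by hash value, build a language model per cluster, annotate) can be sketched in Python. The abstract names neither the hash function nor the model type, so the sketch makes assumptions: the hash binarizes each feature against the vector's own mean, and each cluster's "language model" is a plain word-frequency table. All names and data are hypothetical.

```python
from collections import Counter, defaultdict

def hash_features(features):
    """Hypothetical hash: one bit per feature, set when above the mean."""
    mean = sum(features) / len(features)
    return tuple(int(x > mean) for x in features)

# Toy (visual features, surrounding text) pairs, invented for illustration.
images = [
    ([0.9, 0.1, 0.8, 0.2], "sunset beach sea"),
    ([0.8, 0.2, 0.9, 0.1], "beach sea sand"),
    ([0.1, 0.9, 0.2, 0.7], "city street night"),
]

# Images with the same hash value fall into the same cluster; each cluster
# accumulates word counts that serve as its unigram model.
clusters = defaultdict(Counter)
for features, text in images:
    clusters[hash_features(features)].update(text.split())

def annotate(features, top_k=2):
    """Annotate a new image with the top words of its cluster's model."""
    model = clusters.get(hash_features(features))
    if not model:
        return []
    return [word for word, _ in model.most_common(top_k)]

labels = annotate([0.7, 0.3, 0.6, 0.2])
```

Here the new image hashes to the same bucket as the two beach images, so it is annotated with the most frequent words of that cluster.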

8.

Granted invention patent
Estimating word correlations from images (in force)

Publication No.: US08457416B2

Publication date: 2013-06-04

Application No.: US11956333

Filing date: 2007-12-13

Applicants: Jing Liu, Bin Wang, Zhiwei Li, Mingjing Li, Wei-Ying Ma

Inventors: Jing Liu, Bin Wang, Zhiwei Li, Mingjing Li, Wei-Ying Ma

IPC classification: G06K9/72

CPC classification: G06F17/30247, G06F17/30731

Abstract: Word correlations are estimated using a content-based method, which uses visual features of image representations of the words. The image representations of the subject words may be generated by retrieving images from data sources (such as the Internet) using image search with the subject words as query words. One aspect of the techniques is based on calculating the visual distance or visual similarity between the sets of retrieved images corresponding to each query word. The other is based on calculating the visual consistency among the set of the retrieved images corresponding to a conjunctive query word. The combination of the content-based method and a text-based method may produce even better results.

9.

Granted invention patent
Bipartite graph reinforcement modeling to annotate web images (in force)

Publication No.: US08321424B2

Publication date: 2012-11-27

Application No.: US11848157

Filing date: 2007-08-30

Applicants: Mingjing Li, Wei-Ying Ma, Zhiwei Li, Xiaoguang Rui

Inventors: Mingjing Li, Wei-Ying Ma, Zhiwei Li, Xiaoguang Rui

IPC classification: G06F7/00, G06F17/30

CPC classification: G06F17/30265, G06F17/30864

Abstract: Systems and methods for bipartite graph reinforcement modeling to annotate web images are described. In one aspect, the systems and methods implement bipartite graph reinforcement modeling operations to identify a set of annotations that are relevant to a Web image. The systems and methods annotate the Web image with the identified annotations. The systems and methods then index the annotated Web image. Responsive to receiving an image search query from a user, wherein the image search query comprises information relevant to at least a subset of the identified annotations, the image search engine service presents the annotated Web image to the user.

10.

Invention patent application
CLASSIFICATION OF IMAGES AS ADVERTISEMENT IMAGES OR NON-ADVERTISEMENT IMAGES (in force)

Publication No.: US20110058734A1

Publication date: 2011-03-10

Application No.: US12945635

Filing date: 2010-11-12

Applicants: Mingjing Li, Zhiwei Li, Dongfang Li, Bin Wang

Inventors: Mingjing Li, Zhiwei Li, Dongfang Li, Bin Wang

IPC classification: G06K9/62

CPC classification: G06Q30/02, G06Q30/0277

Abstract: An advertisement image classification system trains a binary classifier to classify images as advertisement images or non-advertisement images and then uses the binary classifier to classify images of web pages as advertisement images or non-advertisement images. During a training phase, the classification system generates training data of feature vectors representing the images and labels indicating whether an image is an advertisement image or a non-advertisement image. The classification system trains a binary classifier to classify images using training data. During a classification phase, the classification system inputs a web page with an image and generates a feature vector for the image. The classification system then applies the trained binary classifier to the feature vector to generate a score indicating whether the image is an advertisement image or a non-advertisement image.
