Training a ranking function using propagated document relevance
    31.
    Granted invention patent
    Legal status: In force

    Publication number: US08001121B2

    Publication date: 2011-08-16

    Application number: US11364576

    Filing date: 2006-02-27

    IPC classification: G06F17/30

    CPC classification: G06F17/30657 G06F17/30864

    Abstract: A method and system for propagating the relevance of labeled documents to a query to unlabeled documents is provided. The propagation system provides training data that includes queries, documents labeled with their relevance to the queries, and unlabeled documents. The propagation system then calculates the similarity between pairs of documents in the training data. The propagation system then propagates the relevance of the labeled documents to similar, but unlabeled, documents. The propagation system may iteratively propagate labels of the documents until the labels converge on a solution. The training data with the propagated relevances can then be used to train a ranking function.
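
    The iterative propagation described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the update rule (each unlabeled document takes the similarity-weighted average of the other documents' scores, while labeled documents stay fixed) is an assumption, and the names `propagate_relevance`, `similarity`, and `labels` are hypothetical.

```python
def propagate_relevance(similarity, labels, tol=1e-9, max_iter=1000):
    """Spread relevance scores from labeled to unlabeled documents.

    similarity: square list of lists with non-negative pairwise similarities.
    labels: dict mapping a document index to its known relevance score.
    Returns one relevance score per document.
    """
    n = len(similarity)
    # Unlabeled documents start at 0.0 (an assumption of this sketch).
    scores = [labels.get(i, 0.0) for i in range(n)]
    for _ in range(max_iter):
        updated = list(scores)
        for i in range(n):
            if i in labels:
                continue  # labeled documents keep their given relevance
            total = sum(similarity[i][j] for j in range(n) if j != i)
            if total == 0:
                continue  # isolated document: nothing to propagate
            weighted = sum(similarity[i][j] * scores[j] for j in range(n) if j != i)
            updated[i] = weighted / total
        change = max(abs(a - b) for a, b in zip(updated, scores))
        scores = updated
        if change < tol:
            break  # labels have converged
    return scores


# Documents 0 and 1 are labeled; document 2 is unlabeled and mostly similar to 0.
similarity = [
    [1.0, 0.2, 0.9],
    [0.2, 1.0, 0.1],
    [0.9, 0.1, 1.0],
]
scores = propagate_relevance(similarity, {0: 1.0, 1: 0.0})
```

    The resulting scores could then serve as training targets for a ranking function, which is the use the abstract names.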

    SCORING RELEVANCE OF A DOCUMENT BASED ON IMAGE TEXT
    32.
    Invention patent application
    Legal status: In force

    Publication number: US20110087660A1

    Publication date: 2011-04-14

    Application number: US12972259

    Filing date: 2010-12-17

    IPC classification: G06F17/30

    CPC classification: G06F17/30864 G06F17/30265

    Abstract: A method and system for determining relevance of a document having text and images to a text string is provided. A scoring system identifies image text associated with an image of the document. The scoring system calculates an image score indicating relevance of the image text to the text string. The image score may be used in many applications, such as searching, summary generation, and document classification, image search, and image classification.
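
    A minimal sketch of an image score follows. The abstract does not say how the score is calculated, so the term-overlap measure, the blending of body and image scores, and the names `image_score`, `document_score`, and `image_weight` are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def image_score(image_text, query):
    """Fraction of query terms that appear in the image text (assumed measure)."""
    query_terms = set(tokenize(query))
    if not query_terms:
        return 0.0
    return len(query_terms & set(tokenize(image_text))) / len(query_terms)


def document_score(body_text, image_texts, query, image_weight=0.3):
    """Blend the body-text score with the best image score (assumed combination)."""
    body = image_score(body_text, query)
    best_image = max((image_score(t, query) for t in image_texts), default=0.0)
    return (1 - image_weight) * body + image_weight * best_image


full_match = image_score("Golden Gate Bridge at sunset", "golden gate bridge")
no_match = image_score("A cat", "golden gate")
combined = document_score("Bridge facts", ["Golden Gate Bridge"],
                          "golden gate bridge", image_weight=0.5)
```

    Here the image text could be a caption or alternate text; which sources count as image text is left open by the abstract.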

    Content object indexing using domain knowledge
    33.
    Granted invention patent
    Legal status: In force

    Publication number: US07698294B2

    Publication date: 2010-04-13

    Application number: US11275509

    Filing date: 2006-01-11

    IPC classification: G06F17/30

    CPC classification: G06F17/30613

    Abstract: A content object indexing process including creating a content object knowledge index, calculating a description vector of a target content object, and indexing the target content object by searching for the description vector in the content object knowledge database. It may be difficult to search for an exact content object such as a music file or academic researcher as a conventional search index may not include related hierarchical information. A content object indexing process may add hierarchical information taken from a content object knowledge index and incorporate the hierarchical information to the index entry for a specific content object. An application of such a content object indexing process may be a world wide web search engine.
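
    The three steps in this abstract (build a knowledge index, compute a description vector, look the vector up) can be sketched as below. The term-frequency vectors, the cosine-similarity lookup, and the names `description_vector`, `cosine`, and `index_content_object` are assumptions; the abstract does not specify the vector form or the search method.

```python
import math
from collections import Counter


def description_vector(text):
    """Term-frequency vector of a description (assumed vector form)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def index_content_object(description, knowledge_index):
    """Attach the best-matching hierarchical path to the object's index entry."""
    target = description_vector(description)
    best_path = max(knowledge_index,
                    key=lambda path: cosine(target, knowledge_index[path]))
    return {"description": description, "hierarchy": best_path}


# Hypothetical knowledge index: hierarchical path -> description vector.
knowledge_index = {
    ("Music", "Jazz"): description_vector("jazz saxophone trumpet improvisation"),
    ("Science", "Physics"): description_vector("quantum particle physics energy"),
}
entry = index_content_object("live saxophone jazz recording", knowledge_index)
```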

    Identifying important news reports from news home pages
    34.
    Granted invention patent
    Legal status: Lapsed

    Publication number: US07502789B2

    Publication date: 2009-03-10

    Application number: US11303443

    Filing date: 2005-12-15

    IPC classification: G06F17/30

    Abstract: A method and system for determining the importance of news events based on news reports published by news sources on news home pages is provided. A news system identifies news home pages that contain references to news reports on detail pages. The news system calculates the importance of a news event based on the importance of news reports reporting that news event. The news system may determine the importance of a news report based on the assumption that the importance of a news report is based on the credibility of news home pages and the importance of similar news reports. The news system may recursively define the importance of a news report based on the credibility of news home pages and the similarity to other news pages and the importance of a news home page based on the importance of its news reports and similar news reports.
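
    The recursive definition in this abstract resembles a mutual-reinforcement iteration, sketched below. This is an assumed stand-in, not the patented computation: the additive update, the normalization by the maximum value, and the names `rank_news`, `page_reports`, and `similarity` are hypothetical.

```python
def rank_news(page_reports, similarity, iterations=50):
    """Jointly estimate report importance and home-page credibility.

    page_reports: dict mapping a home page to the report ids it references.
    similarity: dict mapping (report_a, report_b) to a similarity in [0, 1].
    """
    reports = sorted({r for linked in page_reports.values() for r in linked})
    credibility = {page: 1.0 for page in page_reports}
    importance = {r: 1.0 for r in reports}
    for _ in range(iterations):
        raw = {}
        for r in reports:
            # Credibility of the home pages that reference this report.
            from_pages = sum(credibility[p]
                             for p, linked in page_reports.items() if r in linked)
            # Importance carried over from similar reports.
            from_similar = sum(similarity.get((r, other), 0.0) * importance[other]
                               for other in reports if other != r)
            raw[r] = from_pages + from_similar
        top = max(raw.values()) or 1.0
        importance = {r: v / top for r, v in raw.items()}
        # A home page is as credible as the reports it references are important.
        raw_cred = {p: sum(importance[r] for r in linked)
                    for p, linked in page_reports.items()}
        top = max(raw_cred.values()) or 1.0
        credibility = {p: v / top for p, v in raw_cred.items()}
    return importance, credibility


# Hypothetical home pages and report ids.
page_reports = {
    "a.example": ["r1", "r2"],
    "b.example": ["r1"],
    "c.example": ["r3"],
}
importance, credibility = rank_news(page_reports, similarity={})
```

    In this toy input, report r1 is referenced by two home pages and ends up the most important; the importance of a news event could then be aggregated from the reports that cover it.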

    Adding dominant media elements to search results
    35.
    Granted invention patent
    Legal status: In force

    Publication number: US07433895B2

    Publication date: 2008-10-07

    Application number: US11166775

    Filing date: 2005-06-24

    IPC classification: G06F17/30

    Abstract: A method and system for determining dominance of the media elements of display pages is provided. The dominance system provides a scoring mechanism for scoring the dominance of media elements of display pages based on features of each media element of the display page. To generate the scores for the media elements of the display page, the dominance system first identifies the media elements and then identifies the features of the media elements. The dominance system then scores the identified media elements using the provided scoring mechanism and the identified features.
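
    A feature-based scoring mechanism of the kind this abstract describes might look like the sketch below. The abstract names neither the features nor the scoring mechanism, so the linear weighted sum, the feature names (`area_fraction`, `above_fold`, `near_center`), and the weights are assumptions chosen for illustration.

```python
# Hypothetical feature weights; a real system might learn these from data.
WEIGHTS = {"area_fraction": 0.6, "above_fold": 0.2, "near_center": 0.2}


def dominance_score(features, weights=WEIGHTS):
    """Weighted sum of a media element's features (assumed scoring mechanism)."""
    return sum(weight * float(features.get(name, 0.0))
               for name, weight in weights.items())


def most_dominant(elements):
    """Return the media element with the highest dominance score."""
    return max(elements, key=lambda e: dominance_score(e["features"]))


# Media elements identified on a display page, with their identified features.
elements = [
    {"name": "icon.png",
     "features": {"area_fraction": 0.01, "above_fold": 1, "near_center": 0.1}},
    {"name": "hero.jpg",
     "features": {"area_fraction": 0.40, "above_fold": 1, "near_center": 0.9}},
]
winner = most_dominant(elements)
```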

    SCORING RELEVANCE OF A DOCUMENT BASED ON IMAGE TEXT
    36.
    Invention patent application
    Legal status: In force

    Publication number: US20080215561A1

    Publication date: 2008-09-04

    Application number: US11681161

    Filing date: 2007-03-01

    IPC classification: G06F17/30

    CPC classification: G06F17/30864 G06F17/30265

    Abstract: A method and system for determining relevance of a document having text and images to a text string is provided. A scoring system identifies image text associated with an image of the document. The scoring system calculates an image score indicating relevance of the image text to the text string. The image score may be used in many applications, such as searching, summary generation, and document classification, image search, and image classification.

    Propagating relevance from labeled documents to unlabeled documents
    37.
    Invention patent application
    Legal status: In force

    Publication number: US20070203940A1

    Publication date: 2007-08-30

    Application number: US11364807

    Filing date: 2006-02-27

    IPC classification: G06F17/00

    CPC classification: G06F17/30864

    Abstract: A method and system for propagating the relevance of labeled documents to a query to unlabeled documents is provided. The propagation system provides training data that includes queries, documents labeled with their relevance to the queries, and unlabeled documents. The propagation system then calculates the similarity between pairs of documents in the training data. The propagation system then propagates the relevance of the labeled documents to similar, but unlabeled, documents. The propagation system may iteratively propagate labels of the documents until the labels converge on a solution. The training data with the propagated relevances can then be used to train a ranking function.

    Content Object Indexing Using Domain Knowledge
    38.
    Invention patent application
    Legal status: In force

    Publication number: US20070162408A1

    Publication date: 2007-07-12

    Application number: US11275509

    Filing date: 2006-01-11

    IPC classification: G06N5/02

    CPC classification: G06F17/30613

    Abstract: A content object indexing process including creating a content object knowledge index, calculating a description vector of a target content object, and indexing the target content object by searching for the description vector in the content object knowledge database. It may be difficult to search for an exact content object such as a music file or academic researcher as a conventional search index may not include related hierarchical information. A content object indexing process may add hierarchical information taken from a content object knowledge index and incorporate the hierarchical information to the index entry for a specific content object. An application of such a content object indexing process may be a world wide web search engine.

    Adding dominant media elements to search results
    39.
    Invention patent application
    Legal status: In force

    Publication number: US20060294068A1

    Publication date: 2006-12-28

    Application number: US11166775

    Filing date: 2005-06-24

    IPC classification: G06F17/30

    Abstract: A method and system for determining dominance of the media elements of display pages is provided. The dominance system provides a scoring mechanism for scoring the dominance of media elements of display pages based on features of each media element of the display page. To generate the scores for the media elements of the display page, the dominance system first identifies the media elements and then identifies the features of the media elements. The dominance system then scores the identified media elements using the provided scoring mechanism and the identified features.
