-
Publication No.: US20100318531A1
Publication Date: 2010-12-16
Application No.: US12481593
Filing Date: 2009-06-10
Applicant: Jianfeng Gao, Xiao Li, Kefeng Deng, Wei Yuan, Jian-Yun Nie
Inventor: Jianfeng Gao, Xiao Li, Kefeng Deng, Wei Yuan, Jian-Yun Nie
CPC Classification: G06F16/337, G06F16/335, G06F16/355
Abstract: Described is a technology for using clickthrough data (e.g., based on data of a query log) in learning a ranking model that may be used in online ranking of search results. Clickthrough data, which is typically sparse (because many documents are never or rarely clicked), is processed/smoothed into smoothed clickthrough streams. The processing includes determining similar queries for a document with incomplete (insufficient) clickthrough data to provide expanded clickthrough data for that document, and/or estimating at least one clickthrough feature for a document when that document has missing (e.g., no) clickthrough data. Similar queries may be determined by random walk clustering and/or session-based query analysis. Features extracted from the clickthrough streams may be used to provide a ranking model, which may then be used in online ranking of documents that are located with respect to a query.
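The expansion step in the abstract above can be illustrated with a small sketch. It is not the patented method: it assumes a two-step random walk (query to document to query) over a click graph as the similarity measure, and the names `query_transition_probs`, `expand_clickthrough`, and the `threshold` parameter are hypothetical.

```python
from collections import defaultdict


def query_transition_probs(clicks):
    """Two-step random walk query -> document -> query on the click graph.

    `clicks` maps (query, doc) -> click count (an assumed input format).
    Returns probs[q1][q2], the probability of reaching q2 from q1.
    """
    q_total = defaultdict(float)
    d_total = defaultdict(float)
    for (q, d), c in clicks.items():
        q_total[q] += c
        d_total[d] += c

    probs = defaultdict(lambda: defaultdict(float))
    for (q1, d1), c1 in clicks.items():
        for (q2, d2), c2 in clicks.items():
            if d1 == d2:
                probs[q1][q2] += (c1 / q_total[q1]) * (c2 / d_total[d1])
    return probs


def expand_clickthrough(clicks, doc, threshold=0.1):
    """Return a smoothed query stream for `doc`.

    The stream holds the document's own queries plus similar queries,
    each weighted by its walk probability (hypothetical weighting).
    """
    probs = query_transition_probs(clicks)
    stream = defaultdict(float)
    for (q, d), c in clicks.items():
        if d != doc:
            continue
        stream[q] += c
        for q2, p in probs[q].items():
            if q2 != q and p >= threshold:
                stream[q2] += c * p
    return dict(stream)


# Toy log: "b.com" was clicked only once, for the query "airfare".
clicks = {
    ("cheap flights", "a.com"): 6,
    ("airfare", "a.com"): 3,
    ("airfare", "b.com"): 1,
}
stream = expand_clickthrough(clicks, "b.com")
```

Here "b.com" gains a weighted entry for "cheap flights" because that query shares a clicked document with "airfare". The quadratic loop is only suitable for toy data.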
-
Publication No.: US07725442B2
Publication Date: 2010-05-25
Application No.: US11672038
Filing Date: 2007-02-06
Applicant: Chin-Yew Lin, Jianfeng Gao, Guihong Cao, Jian-Yun Nie
Inventor: Chin-Yew Lin, Jianfeng Gao, Guihong Cao, Jian-Yun Nie
CPC Classification: G06F17/30719
Abstract: A probability distribution for a reference summary of a document is determined. The probability distribution for the reference summary is then used to generate a score for a machine-generated summary of the document.
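As a rough illustration of the idea in the abstract above, the sketch below estimates a word distribution from a reference summary and scores a candidate summary by its average log-probability under that distribution. The abstract does not specify the model; the unigram distribution, the additive smoothing constant `alpha`, and the function names are assumptions made for this example.

```python
import math
from collections import Counter


def reference_distribution(reference, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over `vocab` (assumed model)."""
    counts = Counter(reference.lower().split())
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def score_summary(candidate, reference, alpha=1.0):
    """Average log-probability of the candidate's words under the reference model.

    Higher (closer to zero) means the candidate looks more like the reference.
    """
    cand_tokens = candidate.lower().split()
    if not cand_tokens:
        return float("-inf")
    vocab = set(cand_tokens) | set(reference.lower().split())
    dist = reference_distribution(reference, vocab, alpha)
    return sum(math.log(dist[w]) for w in cand_tokens) / len(cand_tokens)


reference = "the cat sat on the mat"
close_score = score_summary("the cat sat", reference)
far_score = score_summary("dogs bark loudly", reference)
```

A candidate that reuses the reference's wording (`close_score`) scores higher than an unrelated one (`far_score`). Whitespace tokenization is used only to keep the example short.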
-
Publication No.: US07523105B2
Publication Date: 2009-04-21
Application No.: US11276308
Filing Date: 2006-02-23
Applicant: Ji-Rong Wen, Jian-Yun Nie, Mingjing Li, Hong-Jiang Zhang
Inventor: Ji-Rong Wen, Jian-Yun Nie, Mingjing Li, Hong-Jiang Zhang
IPC Classification: G06F17/30
CPC Classification: G06F17/30654, G06F17/30867, Y10S707/99932, Y10S707/99934, Y10S707/99935, Y10S707/99936, Y10S707/99937, Y10S707/99938, Y10S707/99939
Abstract: Systems and methods for clustering Web queries are described. In one aspect, one or more of a same document and a plurality of similar documents selected by a user in response to a plurality of queries is identified. Responsive to this identification, a query cluster is generated. The query cluster indicates that the queries are similar, independent of whether individual ones of the queries comprise similar composition with respect to other ones of the queries.
-
Publication No.: US07149732B2
Publication Date: 2006-12-12
Application No.: US09977171
Filing Date: 2001-10-12
Applicant: Ji-Rong Wen, Jian-Yun Nie, Ming-Jing Li, Hong-Jiang Zhang
Inventor: Ji-Rong Wen, Jian-Yun Nie, Ming-Jing Li, Hong-Jiang Zhang
IPC Classification: G06F17/30
CPC Classification: G06F17/30654, G06F17/30867, Y10S707/99932, Y10S707/99934, Y10S707/99935, Y10S707/99936, Y10S707/99937, Y10S707/99938, Y10S707/99939
Abstract: The described subject matter provides systems and procedures to make query similarity determinations, wherein the queries are used in information retrieval operations. A same document and/or multiple similar documents are identified that have been selected by a user in response to multiple queries. Responsive to identifying the same document and/or the similar documents, a query cluster is generated that indicates that the queries used to obtain the same and/or similar documents are similar. This is accomplished in a manner that is independent of whether individual ones of the queries are compositionally similar with respect to other ones of the queries.
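The clustering idea in the abstract above can be sketched as follows. This is a simplified illustration, not the patented procedure: it groups queries only when they led to the same clicked document (the "similar documents" case is left out), it uses a union-find structure chosen for this example, and the `(query, clicked_doc)` log format and the name `cluster_queries` are assumptions.

```python
from collections import defaultdict


def cluster_queries(click_log):
    """Group queries that led to a click on the same document.

    `click_log` is an iterable of (query, clicked_doc) pairs. The query
    text itself is never compared, so two queries with no words in
    common can still land in the same cluster.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_query_for_doc = {}
    for query, doc in click_log:
        find(query)  # register the query even if it shares no document
        if doc in first_query_for_doc:
            union(first_query_for_doc[doc], query)
        else:
            first_query_for_doc[doc] = query

    clusters = defaultdict(set)
    for query in list(parent):
        clusters[find(query)].add(query)
    return sorted(sorted(members) for members in clusters.values())


log = [
    ("nyc hotels", "h.com"),
    ("places to stay in new york", "h.com"),
    ("python tutorial", "p.org"),
]
clusters = cluster_queries(log)
```

The two hotel queries share no keywords, yet they end up in one cluster because both led to a click on the same document.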
-
Publication No.: US20060136455A1
Publication Date: 2006-06-22
Application No.: US11276308
Filing Date: 2006-02-23
Applicant: Ji-Rong Wen, Jian-Yun Nie, Ming-Jing Li, Hong-Jiang Zhang
Inventor: Ji-Rong Wen, Jian-Yun Nie, Ming-Jing Li, Hong-Jiang Zhang
IPC Classification: G06F17/00
CPC Classification: G06F17/30654, G06F17/30867, Y10S707/99932, Y10S707/99934, Y10S707/99935, Y10S707/99936, Y10S707/99937, Y10S707/99938, Y10S707/99939
Abstract: Systems and methods for clustering Web queries are described. In one aspect, one or more of a same document and a plurality of similar documents selected by a user in response to a plurality of queries is identified. Responsive to this identification, a query cluster is generated. The query cluster indicates that the queries are similar, independent of whether individual ones of the queries comprise similar composition with respect to other ones of the queries.
-
Publication No.: US20080189074A1
Publication Date: 2008-08-07
Application No.: US11672038
Filing Date: 2007-02-06
Applicant: Chin-Yew Lin, Jianfeng Gao, Guihong Cao, Jian-Yun Nie
Inventor: Chin-Yew Lin, Jianfeng Gao, Guihong Cao, Jian-Yun Nie
IPC Classification: G06F17/18
CPC Classification: G06F17/30719
Abstract: A probability distribution for a reference summary of a document is determined. The probability distribution for the reference summary is then used to generate a score for a machine-generated summary of the document.