SMOOTHING CLICKTHROUGH DATA FOR WEB SEARCH RANKING
    Invention application
    Status: Pending (published)

    Publication No.: US20100318531A1

    Publication date: 2010-12-16

    Application No.: US12481593

    Filing date: 2009-06-10

    IPC classification: G06F17/30 G06F15/18

    Abstract: Described is a technology for using clickthrough data (e.g., data from a query log) to learn a ranking model that may be used in online ranking of search results. Clickthrough data, which is typically sparse (because many documents are rarely or never clicked), is processed/smoothed into smoothed clickthrough streams. The processing includes determining similar queries for a document with incomplete (insufficient) clickthrough data to provide expanded clickthrough data for that document, and/or estimating at least one clickthrough feature for a document when that document has missing (e.g., no) clickthrough data. Similar queries may be determined by random walk clustering and/or session-based query analysis. Features extracted from the clickthrough streams may be used to provide a ranking model, which may then be used in online ranking of documents that are located with respect to a query.
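
    The smoothing idea in the abstract can be illustrated with a minimal sketch. It is not the method claimed in the application: the abstract names random walk clustering without giving details, so the code below assumes a simple two-step random walk (query to document to query) over a click graph as the similarity measure. The function names, the `min_sim` threshold, the toy click counts, and the mean-based fallback for documents with no clicks are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def two_step_walk(clicks):
    """Query-to-query transition probabilities after a walk q -> doc -> q'.

    `clicks` maps (query, doc) -> click count. This two-step walk is an
    assumed stand-in for the random walk clustering named in the abstract.
    """
    q_total = defaultdict(float)   # total clicks issued from each query
    d_total = defaultdict(float)   # total clicks received by each document
    by_doc = defaultdict(list)     # doc -> [(query, count), ...]
    for (q, d), c in clicks.items():
        q_total[q] += c
        d_total[d] += c
        by_doc[d].append((q, c))

    sim = defaultdict(lambda: defaultdict(float))
    for d, pairs in by_doc.items():
        for q, c in pairs:
            for q2, c2 in pairs:
                # P(d | q) * P(q2 | d)
                sim[q][q2] += (c / q_total[q]) * (c2 / d_total[d])
    return sim


def expanded_stream(doc, clicks, sim, min_sim=0.1):
    """Observed clicks for `doc` plus discounted clicks from similar queries.

    `min_sim` is a hypothetical cut-off, not a value from the source.
    """
    observed = {q: float(c) for (q, d), c in clicks.items() if d == doc}
    stream = dict(observed)
    for q, c in observed.items():
        for q2, s in sim[q].items():
            if q2 not in observed and s >= min_sim:
                stream[q2] = stream.get(q2, 0.0) + s * c
    return stream


def estimate_missing_feature(feature_by_doc):
    """Placeholder estimate for a document with no clickthrough data.

    Uses the mean over documents that do have data; the source does not
    say how the estimate is computed, so this is only an assumption.
    """
    values = list(feature_by_doc.values())
    return sum(values) / len(values) if values else 0.0


# Toy data (invented for illustration only).
clicks = {
    ("cheap flights", "d1"): 3,
    ("cheap flights", "d2"): 1,
    ("low cost airfare", "d1"): 1,
}
sim = two_step_walk(clicks)
stream_d2 = expanded_stream("d2", clicks, sim)
fallback = estimate_missing_feature({"d1": 4.0, "d2": 1.0})
```

    In this toy example, document `d2` was clicked only for "cheap flights"; the expanded stream also credits it with a discounted click count for "low cost airfare", because both queries led to clicks on `d1`.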
