USER-DRIVEN INDEX SELECTION
    1.
    Patent application
    USER-DRIVEN INDEX SELECTION (status: in force)

    Publication No.: US20110161260A1

    Publication date: 2011-06-30

    Application No.: US12650285

    Filing date: 2009-12-30

    IPC classification: G06F17/30, G06F15/18

    CPC classification: G06F17/30887

    Abstract: Techniques for index building are described. Clickcounts of respective training URLs may indicate a number of times that corresponding training URLs were clicked in search engine results. A machine learning algorithm implemented on a computer computes a trained model that is then stored. The clickcounts and respective URLs are passed to the machine learning algorithm to train the model to predict probabilities based on feature vectors of URLs. An index of web pages is built for a set of URLs that identify the web pages. Feature vectors for the URLs are computed. Probabilities of the web pages of the URLs being searched in the future by users may be computed by processing the feature vectors with the trained model. The probabilities may be used to determine which of the URLs to include in the index.
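
    The pipeline the abstract outlines (train on clickcounts, compute feature vectors for candidate URLs, predict probabilities, choose which URLs to index) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the choice of logistic regression, the three URL features, the rule that a clickcount above zero is a positive example, the fixed index capacity, and all function names and sample URLs.

```python
import math
from urllib.parse import urlparse


def url_features(url):
    """Hypothetical feature vector: bias, path depth, has-query flag, scaled length."""
    parsed = urlparse(url)
    depth = len([seg for seg in parsed.path.split("/") if seg])
    return [1.0, float(depth), 1.0 if parsed.query else 0.0, len(url) / 100.0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(click_log, epochs=2000, lr=0.1):
    """Fit logistic regression by batch gradient descent.

    click_log maps training URL -> clickcount. A URL is treated as a positive
    example when it was clicked at least once (an assumed labeling rule).
    """
    rows = [(url_features(u), 1.0 if c > 0 else 0.0) for u, c in click_log.items()]
    weights = [0.0] * len(rows[0][0])
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in rows:
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x))) - y
            for i, xi in enumerate(x):
                grad[i] += err * xi
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def predict(weights, url):
    """Predicted probability that the page behind `url` will be searched for."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, url_features(url))))


def select_for_index(weights, candidate_urls, capacity):
    """Keep the `capacity` candidates with the highest predicted probability."""
    ranked = sorted(candidate_urls, key=lambda u: predict(weights, u), reverse=True)
    return ranked[:capacity]


# Made-up training data: shallow pages were clicked, deep parameterized ones were not.
CLICK_LOG = {
    "http://example.com/": 120,
    "http://example.com/news": 45,
    "http://example.org/products": 30,
    "http://example.com/a/b/c/d?session=1": 0,
    "http://example.org/x/y/z/archive?id=9": 0,
    "http://example.net/p/q/r/s/t?page=77": 0,
}

model = train(CLICK_LOG)
chosen = select_for_index(
    model,
    ["http://example.com/about", "http://example.net/a/b/c/d/e?ref=3"],
    capacity=1,
)
```

    In this sketch the stored "trained model" is just the weight list returned by `train`. A probability threshold could replace the fixed `capacity` cutoff if the index size is not set in advance.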


    User-driven index selection
    2.
    Granted patent
    User-driven index selection (status: in force)

    Publication No.: US08682811B2

    Publication date: 2014-03-25

    Application No.: US12650285

    Filing date: 2009-12-30

    IPC classification: G06F15/18

    CPC classification: G06F17/30887

    Abstract: Techniques for index building are described. Clickcounts of respective training URLs may indicate a number of times that corresponding training URLs were clicked in search engine results. A machine learning algorithm implemented on a computer computes a trained model that is then stored. The clickcounts and respective URLs are passed to the machine learning algorithm to train the model to predict probabilities based on feature vectors of URLs. An index of web pages is built for a set of URLs that identify the web pages. Feature vectors for the URLs are computed. Probabilities of the web pages of the URLs being searched in the future by users may be computed by processing the feature vectors with the trained model. The probabilities may be used to determine which of the URLs to include in the index.
