SYSTEMS AND METHODS FOR DETERMINING DOCUMENT FRESHNESS
    3.
    Patent application
    Legal status: active

    Publication No.: US20120089619A1

    Publication date: 2012-04-12

    Application No.: US13329938

    Filing date: 2011-12-19

    Applicant: Monika Henzinger

    Inventor: Monika Henzinger

    IPC classification: G06F17/30

    CPC classification: G06F17/30864

    Abstract: A system determines a freshness of a first document. The system determines whether a freshness attribute is associated with the first document. The system identifies, based on the determination, a set of second documents that each contain a link to the first document. The system assigns a freshness score to the first document based on a freshness attribute associated with each document of the set of second documents or the freshness attribute associated with the first document.
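
The flow in this abstract can be illustrated with a minimal sketch. It is one possible reading, not the patented method: the `Document` class, the use of a last-modified date as the freshness attribute, the exponential decay with a 30-day half-life, and the averaging over linking documents are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Document:
    """Hypothetical document record; `last_modified` stands in for the freshness attribute."""
    url: str
    last_modified: Optional[date] = None


def freshness_score(doc: Document, inlinks: List[Document], today: date,
                    half_life_days: float = 30.0) -> Optional[float]:
    """Return a score in (0, 1], or None when no freshness attribute is available.

    If the document carries its own attribute, score it directly. Otherwise
    fall back to the documents that link to it (the "second documents").
    """
    if doc.last_modified is not None:
        dates = [doc.last_modified]
    else:
        dates = [d.last_modified for d in inlinks if d.last_modified is not None]
    if not dates:
        return None
    # Assumed scoring rule: the score halves every `half_life_days` days of age.
    scores = [0.5 ** (max(0, (today - d).days) / half_life_days) for d in dates]
    return sum(scores) / len(scores)


today = date(2012, 4, 12)
target = Document("http://example.com/page")  # no attribute of its own
linking = [
    Document("http://example.com/a", date(2012, 3, 13)),  # 30 days old
    Document("http://example.com/b", date(2012, 4, 12)),  # modified today
]
print(freshness_score(target, linking, today))
```

Here the target has no attribute of its own, so its score is derived from the two linking documents.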

    Systems and methods for determining document freshness
    7.
    Granted patent
    Legal status: active

    Publication No.: US08082244B2

    Publication date: 2011-12-20

    Application No.: US12854727

    Filing date: 2010-08-11

    Applicant: Monika Henzinger

    Inventor: Monika Henzinger

    IPC classification: G06F17/30

    CPC classification: G06F17/30864

    Abstract: A system determines a freshness of a first document. The system determines whether a freshness attribute is associated with the first document. The system identifies, based on the determination, a set of second documents that each contain a link to the first document. The system assigns a freshness score to the first document based on a freshness attribute associated with each document of the set of second documents or the freshness attribute associated with the first document.

    Identification of web sites that contain session identifiers
    8.
    Granted patent
    Legal status: active

    Publication No.: US07886217B1

    Publication date: 2011-02-08

    Application No.: US10672248

    Filing date: 2003-09-29

    IPC classification: G06F17/00

    CPC classification: G06F17/30899, G06F17/3089

    Abstract: Web sites are analyzed to determine whether the web sites are embedding session identifiers in web documents. The analysis is based on a comparison of in-host links of multiple copies of a document from a web site. Rules governing the insertion of session identifiers for the web site may be determined and used to assist in crawling the web site.
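
The comparison described in this abstract might be sketched as follows. This is an illustrative assumption rather than the patented procedure: it treats session identifiers as query parameters, pairs links by their position in two fetched copies of the same page, and uses hypothetical helper names (`in_host_links`, `candidate_session_params`, `strip_params`).

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def in_host_links(links, host):
    """Keep only absolute links that point back to the same host."""
    return [link for link in links if urlsplit(link).hostname == host]


def candidate_session_params(copy_a, copy_b, host):
    """Compare in-host links from two copies of one document.

    A query parameter whose value differs between the copies, while the
    link path stays the same, is flagged as a likely session identifier.
    """
    candidates = set()
    pairs = zip(in_host_links(copy_a, host), in_host_links(copy_b, host))
    for a, b in pairs:
        ua, ub = urlsplit(a), urlsplit(b)
        if ua.path != ub.path:
            continue  # links do not line up; skip this pair
        qa, qb = dict(parse_qsl(ua.query)), dict(parse_qsl(ub.query))
        for name in qa.keys() & qb.keys():
            if qa[name] != qb[name]:
                candidates.add(name)
    return candidates


def strip_params(url, names):
    """Apply the derived rule: drop the flagged parameters before crawling."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in names]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Links extracted from two fetches of the same page (made-up example data).
copy_a = ["http://example.com/cart?sid=abc123&item=5", "http://other.org/x?sid=1"]
copy_b = ["http://example.com/cart?sid=zzz999&item=5", "http://other.org/x?sid=2"]

params = candidate_session_params(copy_a, copy_b, "example.com")
print(params)
print(strip_params(copy_a[0], params))
```

Stripping the flagged parameter gives the crawler a canonical URL, so the same page is not fetched repeatedly under different session identifiers.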
