Distributed crawling of hyperlinked documents
    1.
    Type: granted invention patent
    Status: in force

    Publication number: US08812478B1

    Publication date: 2014-08-19

    Application number: US13608598

    Filing date: 2012-09-10

    IPC classification: G06F17/30

    CPC classification: G06F17/30864

    Abstract: Techniques for crawling hyperlinked documents are provided. Hyperlinked documents to be crawled are grouped by host and the host to be crawled next is selected according to a stall time of the host. The stall time can indicate the earliest time that the host should be crawled and the stall times can be a predetermined amount of time, vary by host and be adjusted according to actual retrieval times from the host.
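
    The selection rule in this abstract can be illustrated with a small sketch. It is not taken from the patent: the class name HostScheduler, the parameters default_stall and retrieval_factor, and the rule "stall for the larger of a fixed delay and a multiple of the last retrieval time" are all assumptions made for illustration. URLs are grouped into per-host queues, and a min-heap keyed on stall time yields the host that may be crawled earliest.

```python
import heapq
from collections import deque


class HostScheduler:
    """Hypothetical sketch: choose the next host by earliest stall time."""

    def __init__(self, default_stall=1.0, retrieval_factor=10.0):
        self.default_stall = default_stall        # assumed fixed delay, in seconds
        self.retrieval_factor = retrieval_factor  # assumed multiplier on fetch time
        self.queues = {}     # host -> deque of URLs waiting to be crawled
        self.stall = {}      # host -> earliest time the host may be crawled again
        self.heap = []       # (stall_time, host) for hosts ready to be selected
        self.active = set()  # hosts that are in the heap or currently being fetched

    def add_url(self, host, url, now=0.0):
        """Group the URL under its host and schedule the host if it is idle."""
        self.queues.setdefault(host, deque()).append(url)
        if host not in self.active:
            ready_at = max(now, self.stall.get(host, now))
            heapq.heappush(self.heap, (ready_at, host))
            self.active.add(host)

    def next_url(self, now):
        """Return (host, url) for the earliest-stalled host, or None if all are stalled."""
        if not self.heap or self.heap[0][0] > now:
            return None
        _, host = heapq.heappop(self.heap)
        return host, self.queues[host].popleft()

    def record_fetch(self, host, now, retrieval_seconds):
        """Adjust the host's stall time using the observed retrieval time."""
        delay = max(self.default_stall, self.retrieval_factor * retrieval_seconds)
        self.stall[host] = now + delay
        if self.queues[host]:
            heapq.heappush(self.heap, (self.stall[host], host))
        else:
            self.active.discard(host)  # nothing pending; add_url will reschedule


scheduler = HostScheduler()
scheduler.add_url("a.example", "http://a.example/1")
scheduler.add_url("a.example", "http://a.example/2")
scheduler.add_url("b.example", "http://b.example/1")
first = scheduler.next_url(now=0.0)
scheduler.record_fetch("a.example", now=0.0, retrieval_seconds=0.5)
second = scheduler.next_url(now=0.1)
```

    In this sketch a slow host (0.5 s per fetch) is stalled for 5 s, so the crawler moves on to b.example rather than requesting a second page from a.example immediately.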

    Distributed crawling of hyperlinked documents
    2.
    Type: granted invention patent
    Status: in force

    Publication number: US08266134B1

    Publication date: 2012-09-11

    Application number: US11923240

    Filing date: 2007-10-24

    IPC classification: G06F17/30

    CPC classification: G06F17/30864

    Abstract: Techniques for crawling hyperlinked documents are provided. Hyperlinked documents to be crawled are grouped by host and the host to be crawled next is selected according to a stall time of the host. The stall time can indicate the earliest time that the host should be crawled and the stall times can be a predetermined amount of time, vary by host and be adjusted according to actual retrieval times from the host.

    Distributed crawling of hyperlinked documents
    3.
    Type: granted invention patent
    Status: in force

    Publication number: US07305610B1

    Publication date: 2007-12-04

    Application number: US09638082

    Filing date: 2000-08-14

    IPC classification: G06F15/00 G06F17/00

    CPC classification: G06F17/30864

    Abstract: Techniques for crawling hyperlinked documents are provided. Hyperlinked documents to be crawled are grouped by host and the host to be crawled next is selected according to a stall time of the host. The stall time can indicate the earliest time that the host should be crawled and the stall times can be a predetermined amount of time, vary by host and be adjusted according to actual retrieval times from the host.

    Systems and methods for modifying the order of links presented in a document
    4.
    Type: granted invention patent
    Status: in force

    Publication number: US08522128B1

    Publication date: 2013-08-27

    Application number: US13250978

    Filing date: 2011-09-30

    IPC classification: G06F17/21

    Abstract: A system modifies documents to aid users in determining which of the entries in the documents to choose. The system identifies a document that includes one or more entries. The system determines a score for each of the entries and modifies the identified document, or entries in the identified document, based on the determined scores. The system then provides the modified document to a user.
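
    As a rough illustration of the score-then-modify flow described in this abstract, the sketch below reorders a document's entries by score. The abstract does not say how scores are computed; the click-count scoring, the function name modify_document, and the sample data are assumptions introduced only for the example.

```python
def modify_document(entries, score_fn):
    """Hypothetical sketch: score each entry, then reorder by descending score."""
    scored = [(score_fn(entry), entry) for entry in entries]
    # Python's sort is stable, so entries with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [{"entry": entry, "score": score} for score, entry in scored]


# Assumed scoring signal: how often each link was chosen before.
clicks = {"/news": 3, "/docs": 10, "/about": 3}
modified = modify_document(["/news", "/docs", "/about"], lambda link: clicks[link])
ordered_links = [item["entry"] for item in modified]
```

    Reordering is only one possible modification; the same scores could instead drive highlighting or annotation of entries, which the abstract also appears to allow.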

    Scoring links in a document
    5.
    Type: granted invention patent
    Status: in force

    Publication number: US08127220B1

    Publication date: 2012-02-28

    Application number: US09734883

    Filing date: 2000-12-13

    IPC classification: G06F17/21

    Abstract: A system modifies documents to aid users in determining which of the entries in the documents to choose. The system identifies a document that includes one or more entries. The system determines a score for each of the entries and modifies the identified document, or entries in the identified document, based on the determined scores. The system then provides the modified document to a user.

    Modifying a source code file to reduce dependencies included therein
    6.
    Type: granted invention patent
    Status: in force

    Publication number: US08677314B1

    Publication date: 2014-03-18

    Application number: US13213036

    Filing date: 2011-08-18

    IPC classification: G06F9/44

    CPC classification: G06F8/443 G06F8/51

    Abstract: A system and machine-implemented method modifying a source code file to reduce dependencies included therein. The source code file is parsed to identify a symbol within the source code file, and one or more header files are identified, each of which is capable of resolving the symbol for the source code file. A header file is selected from the one or more header files for inclusion in the source code file, based on a predetermined set of rules. The source code file is modified to include the selected header file.
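
    The parse, identify, select, and modify steps in this abstract can be sketched as follows. The abstract does not disclose the "predetermined set of rules", so the rule used here (prefer the header that declares the fewest symbols, breaking ties alphabetically) is an assumption, as are the header_index mapping, the regex-based "parser", and all function names.

```python
import re


def candidate_headers(symbol, header_index):
    """Headers that can resolve the symbol; header_index maps header -> declared symbols."""
    return sorted(h for h, symbols in header_index.items() if symbol in symbols)


def select_header(candidates, header_index):
    # Assumed rule: the header declaring the fewest symbols pulls in the least.
    return min(candidates, key=lambda h: (len(header_index[h]), h))


def add_include(source, header):
    """Prepend an #include line unless the source already has it."""
    line = f'#include "{header}"'
    if line in source.splitlines():
        return source
    return line + "\n" + source


def resolve_symbols(source, header_index):
    """Very rough stand-in for parsing: treat every identifier as a candidate symbol."""
    identifiers = set(re.findall(r"[A-Za-z_]\w*", source))
    for symbol in sorted(identifiers):
        candidates = candidate_headers(symbol, header_index)
        if candidates:
            source = add_include(source, select_header(candidates, header_index))
    return source


header_index = {
    "all.h": {"Widget", "Gadget", "Gizmo"},  # umbrella header (hypothetical)
    "widget.h": {"Widget"},                  # narrow header (hypothetical)
}
patched = resolve_symbols("Widget w;\n", header_index)
```

    Here both headers can resolve Widget, and the assumed rule picks widget.h over the umbrella header, which is one way a tool might reduce a file's dependencies.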
