-
Publication No.: US07676553B1
Publication date: 2010-03-09
Application No.: US10750011
Filing date: 2003-12-31
IPC classification: G06F15/16
CPC classification: G06F17/30864
Abstract: A system and method facilitating incremental web crawl(s) using chunk(s) is provided. The system can be employed, for example, to facilitate a web-crawling system that crawls (e.g., continuously) the Internet for information (e.g., data) and indexes the information so that it can be used as part of a web search engine. The system facilitates incremental re-crawls and/or selective updating of information (e.g., documents) using a structure called a chunk to simplify the process of an incremental crawl. A chunk is a set of documents that can be manipulated as a set (e.g., of up to 65,536 (64K) documents). “Document” refers to a corpus of data that is stored at a particular URL (e.g., HTML, PDF, PS, PPT, XLS, and/or DOC files, etc.). A chunk is created by an indexer. The indexer can place into a chunk documents that have similar property(ies). These property(ies) include but are not limited to: average time between change and average importance. These property(ies) can be stored at the chunk level in a chunk map. The chunk map can then be employed (e.g., on a daily basis) to determine which chunk(s) should be re-crawled.
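The chunk map described in the abstract can be illustrated with a minimal sketch. It assumes a simple selection policy that the abstract does not specify: a chunk is due for re-crawl once the time since its last crawl reaches its average time between change, and due chunks are taken in order of average importance. The names `ChunkInfo`, `chunks_to_recrawl`, `days_since_crawl`, and `budget` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass

MAX_CHUNK_DOCS = 65_536  # 64K documents, the example chunk size given in the abstract


@dataclass
class ChunkInfo:
    """Chunk-level properties kept in the chunk map (field names are hypothetical)."""
    chunk_id: int
    avg_days_between_change: float  # average time between change across the chunk
    avg_importance: float           # average importance across the chunk
    days_since_crawl: float         # assumed bookkeeping field, not named in the abstract


def chunks_to_recrawl(chunk_map, budget):
    """Return up to `budget` chunk ids that look stale, most important first.

    Assumed policy: a chunk is due once the time since its last crawl
    reaches its average time between change.
    """
    due = [c for c in chunk_map if c.days_since_crawl >= c.avg_days_between_change]
    due.sort(key=lambda c: c.avg_importance, reverse=True)
    return [c.chunk_id for c in due[:budget]]


# Example daily pass over a three-entry chunk map.
chunk_map = [
    ChunkInfo(1, avg_days_between_change=1.0, avg_importance=0.9, days_since_crawl=2.0),
    ChunkInfo(2, avg_days_between_change=30.0, avg_importance=0.8, days_since_crawl=3.0),
    ChunkInfo(3, avg_days_between_change=7.0, avg_importance=0.4, days_since_crawl=10.0),
]
selected = chunks_to_recrawl(chunk_map, budget=2)
print(selected)
```

Because the decision uses only chunk-level values, a daily pass of this kind reads one map entry per chunk rather than one record per document, which is the simplification the abstract attributes to chunks.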
-
Publication No.: US07953631B1
Publication date: 2011-05-31
Application No.: US10749653
Filing date: 2003-12-31
Applicant: Kenneth A. Moss, Eric Watson, Eytan D. Seidman
Inventor: Kenneth A. Moss, Eric Watson, Eytan D. Seidman
IPC classification: G06Q30/00
CPC classification: G06Q30/02, G06F17/30867, G06Q30/0242, G06Q30/0244
Abstract: The subject invention provides for systems and methods that visually enhance paid inclusion listings to facilitate offering a clear and substantial value to paid inclusion advertisers while retaining ordering rights to keep listings relevant to users. More specifically, the systems and methods allow paid inclusion listings to be visually modified at the discretion of the advertiser, the user, and/or the search service provider (e.g., publisher of search results) to facilitate differentiation among advertisers, companies, and the like. The ordering of the enhanced paid inclusion listings is not compromised based on the number or type of enhancement selected by the paid inclusion customer. A search service provider or search result publisher (“service provider”) can offer a plurality, or at least one, of different types of enhancements to paid inclusion customers (“advertisers”) to affect the rendering of any paid inclusion listing to the user.
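The separation the abstract draws between ordering and rendering can be sketched as follows. This is an illustrative assumption, not the patented method: listings are ordered by a relevance score alone, and enhancements are applied only when a listing is rendered. The names `Listing`, `order_listings`, `render`, and the enhancement labels `"bold"` and `"logo"` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Listing:
    title: str
    relevance: float  # assumed relevance score assigned by the service provider
    enhancements: list = field(default_factory=list)  # hypothetical labels, e.g. ["bold"]


def order_listings(listings):
    """Order by relevance only; enhancements are deliberately ignored."""
    return sorted(listings, key=lambda item: item.relevance, reverse=True)


def render(listing):
    """Apply visual enhancements at render time (plain-text stand-ins for styling)."""
    text = listing.title
    if "bold" in listing.enhancements:
        text = f"**{text}**"
    if "logo" in listing.enhancements:
        text = f"[logo] {text}"
    return text


results = [
    Listing("Acme Shoes", relevance=0.6, enhancements=["bold", "logo"]),
    Listing("Best Boots", relevance=0.9),
]
page = [render(item) for item in order_listings(results)]
print(page)
```

In this sketch the more heavily enhanced listing still appears second, since the sort key never inspects `enhancements`.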
-