Patent search: ap:("David Blackman" OR "Michael Ching" OR "Stephen Dill" OR "Ivan Gonzalez" OR "Adam Marcus" OR "Daniel Meredith" OR "Linda Nguyen") AND inv:"Ivan Gonzalez", page 1

1.

Invention application
System and method for prioritizing websites during a webcrawling process (status: lapsed)

Publication No.: US20070239701A1

Publication date: 2007-10-11

Application No.: US11392856

Filing date: 2006-03-29

Applicants: David Blackman, Michael Ching, Stephen Dill, Ivan Gonzalez, Adam Marcus, Daniel Meredith, Linda Nguyen

Inventors: David Blackman, Michael Ching, Stephen Dill, Ivan Gonzalez, Adam Marcus, Daniel Meredith, Linda Nguyen

IPC classification: G06F17/30

CPC classification: G06F17/30864, Y10S707/99934, Y10S707/99935, Y10S707/99936

Abstract: A system and method for prioritizing a fetch order of web pages. The method comprises extracting, by a web crawler, a set of candidate web pages to be crawled. Each web page in the set of candidate web pages is associated with a website in a computer network. A determination is made as to whether a first website score for the website is in a website score database. The first website score is associated with web pages in the set of candidate web pages if the first website score exists in the website score database. The set of candidate web pages is prioritized with respect to an associated website score for each web page in the set of candidate web pages. Content is retrieved from the set of candidate web pages. Hyperlinks are extracted from the content. The hyperlinks are stored in a memory unit.
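
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. All names here (`site_of`, `prioritize`, `crawl_step`, `DEFAULT_SCORE`) are hypothetical, the score database is modeled as a plain dictionary, a website is identified by its host name, and higher scores are assumed to be fetched first. The abstract does not say how pages from unscored websites are handled, so the fallback score is an assumption as well.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

DEFAULT_SCORE = 0.0  # assumption: fallback for websites with no stored score


def site_of(url):
    """Map a page URL to its website (assumed here to be the host name)."""
    return urlparse(url).netloc.lower()


def prioritize(candidates, site_scores, default=DEFAULT_SCORE):
    """Order candidate URLs by their website's score, highest first."""
    scored = []
    for url in candidates:
        site = site_of(url)
        # Use the stored score only if the website is in the score database.
        score = site_scores[site] if site in site_scores else default
        scored.append((score, url))
    # sorted() is stable, so pages with equal scores keep their input order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [url for _, url in ranked]


class LinkExtractor(HTMLParser):
    """Collect href values from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl_step(candidates, site_scores, fetch):
    """Fetch pages in priority order and store the hyperlinks they contain.

    `fetch` is a caller-supplied function mapping a URL to HTML text, which
    keeps the sketch independent of any particular HTTP library.
    """
    link_store = []  # stands in for the "memory unit" in the abstract
    for url in prioritize(candidates, site_scores):
        parser = LinkExtractor()
        parser.feed(fetch(url))
        link_store.extend(parser.links)
    return link_store
```

Passing `fetch` as a parameter lets the ordering and link-extraction logic be exercised with canned HTML. A real crawler would supply a function that performs the HTTP request and would normally resolve relative links before storing them.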

