Patent Application
- Patent title: Technology for Web Site Crawling
- Patent title (Chinese): 网站抓取技术
- Application number: US14169115
- Filing date: 2014-01-30
- Publication number: US20140149382A1
- Publication date: 2014-05-29
- Inventors: Elizabeth A. Brodsky, Elmootazbellah N. Elnozahy, Ramakrishnan Rajamony
- Applicant: International Business Machines Corporation
- Applicant address: US NY Armonk
- Patentee: International Business Machines Corporation
- Current patentee: International Business Machines Corporation
- Current patentee address: US NY Armonk
- Main classification: G06F17/30
- IPC classification: G06F17/30
Abstract:
A page of a web site contains a reference that provides the address of a next page. The web site is crawled by a crawler program, which parses the reference from one of the web pages and sends the reference to an applet running in a browser. The browser determines the address of the next page in response to the reference and sends that address to the crawler. The crawler selects non-hypertext-link parameters from the web page of the web site server by performing a programmed action sequence, which includes selecting items from lists on the web page in a particular sequence. The crawler sends the selected parameters, together with a context arising from the particular sequence, to the applet running in the browser for use in the query to the web server for the next page referenced by the one web page.
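The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the sample page, the `PageParser`, `BrowserApplet`, and `crawl_step` names, the `ctx` query parameter, and the `example.test` host are all hypothetical, and the "applet" is simulated by a plain class that extracts the target from a script-style reference instead of executing it in a real browser. The context is assumed here to be simply the order in which the lists were visited.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

# Hypothetical page: a script-style (non-hyperlink) reference plus two lists.
PAGE = """
<a id="next" href="javascript:goto('results')">Next</a>
<select name="make"><option>Ford</option><option>Honda</option></select>
<select name="model"><option>Civic</option><option>Accord</option></select>
"""


class PageParser(HTMLParser):
    """Collects the first script-style reference and the items of each list."""

    def __init__(self):
        super().__init__()
        self.reference = None
        self.lists = {}
        self._current = None
        self._in_option = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href = attrs.get("href") or ""
        if tag == "a" and href.startswith("javascript:"):
            self.reference = self.reference or href
        elif tag == "select":
            self._current = attrs.get("name")
            self.lists[self._current] = []
        elif tag == "option":
            self._in_option = True

    def handle_endtag(self, tag):
        if tag == "option":
            self._in_option = False
        elif tag == "select":
            self._current = None

    def handle_data(self, data):
        if self._in_option and self._current is not None and data.strip():
            self.lists[self._current].append(data.strip())


class BrowserApplet:
    """Stand-in for the applet running in the browser (assumption)."""

    def __init__(self, base_url):
        self.base_url = base_url

    def resolve(self, reference, params, context):
        # A real browser would evaluate the script; this sketch only pulls
        # the quoted target out of the reference.
        target = reference.split("'")[1]
        query = urlencode(params + [("ctx", ">".join(context))])
        return f"{self.base_url}/{target}?{query}"


def crawl_step(page_html, action_sequence, applet):
    """Parse the reference, run the programmed selections, ask the applet."""
    parser = PageParser()
    parser.feed(page_html)
    params, context = [], []
    for list_name, index in action_sequence:
        params.append((list_name, parser.lists[list_name][index]))
        context.append(list_name)  # order of selection is the context
    return applet.resolve(parser.reference, params, context)


next_url = crawl_step(
    PAGE, [("make", 1), ("model", 0)], BrowserApplet("http://example.test")
)
print(next_url)
```

Passing the selection order alongside the parameters reflects the abstract's point that the query for the next page depends on the sequence of selections as well as on the values chosen.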
Published/granted documents
- US09165077B2 Technology for web site crawling (publication/grant date: 2015-10-20)