Patent application
US20140149382A1 Technology for Web Site Crawling (in force)

Technology for Web Site Crawling
Abstract:
A web site page has a reference for providing an address for a next page. The web site is crawled by a crawler program, which parses the reference from one of the web pages and sends the reference to an applet running in a browser. The address for the next page is determined by the browser responsive to the reference and is sent to the crawler. The crawler selects non-hypertext-link parameters from the web page of the web site server by performing a programmed action sequence, including selecting items from lists of the web page in a particular sequence. For the query to the web server for the next page referenced by the one web page, the crawler sends the selected parameters, together with a context arising from the particular sequence, to the applet running in the browser.
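
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: `BrowserApplet`, `Crawler`, the `goToPage('...')` reference pattern, the `ctx` query parameter, and the example URL are all hypothetical names chosen for illustration, and the applet here is a plain class standing in for code that would run inside a real browser.

```python
import re
from dataclasses import dataclass, field
from urllib.parse import urlencode, urljoin


@dataclass
class BrowserApplet:
    """Hypothetical stand-in for the applet running in a browser."""
    base_url: str

    def resolve(self, reference, params, context):
        # A real browser would evaluate the script reference. This sketch
        # only recognises one made-up pattern: javascript:goToPage('<path>').
        match = re.fullmatch(r"javascript:goToPage\('([^']+)'\)", reference)
        if match is None:
            raise ValueError(f"unsupported reference: {reference!r}")
        query = dict(params)
        # Assumption: the selection order is passed along as a "ctx" value.
        query["ctx"] = ">".join(context)
        return urljoin(self.base_url, match.group(1)) + "?" + urlencode(query)


@dataclass
class Crawler:
    """Hypothetical crawler that delegates address resolution to the applet."""
    applet: BrowserApplet
    params: dict = field(default_factory=dict)
    context: list = field(default_factory=list)

    def select(self, list_name, item):
        # One step of the programmed action sequence: pick an item from a
        # list on the page and remember the order in which lists were used.
        self.params[list_name] = item
        self.context.append(list_name)

    def parse_reference(self, html):
        # Extract a non-hyperlink (script) reference from the page markup.
        match = re.search(r'href="(javascript:[^"]+)"', html)
        return match.group(1) if match else None

    def next_address(self, html):
        reference = self.parse_reference(html)
        if reference is None:
            return None
        # Send the reference, selected parameters, and context to the applet,
        # which returns the address of the next page.
        return self.applet.resolve(reference, self.params, self.context)


if __name__ == "__main__":
    page = '<a href="javascript:goToPage(\'results\')">Next</a>'
    crawler = Crawler(BrowserApplet("http://example.com/catalog/"))
    crawler.select("country", "US")
    crawler.select("state", "TX")
    print(crawler.next_address(page))
```

The crawler never interprets the script reference itself. It only extracts the reference and forwards it, along with the selected parameters and their order, which mirrors the division of work between crawler and browser that the abstract describes.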