Abstract:
A system and method for prioritizing a fetch order of web pages. The method comprises extracting, by a web crawler, a set of candidate web pages to be crawled. Each web page in the set of candidate web pages is associated with a website in a computer network. A determination is made as to whether a first website score for the website is in a website score database. If the first website score exists in the website score database, it is associated with the web pages in the set of candidate web pages. The set of candidate web pages is prioritized according to the website score associated with each web page in the set. Content is retrieved from the set of candidate web pages. Hyperlinks are extracted from the content. The hyperlinks are stored in a memory unit.
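The prioritization step can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function name `prioritize`, the use of the URL host as the website key, the in-memory dictionary standing in for the website score database, and the fallback score for websites missing from that database are all assumptions made for illustration.

```python
from urllib.parse import urlparse

# Assumption: the abstract does not say how pages from unscored websites
# are ranked, so this sketch gives them a fallback score.
DEFAULT_SCORE = 0.0


def prioritize(candidates, site_scores, default=DEFAULT_SCORE):
    """Return candidate URLs ordered from highest to lowest website score.

    candidates  -- iterable of page URLs to be crawled
    site_scores -- dict mapping a host name to its website score
                   (hypothetical stand-in for the website score database)
    """
    def score(url):
        host = urlparse(url).netloc  # the website the page belongs to
        return site_scores.get(host, default)

    # sorted() is stable, so pages from equally scored websites
    # keep their original relative order.
    return sorted(candidates, key=score, reverse=True)
```

A crawler built along these lines would fetch pages in the returned order, extract hyperlinks from each page, and feed them back in as new candidates.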
Abstract:
The techniques described herein provide software testing that may concurrently process a user request using a live version of software and a shadow request, which is based on the user request, using a shadow version of software (e.g., a trial or test version). The live version, unlike the shadow version, is user-facing and transmits data back to users, whereas the shadow request produces no output to users. An allocation module may vary the allocation of shadow requests to enable a ramp up (or possibly a ramp down) of allocations to the shadow version of software. The allocation module may use allocation rules to dynamically initiate the shadow request based on various factors such as load balancing, user attributes, and/or other rules or logic. Thus, not every user request need be issued as a shadow request.
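As a rough illustration of shadow allocation, the Python sketch below serves every request with the live handler and, for a configurable fraction of requests, also submits a copy to the shadow handler on a worker thread. The names `handle_request`, `shadow_fraction`, and `rng`, the dictionary request format, and the simple random allocation rule are assumptions; the abstract leaves the actual allocation rules open.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical worker pool for shadow requests; the size is arbitrary.
_shadow_pool = ThreadPoolExecutor(max_workers=4)


def handle_request(request, live, shadow, shadow_fraction, rng=random.random):
    """Serve a request with the live version and maybe shadow it.

    Returns (live_response, shadow_future). Only live_response is meant to
    reach the user; shadow_future is None when no shadow request was issued
    and otherwise holds the shadow result for offline comparison.
    """
    shadow_future = None
    # Assumed allocation rule: shadow a random fraction of requests.
    if rng() < shadow_fraction:
        # Pass a copy so the shadow version cannot mutate the live request.
        shadow_future = _shadow_pool.submit(shadow, dict(request))
    live_response = live(request)
    return live_response, shadow_future
```

Raising `shadow_fraction` over time corresponds to a ramp up of the shadow version. Because an exception raised by the shadow handler is captured in the future, it does not affect the live response.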
Abstract:
This disclosure is directed in part to testing different versions of software or software components (software versions) and analyzing the results of use (e.g., user interaction) of the different software versions. The techniques described herein provide software testing that varies the allocation of requests so that allocations to one software version can be ramped up or down relative to another. An allocation module may use allocation rules to assign requests to each software version based on various factors such as load balancing, user attributes, past user assignment, and/or other rules or logic. An analysis of the different software versions may include an analysis of system performance resulting from operation of each software version. An analysis may determine attributes of each user and then allocate the user to a software version based on at least some of the determined attributes.
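One way to picture an allocation rule that honors past user assignment while supporting a ramp up is the Python sketch below. It is only a sketch under stated assumptions: the version labels "A" and "B", the hash-based bucketing, and the in-memory `past_assignments` dictionary are hypothetical and do not come from the disclosure.

```python
import hashlib


def bucket(user_id):
    """Map a user id to a stable value in [0, 1) via SHA-256."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest[:8], 16) / 0x100000000


def allocate(user_id, ramp_fraction, past_assignments):
    """Assign a user to software version "A" or "B".

    ramp_fraction    -- share of users routed to version "B", from 0.0 to 1.0
    past_assignments -- dict of earlier assignments; a user who already has
                        a version keeps it (assumed "past user assignment" rule)
    """
    if user_id in past_assignments:
        return past_assignments[user_id]
    version = "B" if bucket(user_id) < ramp_fraction else "A"
    past_assignments[user_id] = version
    return version
```

Because the bucket value depends only on the user id, increasing `ramp_fraction` moves additional users to version "B" without reshuffling users who were already assigned.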
Abstract:
This disclosure is directed to measuring test effects using adjusted outlier data. Test data and control data may include some outlier data (i.e., right-side tails of distribution curves), which may bias the resultant data. The outlier data may be adjusted to reduce bias. A cutoff point is selected along the distribution of data. Data below the cutoff point is maintained and used to determine an effect of the data below the cutoff point. The effect of the data above the cutoff point may be processed as follows. Predictor data is identified from the data below, but near, the cutoff point. The predictor data may then be used to determine the effect of the outlier data that is above the cutoff point. In some embodiments, the predictor data may be weighted and combined with a weighted portion of the outlier data to determine an effect of the data above the cutoff point.
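The weighted combination mentioned last can be sketched in Python as follows. This is one possible reading, not the disclosed method: measuring an effect as a difference in means between test and control, defining the predictor data as a fixed-width band just below the cutoff, and blending with a single `weight` are all assumptions, as are the names `split`, `mean_diff`, and `tail_effect`.

```python
from statistics import mean


def split(values, cutoff, band):
    """Split values into (below, predictor, tail) around the cutoff.

    below     -- values at or below the cutoff
    predictor -- values below but near the cutoff, within `band` of it
    tail      -- outlier values above the cutoff
    """
    below = [v for v in values if v <= cutoff]
    predictor = [v for v in below if v >= cutoff - band]
    tail = [v for v in values if v > cutoff]
    return below, predictor, tail


def mean_diff(test, control):
    """Assumed effect measure: difference in means (test minus control)."""
    return mean(test) - mean(control)


def tail_effect(test, control, cutoff, band, weight):
    """Blend the predictor-based effect with the observed tail effect.

    weight = 1.0 relies only on the predictor data; weight = 0.0 relies
    only on the raw outlier data. Raises statistics.StatisticsError if a
    required region is empty in either sample.
    """
    _, test_pred, test_tail = split(test, cutoff, band)
    _, ctrl_pred, ctrl_tail = split(control, cutoff, band)
    predicted = mean_diff(test_pred, ctrl_pred)
    observed = mean_diff(test_tail, ctrl_tail)
    return weight * predicted + (1 - weight) * observed
```

In this reading, the effect of the data below the cutoff would be computed separately with `mean_diff` on the `below` portions, and the choice of `cutoff`, `band`, and `weight` would control how strongly extreme values influence the overall estimate.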