System and method for prioritizing websites during a webcrawling process
    1.
    Granted patent
    System and method for prioritizing websites during a webcrawling process (status: lapsed)

    Publication No.: US07966337B2

    Publication date: 2011-06-21

    Application No.: US12143885

    Filing date: 2008-06-23

    IPC classification: G06F17/30

    Abstract: A system and method for prioritizing a fetch order of web pages. The method comprises extracting, by a web crawler, a set of candidate web pages to be crawled. Each web page in the set of candidate web pages is associated with a website in a computer network. A determination is made as to whether a first website score for the website exists in a website score database. If it does, the first website score is associated with the web pages in the set of candidate web pages. The set of candidate web pages is prioritized with respect to the associated website score for each web page in the set. Content is retrieved from the set of candidate web pages. Hyperlinks are extracted from the content. The hyperlinks are stored in a memory unit.
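
The steps in this abstract (score lookup per website, prioritization of the candidate set, hyperlink extraction) can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `prioritize` and `extract_hyperlinks`, the in-memory dict standing in for the website score database, the use of the URL host as the "website", and the default score for websites with no stored score are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


def prioritize(candidate_urls, website_scores, default_score=0.0):
    """Order candidate pages by the score of their website, highest first.

    website_scores is a plain dict standing in for the website score
    database; default_score is an assumption for sites with no stored score.
    """
    def score(url):
        # Assumption: the "website" of a page is the host part of its URL.
        return website_scores.get(urlparse(url).netloc, default_score)

    # sorted() is stable, so pages with equal scores keep their input order.
    return sorted(candidate_urls, key=score, reverse=True)


class _LinkParser(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


def extract_hyperlinks(base_url, html):
    """Return the hyperlinks found in already-retrieved page content."""
    parser = _LinkParser(base_url)
    parser.feed(html)
    return parser.links
```

A crawler loop would fetch pages in the order returned by `prioritize`, pass each response body to `extract_hyperlinks`, and store the result as the next candidate set. Fetching itself is left out so the sketch stays free of network access.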

    System and method for prioritizing websites during a webcrawling process
    2.
    Granted patent
    System and method for prioritizing websites during a webcrawling process (status: lapsed)

    Publication No.: US07475069B2

    Publication date: 2009-01-06

    Application No.: US11392856

    Filing date: 2006-03-29

    IPC classification: G06F17/30

    Abstract: A system and method for prioritizing a fetch order of web pages. The method comprises extracting, by a web crawler, a set of candidate web pages to be crawled. Each web page in the set of candidate web pages is associated with a website in a computer network. A determination is made as to whether a first website score for the website exists in a website score database. If it does, the first website score is associated with the web pages in the set of candidate web pages. The set of candidate web pages is prioritized with respect to the associated website score for each web page in the set. Content is retrieved from the set of candidate web pages. Hyperlinks are extracted from the content. The hyperlinks are stored in a memory unit.

    SYSTEM AND METHOD FOR PRIORITIZING WEBSITES DURING A WEBCRAWLING PROCESS
    3.
    Patent application
    SYSTEM AND METHOD FOR PRIORITIZING WEBSITES DURING A WEBCRAWLING PROCESS (status: lapsed)

    Publication No.: US20080256046A1

    Publication date: 2008-10-16

    Application No.: US12143885

    Filing date: 2008-06-23

    IPC classification: G06F7/06 G06F17/30

    Abstract: A system and method for prioritizing a fetch order of web pages. The method comprises extracting, by a web crawler, a set of candidate web pages to be crawled. Each web page in the set of candidate web pages is associated with a website in a computer network. A determination is made as to whether a first website score for the website exists in a website score database. If it does, the first website score is associated with the web pages in the set of candidate web pages. The set of candidate web pages is prioritized with respect to the associated website score for each web page in the set. Content is retrieved from the set of candidate web pages. Hyperlinks are extracted from the content. The hyperlinks are stored in a memory unit.

    Software testing using shadow requests
    4.
    Granted patent
    Software testing using shadow requests (status: in force)

    Publication No.: US09058428B1

    Publication date: 2015-06-16

    Application No.: US13445562

    Filing date: 2012-04-12

    IPC classification: G06F9/44 G06F11/36

    Abstract: The techniques described herein provide software testing that may concurrently process a user request using a live version of software and a shadow request, which is based on the user request, using a shadow version of software (e.g., trial or test version, etc.). The live version of software, unlike the shadow version, is user-facing and transmits data back to the users while the shadow request does not output to the users. An allocation module may vary allocation of the shadow requests to enable a ramp up of allocations (or possibly ramp down) of the shadow version of software. The allocation module may use allocation rules to dynamically initiate the shadow request based on various factors such as load balancing, user attributes, and/or other rules or logic. Thus, not all user requests may be issued as shadow requests.
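
As a rough illustration of the shadow-request idea, the Python sketch below sends every request to a live handler and, for an adjustable share of users, also to a shadow handler whose output is only logged. The names `ShadowAllocator` and `handle_request`, the CRC32 bucketing rule, and the list used as a log are hypothetical choices, not taken from the patent. For simplicity the shadow call runs after the live call rather than concurrently.

```python
import zlib


class ShadowAllocator:
    """Decides which requests are also issued as shadow requests.

    shadow_fraction (0.0 to 1.0) can be raised or lowered over time to
    ramp the shadow version up or down. Bucketing by CRC32 of the user id
    is an illustrative allocation rule, not one named in the abstract.
    """

    def __init__(self, shadow_fraction):
        self.shadow_fraction = shadow_fraction

    def should_shadow(self, user_id):
        bucket = zlib.crc32(user_id.encode("utf-8")) % 100
        return bucket < self.shadow_fraction * 100


def handle_request(request, user_id, live, shadow, allocator, shadow_log):
    """Serve the user from the live version; optionally exercise the shadow."""
    response = live(request)  # only this result goes back to the user
    if allocator.should_shadow(user_id):
        try:
            # The shadow output is recorded for later analysis, never returned.
            shadow_log.append(shadow(request))
        except Exception as exc:
            # A failing shadow version must not affect the user-facing path.
            shadow_log.append(exc)
    return response
```

Raising `shadow_fraction` in steps (for example 0.01, 0.1, 0.5) gives the ramp-up described in the abstract. A production system would more likely dispatch the shadow call on a separate thread or queue so that it adds no latency to the live response.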

    Software testing analysis and control
    5.
    Granted patent
    Software testing analysis and control (status: in force)

    Publication No.: US09268663B1

    Publication date: 2016-02-23

    Application No.: US13445482

    Filing date: 2012-04-12

    IPC classification: G06F9/44 G06F11/34 G06F11/36

    Abstract: This disclosure is directed in part to testing of different versions of software or software components (software versions) and analyzing results of use (e.g., user interaction) of the different software versions. The techniques described herein provide software testing that varies the allocation to enable a ramp up of allocations to/from another software version. The allocation module may use allocation rules to assign requests to each software version based on various factors such as load balancing, user attributes, past user assignment, and/or other rules or logic. An analysis of the different software versions may include an analysis of system performance resulting from operation of each software version. An analysis may determine attributes of each user and then allocate the user to a software version based on at least some of the determined attributes.
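
One way to picture the allocation rules mentioned here (past user assignment, user attributes, and an adjustable share for the new version) is the Python sketch below. Everything specific in it is an assumption for illustration: the class name `VersionAllocator`, the version labels "A" and "B", the `internal_tester` attribute, the rule order, and the CRC32 bucketing.

```python
import zlib


class VersionAllocator:
    """Assigns each user to software version "A" (current) or "B" (new)."""

    def __init__(self, new_version_share, past_assignments=None):
        # Share of otherwise unassigned users sent to "B"; raise it to ramp up.
        self.new_version_share = new_version_share
        self.past_assignments = {} if past_assignments is None else past_assignments

    def allocate(self, user_id, attributes):
        # Rule 1 (past user assignment): returning users keep their version.
        if user_id in self.past_assignments:
            return self.past_assignments[user_id]
        # Rule 2 (user attribute, hypothetical): testers always see "B".
        if attributes.get("internal_tester"):
            version = "B"
        else:
            # Rule 3: deterministic bucketing against the current share.
            bucket = zlib.crc32(user_id.encode("utf-8")) % 100
            version = "B" if bucket < self.new_version_share * 100 else "A"
        self.past_assignments[user_id] = version
        return version
```

Checking past assignments first keeps each user's experience consistent while the share changes, which also keeps per-version performance and interaction data comparable over time.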

    Measuring test effects using adjusted outlier data
    6.
    Granted patent
    Measuring test effects using adjusted outlier data (status: in force)

    Publication No.: US08732528B1

    Publication date: 2014-05-20

    Application No.: US13345378

    Filing date: 2012-01-06

    IPC classification: G06F11/00

    Abstract: This disclosure is directed to measuring test effects using adjusted outlier data. Test data and control data may include some outlier data (i.e., right-side tails of distribution curves), which may bias the resultant data. The outlier data may be adjusted to reduce bias. A cutoff point is selected along the distribution of data. Data below the cutoff point is maintained and used to determine an effect of the data below the cutoff point. The effect of the data above the cutoff point may be processed as follows. Predictor data is identified from the data below, but near, the cutoff point. The predictor data may then be used to determine the effect of the outlier data that is above the cutoff point. In some embodiments, the predictor data may be weighted and combined with a weighted portion of the outlier data to determine an effect of the data above the cutoff point.
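
The abstract does not define "effect" numerically, so the following Python sketch makes several assumptions: the effect is taken to be the difference between the test mean and the control mean, the predictor data is the band `[band_start, cutoff)`, and the weighted combination is a simple convex mix controlled by `predictor_weight`. The function names are hypothetical, and each band is assumed to contain at least one test and one control value.

```python
from statistics import mean


def _mean_diff(test, control, lo, hi):
    """Test mean minus control mean, restricted to values in [lo, hi)."""
    t = [x for x in test if lo <= x < hi]
    c = [x for x in control if lo <= x < hi]
    return mean(t) - mean(c)  # raises StatisticsError if a band is empty


def adjusted_effects(test, control, cutoff, band_start, predictor_weight):
    """Return (effect below the cutoff, adjusted effect above the cutoff)."""
    inf = float("inf")
    # Data below the cutoff is kept and used as-is.
    below = _mean_diff(test, control, -inf, cutoff)
    # Predictor data: values below, but near, the cutoff.
    predictor = _mean_diff(test, control, band_start, cutoff)
    # Raw effect of the outliers (right-side tail).
    outlier = _mean_diff(test, control, cutoff, inf)
    # Weighted combination that shrinks the outlier effect toward the predictor.
    above = predictor_weight * predictor + (1 - predictor_weight) * outlier
    return below, above
```

With `predictor_weight=1.0` the tail is estimated purely from the predictor band, and with `0.0` the raw outlier effect is used unchanged. Values in between trade bias reduction against discarding real tail behaviour.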
