Using exceptional changes in webgraph snapshots over time for internet entity marking
    1.
    Type: Patent application
    Status: In force

    Publication No.: US20070198603A1

    Publication date: 2007-08-23

    Application No.: US11350967

    Application date: 2006-02-08

    IPC classification: G06F17/30

    CPC classification: G06F17/30864

    Abstract: Techniques are provided through which “suspicious” web pages may be identified automatically. A “suspicious” web page possesses characteristics that indicate some manipulation to artificially inflate the position of the web page within ranked search results. Web pages may be represented as nodes within a graph. Links between web pages may be represented as directed edges between the nodes. “Snapshots” of the current state of a network of interlinked web pages may be automatically generated at different times. In the time interval between snapshots, the state of the network may change. By comparing an earlier snapshot to a later snapshot, such changes can be identified. Extreme changes, which are deemed to vary significantly from the normal range of expected changes, can be detected automatically. Web pages relative to which these extreme changes have occurred may be marked as suspicious web pages which may merit further investigation or action.

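The snapshot comparison described in this abstract can be illustrated with a minimal sketch. It assumes each snapshot is a list of `(source, target)` link pairs, uses the change in in-link count as the per-page measure, and treats a change more than `z_threshold` standard deviations from the mean change as "extreme". The function names, the choice of in-link count, and the z-score test are assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter
from statistics import mean, pstdev


def in_degrees(edges):
    """Count incoming links per page; edges are (source, target) pairs."""
    return Counter(target for _source, target in edges)


def suspicious_pages(earlier_edges, later_edges, z_threshold=2.0):
    """Return pages whose in-link change between snapshots is an outlier.

    The z-score rule is an assumed stand-in for "varies significantly
    from the normal range of expected changes".
    """
    before = in_degrees(earlier_edges)
    after = in_degrees(later_edges)
    # Change in in-link count for every page linked to in either snapshot.
    deltas = {page: after[page] - before[page] for page in set(before) | set(after)}
    if not deltas:
        return set()
    mu = mean(deltas.values())
    sigma = pstdev(deltas.values())
    if sigma == 0:  # every page changed by the same amount
        return set()
    return {page for page, d in deltas.items() if abs(d - mu) / sigma > z_threshold}


# Hypothetical data: ten ordinary pages, plus one page that gains 50 in-links.
earlier = [("hub", f"p{i}") for i in range(10)] + [("hub", "spam")]
later = earlier + [(f"farm{i}", "spam") for i in range(50)]
print(suspicious_pages(earlier, later))  # {'spam'}
```

A production system would likely use a more robust notion of "normal range" (for example, per-category baselines), since a single large outlier inflates the standard deviation it is measured against.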

    Using exceptional changes in webgraph snapshots over time for internet entity marking
    2.
    Type: Granted patent
    Status: In force

    Publication No.: US08429177B2

    Publication date: 2013-04-23

    Application No.: US11350967

    Application date: 2006-02-08

    IPC classification: G06F7/00

    CPC classification: G06F17/30864

    Abstract: Techniques are provided through which “suspicious” web pages may be identified automatically. A “suspicious” web page possesses characteristics that indicate some manipulation to artificially inflate the position of the web page within ranked search results. Web pages may be represented as nodes within a graph. Links between web pages may be represented as directed edges between the nodes. “Snapshots” of the current state of a network of interlinked web pages may be automatically generated at different times. In the time interval between snapshots, the state of the network may change. By comparing an earlier snapshot to a later snapshot, such changes can be identified. Extreme changes, which are deemed to vary significantly from the normal range of expected changes, can be detected automatically. Web pages relative to which these extreme changes have occurred may be marked as suspicious web pages which may merit further investigation or action.

    Consecutive crawling to identify transient links
    3.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20070226206A1

    Publication date: 2007-09-27

    Application No.: US11388681

    Application date: 2006-03-23

    IPC classification: G06F17/30

    CPC classification: G06F16/951

    Abstract: An approach is provided for identifying transient links on a Web page by crawling the Web page twice in succession, a brief interval apart, and comparing the links found in each crawl. The approach ensures that transient links are not crawled and archived, thereby saving resources for crawling valid links leading to useful information.

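The two-crawl comparison in this abstract can be sketched as below. The sketch assumes a caller-supplied `fetch(url)` function that returns the page HTML, and it treats any link present in only one of the two crawls as transient. `fetch`, `split_links`, and the interval parameter are hypothetical names introduced for illustration; the abstract does not describe the actual implementation.

```python
import time
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(value)


def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


def split_links(fetch, url, interval_seconds=0.0):
    """Crawl url twice and return (stable_links, transient_links)."""
    first = extract_links(fetch(url))
    time.sleep(interval_seconds)  # the "brief interval" between crawls
    second = extract_links(fetch(url))
    # Links seen in both crawls are stable; links seen in only one are transient.
    return first & second, first ^ second


# Hypothetical page whose ad link changes on every request.
responses = iter([
    '<a href="/about">About</a> <a href="/ad?id=1">Ad</a>',
    '<a href="/about">About</a> <a href="/ad?id=2">Ad</a>',
])
stable, transient = split_links(lambda url: next(responses), "http://example.com/")
print(sorted(stable))     # ['/about']
print(sorted(transient))  # ['/ad?id=1', '/ad?id=2']
```

Only the stable set would then be queued for further crawling and archiving.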

    Generating descriptions of matching resources based on the kind, quality, and relevance of available sources of information about the matching resources
    4.
    Type: Granted patent
    Status: In force

    Publication No.: US07406458B1

    Publication date: 2008-07-29

    Application No.: US10365273

    Application date: 2003-02-11

    IPC classification: G06F7/00 G06F17/30

    Abstract: Techniques are provided for generating descriptions of matching resources in a manner that takes into account the kind, quality, and relevance of the available sources of information about the matching resources. For example, after the search engine identifies matching resources based on the query terms, the search engine determines the kinds of available sources of information about each matching resource. For each matching resource, based on the kinds of available sources of information about the matching resource, one of a plurality of processes is selected to generate a description for the matching resource. Using the content-sensitive description generation techniques described herein, a single result set may include abstracts that were generated using several different processes, where the difference in process corresponds to a difference in the kind, quality, and relevance of the available sources of information about each matching resource.

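The per-resource selection of a description process can be illustrated roughly as follows. The source kinds (`directory_entry`, `page_text`), their priority order, and the process names are all hypothetical; the abstract says only that one of several processes is chosen according to the kinds of sources available.

```python
def snippet(text, query_terms, width=60):
    """Return a window of text around the earliest query-term hit."""
    lowered = text.lower()
    hits = [pos for pos in (lowered.find(t.lower()) for t in query_terms) if pos >= 0]
    start = max(min(hits) - width // 2, 0) if hits else 0
    return text[start:start + width].strip()


def describe(resource, query_terms):
    """Return (process_name, description) for one matching resource.

    The process is chosen from the kinds of sources the resource has.
    The kinds and their ordering are assumptions of this sketch.
    """
    sources = resource.get("sources", {})
    if "directory_entry" in sources:
        # A human-written summary is available: use it as-is.
        return "editorial", sources["directory_entry"]
    if "page_text" in sources:
        # Only the page body is available: extract a query-biased snippet.
        return "snippet", snippet(sources["page_text"], query_terms)
    # No usable source: fall back to the URL.
    return "url_only", resource["url"]


results = [
    {"url": "http://a.example/", "sources": {"directory_entry": "Example Corp home page."}},
    {"url": "http://b.example/", "sources": {"page_text": "We sell blue widgets and red gadgets."}},
    {"url": "http://c.example/", "sources": {}},
]
for r in results:
    print(describe(r, ["widgets"]))
```

As in the abstract, a single result set here ends up with descriptions produced by different processes.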

    PRODUCT NORMALIZATION
    6.
    Type: Patent application
    Status: In force

    Publication No.: US20120030205A1

    Publication date: 2012-02-02

    Application No.: US13270885

    Application date: 2011-10-11

    IPC classification: G06F17/30

    Abstract: A computer-implemented approach for organizing input listings from various sources of input listings. Input listings are organized by mapping the input listings to consolidated listings that correspond to the input listings. The mapping of the input listings is based on various techniques such as a Stock Keeping Unit item-listing-to-consolidated-listing matching technique, a name/title item-listing-to-consolidated-listing matching technique, and a model item-listing-to-consolidated-listing matching technique.

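The three matching techniques named in this abstract (SKU, name/title, model) can be sketched as a cascade that maps an input listing to a consolidated listing. The field names, the exact-match-after-normalization rule, and the order in which the techniques are tried are assumptions made for illustration; the abstract lists the techniques but does not say how they are combined.

```python
def _norm(value):
    """Lower-case and collapse whitespace so trivially different strings compare equal."""
    return " ".join(str(value).lower().split()) if value else ""


# (technique name, listing field) pairs, tried in this assumed order.
MATCHERS = [("sku", "sku"), ("title", "title"), ("model", "model")]


def match_listing(listing, consolidated):
    """Return (consolidated_id, technique) for the first technique that matches."""
    for technique, field in MATCHERS:
        key = _norm(listing.get(field))
        if not key:
            continue  # the input listing lacks this field
        for item in consolidated:
            if _norm(item.get(field)) == key:
                return item["id"], technique
    return None, None


# Hypothetical consolidated catalog and input listings from different merchants.
catalog = [
    {"id": "C1", "sku": "12345", "title": "Acme Blender 3000", "model": "B3000"},
    {"id": "C2", "sku": "67890", "title": "Acme Toaster", "model": "T100"},
]
print(match_listing({"sku": "12345", "title": "blender"}, catalog))            # ('C1', 'sku')
print(match_listing({"title": "  acme   TOASTER "}, catalog))                  # ('C2', 'title')
print(match_listing({"title": "Great blender deal", "model": "b3000"}, catalog))  # ('C1', 'model')
```

Real product data would need fuzzier title comparison than exact equality; the cascade structure is the point of the sketch.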

    Product normalization
    7.
    Type: Granted patent
    Status: In force

    Publication No.: US06853996B1

    Publication date: 2005-02-08

    Application No.: US09925218

    Application date: 2001-08-08

    IPC classification: G06F17/30

    Abstract: A computer-implemented approach for organizing input listings from various sources of input listings. Input listings are organized by mapping the input listings to consolidated listings that correspond to the input listings. The mapping of the input listings is based on various techniques such as a Stock Keeping Unit item-listing-to-consolidated-listing matching technique, a name/title item-listing-to-consolidated-listing matching technique, and a model item-listing-to-consolidated-listing matching technique.

    Product normalization
    8.
    Type: Granted patent
    Status: In force

    Publication No.: US08762361B2

    Publication date: 2014-06-24

    Application No.: US13270885

    Application date: 2011-10-11

    IPC classification: G06F17/30

    Abstract: A computer-implemented approach for organizing input listings from various sources of input listings. Input listings are organized by mapping the input listings to consolidated listings that correspond to the input listings. The mapping of the input listings is based on various techniques such as a Stock Keeping Unit item-listing-to-consolidated-listing matching technique, a name/title item-listing-to-consolidated-listing matching technique, and a model item-listing-to-consolidated-listing matching technique.

    DETECTION OF UNDESIRABLE WEB PAGES
    9.
    Type: Patent application
    Status: In force

    Publication No.: US20100094868A1

    Publication date: 2010-04-15

    Application No.: US12248267

    Application date: 2008-10-09

    IPC classification: G06F7/10 G06F17/30

    CPC classification: G06F17/30861

    Abstract: A system for detecting artificial promotion of a resource, including a search engine operative to index a set of incoming links (“inlinks”) which reference the resource, a log module coupled with the search engine and configured to store log data associated with the set of inlinks, a partitioning module coupled with the log module and operative to partition the set of inlinks into a plurality of groups of inlinks based on at least one partitioning scheme, a statistics module coupled with the partitioning module and operative to compute a statistic associated with the inlinks within each of the plurality of groups of inlinks, and a computation module coupled with the statistics module and operative to process the computed statistic associated with the inlinks of each of the plurality of groups of inlinks and compute a metric associated with the set of inlinks where the metric indicates a level of uniformity of a distribution of values of the respective computed statistics among the plurality of groups of inlinks, and where the search engine places a list of search results, generated in response to a search query, in a pattern based on the metric.

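The partition, statistic, and uniformity-metric pipeline in this abstract can be sketched as below. The sketch assumes the partitioning scheme is "group inlinks by source host", the per-group statistic is the inlink count, and the uniformity metric is normalized Shannon entropy (1.0 when inlinks are spread evenly across groups, near 0 when they are concentrated in one). All three choices are illustrative assumptions; the abstract does not name a specific scheme, statistic, or metric.

```python
import math
from collections import Counter
from urllib.parse import urlsplit


def source_host(url):
    """Assumed partitioning scheme: group inlinks by the linking page's host."""
    return urlsplit(url).hostname


def uniformity(inlink_urls, partition=source_host):
    """Normalized entropy of the per-group inlink counts, in [0, 1].

    Fewer than two groups is reported as 0.0 (a convention of this sketch).
    """
    counts = Counter(partition(url) for url in inlink_urls)
    if len(counts) < 2:
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))


# Hypothetical inlink sets for two resources.
spread = [f"http://site{i}.example/page" for i in range(4)]
concentrated = [f"http://big.example/p{i}" for i in range(97)] + spread[:3]
print(round(uniformity(spread), 3))        # 1.0
print(round(uniformity(concentrated), 3))  # about 0.12
```

How the resulting metric maps to a ranking decision is left open here, as the abstract states only that search results are placed in a pattern based on it.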

    Product normalization
    10.
    Type: Granted patent
    Status: In force

    Publication No.: US07542964B2

    Publication date: 2009-06-02

    Application No.: US11019130

    Application date: 2004-12-22

    IPC classification: G06F17/30

    Abstract: A computer-implemented approach for organizing input listings from various sources of input listings. Input listings are organized by mapping the input listings to consolidated listings that correspond to the input listings. The mapping of the input listings is based on various techniques such as a Stock Keeping Unit item-listing-to-consolidated-listing matching technique, a name/title item-listing-to-consolidated-listing matching technique, and a model item-listing-to-consolidated-listing matching technique.
