Invention Application
US20100174686A1 Generating Equivalence Classes and Rules for Associating Content with Document Identifiers (In force)

Generating Equivalence Classes and Rules for Associating Content with Document Identifiers
Abstract:
A system for reducing the possibility of crawling duplicate document identifiers partitions a plurality of document identifiers into multiple clusters, each cluster having a cluster name and a set of document parameters. The system generates an equivalence rule for each cluster of document identifiers, the rule specifying which document parameters associated with the cluster are content-relevant. The system then groups the document identifiers of each cluster into one or more equivalence classes in accordance with the cluster's equivalence rule. Each equivalence class includes one or more document identifiers that correspond to the same document content and has a representative document identifier identifying that content.
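The flow described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not state: document identifiers are taken to be URLs, the cluster name is taken to be host plus path, document parameters are taken to be query-string parameters, the equivalence rule is supplied by hand as a set of content-relevant parameter names (the abstract does not say how the rule is derived), and the representative is chosen as the shortest identifier in the class. All function names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit, parse_qsl


def cluster_name(url):
    """Assumed cluster name: host plus path, ignoring the query string."""
    parts = urlsplit(url)
    return f"{parts.netloc}{parts.path}"


def partition(urls):
    """Partition document identifiers (here: URLs) into named clusters."""
    clusters = defaultdict(list)
    for url in urls:
        clusters[cluster_name(url)].append(url)
    return dict(clusters)


def equivalence_classes(cluster_urls, relevant_params):
    """Group one cluster's URLs into equivalence classes.

    `relevant_params` plays the role of the equivalence rule: the set of
    parameter names assumed to be content-relevant. URLs that agree on
    all of those parameters land in the same class. The result maps each
    class's representative URL to the members of the class.
    """
    classes = defaultdict(list)
    for url in cluster_urls:
        params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
        # Keep only the content-relevant parameters, in a canonical order.
        key = tuple(sorted((k, v) for k, v in params if k in relevant_params))
        classes[key].append(url)
    # Assumed representative: shortest URL, ties broken alphabetically.
    return {
        min(members, key=lambda u: (len(u), u)): members
        for members in classes.values()
    }
```

Under these assumptions, a crawler would fetch only the representative of each class. For a cluster such as `example.com/item` with the rule `{"id"}`, two URLs that differ only in a `session` parameter fall into one class and only one of them needs to be crawled.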