Invention Application
- Patent Title: Generating Equivalence Classes and Rules for Associating Content with Document Identifiers
- Patent Title (Chinese, translated): Generating Equivalence Classes and Rules Associated with Document Identifiers
- Application No.: US12725381
- Application Date: 2010-03-16
- Publication No.: US20100174686A1
- Publication Date: 2010-07-08
- Inventor: Anurag Acharya, Arvind Jain, Arup Mukherjee
- Applicant: Anurag Acharya, Arvind Jain, Arup Mukherjee
- Main IPC: G06F17/30
- IPC: G06F17/30; G06F7/00

Abstract:
A system for reducing the possibility of crawling duplicate document identifiers partitions a plurality of document identifiers into multiple clusters, each cluster having a cluster name and a set of document parameters. The system generates an equivalence rule for each cluster of document identifiers, the rule specifying which document parameters associated with the cluster are content-relevant. Next, the system groups the document identifiers in each cluster into one or more equivalence classes in accordance with the cluster's equivalence rule; each equivalence class includes one or more document identifiers that correspond to the same document content and has a representative document identifier identifying that content.
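
The three steps in the abstract (partition, apply a rule, group into classes) can be illustrated with a minimal Python sketch. It is not the patented method. It assumes that document identifiers are URLs, that a cluster name is the host plus path, and that document parameters are query parameters. The function names, the hand-written rule `{"id"}`, and the choice of the shortest URL as representative are all hypothetical; the abstract does not say how rules are generated or how representatives are chosen.

```python
from collections import defaultdict
from urllib.parse import urlsplit, parse_qsl


def query_params(url):
    """Return the URL's query parameters as a dict (last value wins)."""
    return dict(parse_qsl(urlsplit(url).query, keep_blank_values=True))


def partition(urls):
    """Partition URLs into clusters keyed by (cluster name, parameter names).

    Assumption: cluster name = lowercased host + path.
    """
    clusters = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        name = parts.netloc.lower() + parts.path
        param_names = tuple(sorted(query_params(url)))
        clusters[(name, param_names)].append(url)
    return dict(clusters)


def equivalence_classes(cluster_urls, relevant):
    """Group one cluster's URLs by the values of its content-relevant parameters.

    `relevant` plays the role of the equivalence rule; here it is supplied
    by hand rather than generated. Returns {representative: members}.
    """
    groups = defaultdict(list)
    for url in cluster_urls:
        params = query_params(url)
        key = tuple((p, params.get(p)) for p in sorted(relevant))
        groups[key].append(url)
    # Hypothetical choice: shortest URL, ties broken lexicographically.
    return {min(m, key=lambda u: (len(u), u)): m for m in groups.values()}


urls = [
    "http://example.com/view?id=1&sid=abc",
    "http://example.com/view?id=1&sid=xyz",
    "http://example.com/view?id=2&sid=abc",
]
clusters = partition(urls)
cluster = clusters[("example.com/view", ("id", "sid"))]
classes = equivalence_classes(cluster, relevant={"id"})
for representative, members in classes.items():
    print(representative, "<-", members)
```

With `sid` treated as irrelevant to content, the first two URLs fall into one equivalence class and only its representative would need to be crawled.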
Public/Granted literature:
- US09026566B2: Generating equivalence classes and rules for associating content with document identifiers (Public/Granted day: 2015-05-05)