Invention Grant
US08311959B2 System and method for classifying data streams with very large cardinality
Expired
- Patent Title: System and method for classifying data streams with very large cardinality
- Application No.: US13400863
- Application Date: 2012-02-21
- Publication No.: US08311959B2
- Publication Date: 2012-11-13
- Inventor: Charu C Aggarwal, Philip S Yu
- Applicant: Charu C Aggarwal, Philip S Yu
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: August Law, LLC
- Agent: George Willinghan
- Main IPC: G06F15/18
- IPC: G06F15/18

Abstract:
An object and attributes that describe that object are identified. The attributes are grouped into attribute patterns, and classification classes are identified. For each identified class, a sketch table containing a plurality of parallel hash tables is created. For the object to be classified, each attribute pattern is processed using all of the hash functions for each sketch table, resulting in a plurality of values under each sketch table for a single attribute pattern. The lowest value is selected for each sketch table. The distribution of values across all sketch tables is evaluated for each attribute pattern, producing a discriminatory power for each attribute pattern. Attribute patterns having a discriminatory power above a given threshold are selected, and their associated sketch table values are summed for each sketch table. The sketch table with the largest overall sum is identified, and the associated class is assigned to the object described by the attribute patterns.
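The steps in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the names `SketchTable` and `classify`, the use of `hashlib.blake2b` as the hash family, the table dimensions, and the default threshold are all assumptions made for illustration. In particular, the abstract does not say how discriminatory power is computed; here it is assumed to be the largest class's share of the estimated counts for a pattern.

```python
import hashlib


class SketchTable:
    """One sketch per class: `depth` parallel hash tables of `width` counters.

    Hypothetical structure (count-min style); sizes are illustrative.
    """

    def __init__(self, depth=4, width=1024):
        self.depth = depth
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, pattern):
        # Derive a different hash function per row by salting with the row number.
        digest = hashlib.blake2b(
            repr(pattern).encode("utf-8"), salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest[:8], "little") % self.width

    def add(self, pattern, count=1):
        # Training step: increment one counter in every parallel hash table.
        for row in range(self.depth):
            self.rows[row][self._index(row, pattern)] += count

    def estimate(self, pattern):
        # Apply all hash functions and keep the lowest value, as in the abstract.
        return min(
            self.rows[row][self._index(row, pattern)] for row in range(self.depth)
        )


def classify(patterns, sketches, threshold=0.6):
    """Return the class label whose sketch has the largest overall sum.

    `patterns` is an iterable of attribute patterns (tuples of attributes);
    `sketches` maps each class label to its SketchTable.
    """
    totals = {label: 0 for label in sketches}
    for pattern in patterns:
        estimates = {label: s.estimate(pattern) for label, s in sketches.items()}
        total = sum(estimates.values())
        if total == 0:
            continue  # pattern never seen in training
        # Assumed measure: share of the dominant class among all classes.
        power = max(estimates.values()) / total
        if power > threshold:
            for label, value in estimates.items():
                totals[label] += value
    return max(totals, key=totals.get)


# Usage with made-up training data for two hypothetical classes.
sketches = {"spam": SketchTable(), "ham": SketchTable()}
sketches["spam"].add(("free", "offer"), 50)
sketches["ham"].add(("meeting", "agenda"), 50)
sketches["spam"].add(("the",), 50)
sketches["ham"].add(("the",), 50)
label = classify([("free", "offer"), ("the",)], sketches)
```

In this usage, the pattern `("the",)` is equally frequent in both classes, so under the assumed measure it falls below the threshold and does not contribute to either sum. Raw counts are used here for simplicity; normalizing by class size is another plausible choice the abstract leaves open.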
Public/Granted literature
- US20120166382A1 System and Method for Classifying Data Streams with Very Large Cardinality (Public/Granted day: 2012-06-28)