Invention Grant
- Patent Title: Column weight calculation for data deduplication
- Application No.: US15171200
- Application Date: 2016-06-02
- Publication No.: US10452627B2
- Publication Date: 2019-10-22
- Inventor: Namit Kabra, Yannick Saillet
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agent: Peter K. Suchecki
- Main IPC: G06F7/00
- IPC: G06F7/00; G06F16/215; G06F16/21; G06F16/174

Abstract:
A computer system with the capability to identify potentially duplicative records in a data set is provided. A computer may collect a data profile for the data set that provides descriptive information with regard to attributes of the data set. Based, at least in part, on the data profile, weights are determined for the attributes. As values of a data record are compared to values of the same respective attributes in other records, the overall likelihood of a match or duplicate, as indicated by the degree of similarity between values, is modified based on the determined weights associated with the respective attributes.
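The flow the abstract describes (profile the data set, derive per-attribute weights, then weight the per-attribute similarities when comparing two records) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the function names, the choice of completeness and distinctness as profile statistics, the weight formula (their product, normalized to sum to 1), and the use of `difflib.SequenceMatcher` as the value-similarity measure.

```python
from difflib import SequenceMatcher


def profile(records, columns):
    """Collect a simple data profile: completeness and distinctness per column.

    These two statistics are an illustrative assumption, not the patent's list.
    """
    stats = {}
    n = len(records)
    for col in columns:
        values = [r.get(col) for r in records if r.get(col) not in (None, "")]
        completeness = len(values) / n if n else 0.0
        distinctness = len(set(values)) / len(values) if values else 0.0
        stats[col] = {"completeness": completeness, "distinctness": distinctness}
    return stats


def column_weights(stats):
    """Derive normalized weights from the profile (hypothetical formula)."""
    raw = {col: s["completeness"] * s["distinctness"] for col, s in stats.items()}
    total = sum(raw.values())
    if total == 0:
        # No informative column: fall back to equal weights.
        return {col: 1.0 / len(raw) for col in raw}
    return {col: w / total for col, w in raw.items()}


def similarity(a, b):
    """Similarity of two values in [0, 1]; missing values count as no match."""
    if a in (None, "") or b in (None, ""):
        return 0.0
    return SequenceMatcher(None, str(a).lower(), str(b).lower()).ratio()


def match_score(rec_a, rec_b, weights):
    """Overall duplicate likelihood: weighted sum of per-attribute similarities."""
    return sum(
        w * similarity(rec_a.get(col), rec_b.get(col)) for col, w in weights.items()
    )


# Hypothetical sample data: "country" is identical everywhere, so it should
# contribute little evidence that two records are duplicates.
records = [
    {"name": "John Smith", "country": "US"},
    {"name": "Jon Smith", "country": "US"},
    {"name": "Maria Lopez", "country": "US"},
    {"name": "Wei Chen", "country": "US"},
]
weights = column_weights(profile(records, ["name", "country"]))
```

In this sketch a column whose values are nearly all identical receives a low weight, so agreement on it raises the overall score only slightly, while agreement on a highly distinctive column such as a name counts for more.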
Public/Granted literature
- US20170351717A1: COLUMN WEIGHT CALCULATION FOR DATA DEDUPLICATION (Public/Granted day: 2017-12-07)