- Patent title: Bulk deduplication detection
- Application number: US15052382
- Filing date: 2016-02-24
- Publication number: US10152497B2
- Publication date: 2018-12-11
- Inventors: Dai Duong Doan, Arun Kumar Jagota, Chenghung Ker, Parth Vaishnav, Danil Dvinov, Dmytro Kudriavtsev
- Applicant: salesforce.com, inc.
- Applicant address: US CA San Francisco
- Patentee: salesforce.com, inc.
- Current patentee: salesforce.com, inc.
- Current patentee address: US CA San Francisco
- Agency: Dergosits & Noah LLP
- Agent: Todd A. Noah
- Main classification: G06F17/30
- IPC classification: G06F17/30; G06F7/32
Abstract:
Some embodiments of the present invention include a system and method for removing duplicate records from a group of records in a database system. The method includes generating a first cluster of records from the group of records, generating a second cluster of records from the group of records, identifying sets of duplicate records in the first cluster of records, and identifying sets of duplicate records in the second cluster of records. The method also includes merging at least two sets of duplicate records associated with both the first cluster and the second cluster of records to form a merged set of duplicate records. The merging is performed based on the at least two sets of duplicate records having a common record. Duplicate records in the group of records may then be removed by removing duplicate records from the merged set of duplicate records.
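The flow described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. The function names, the choice of email and phone as the two clustering keys, the case-insensitive name match, and the union-find merge are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def cluster_by(records, key_fn):
    """Group record ids by a clustering key (hypothetical blocking function)."""
    clusters = defaultdict(list)
    for rec_id, rec in records.items():
        clusters[key_fn(rec)].append(rec_id)
    return list(clusters.values())


def find_duplicate_sets(cluster, records, is_duplicate):
    """Return one two-element set per matching pair inside a single cluster."""
    return [{a, b} for a, b in combinations(cluster, 2)
            if is_duplicate(records[a], records[b])]


def merge_sets_with_common_record(sets):
    """Merge duplicate sets that share at least one record (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for s in sets:
        members = sorted(s)
        root = find(members[0])
        for m in members[1:]:
            other = find(m)
            if other != root:
                parent[other] = root

    merged = defaultdict(set)
    for rec_id in list(parent):
        merged[find(rec_id)].add(rec_id)
    return list(merged.values())


def deduplicate(records, key_fns, is_duplicate):
    """Cluster once per key, find duplicate sets, merge them, keep one record per set."""
    all_sets = []
    for key_fn in key_fns:
        for cluster in cluster_by(records, key_fn):
            all_sets.extend(find_duplicate_sets(cluster, records, is_duplicate))
    merged = merge_sets_with_common_record(all_sets)
    to_remove = set()
    for group in merged:
        to_remove |= group - {min(group)}  # assumption: keep the lowest id
    kept = {rid: rec for rid, rec in records.items() if rid not in to_remove}
    return kept, merged


# Illustrative data only; not taken from the patent.
records = {
    1: {"name": "Ann Lee", "email": "ann@x.com", "phone": "555-0100"},
    2: {"name": "ann lee", "email": "ann@x.com", "phone": "555-0199"},
    3: {"name": "Ann Lee", "email": "alee@y.com", "phone": "555-0199"},
    4: {"name": "Bob Ray", "email": "bob@z.com", "phone": "555-0142"},
}


def same_name(a, b):
    return a["name"].lower() == b["name"].lower()


deduped, merged = deduplicate(
    records,
    key_fns=[lambda r: r["email"], lambda r: r["phone"]],
    is_duplicate=same_name,
)
```

In this example the email clustering yields the duplicate set {1, 2} and the phone clustering yields {2, 3}. Because both sets contain record 2, they merge into {1, 2, 3}, and only records 1 and 4 remain after removal. Neither clustering alone would have linked records 1 and 3.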
Published/granted documents
- US20170242868A1 BULK DEDUPLICATION DETECTION Publication/grant date: 2017-08-24