- Patent title: DISCOVERY OF LINKAGE POINTS BETWEEN DATA SOURCES
- Application number: US16794895
- Filing date: 2020-02-19
- Publication number: US20200183995A1
- Publication date: 2020-06-11
- Inventors: Oktie Hassanzadeh, Mauricio A. Hernandez-Sherrington, Ching-Tien Ho, Lucian Popa
- Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Main classification: G06F16/9535
- IPC classification: G06F16/9535; G06F16/25; G06F16/27; G06F16/2457
Abstract:
Data records are linked across a plurality of datasets. Each dataset contains at least one data record, and each data record is associated with an entity and includes one or more attributes of that entity and a value for each attribute. Values associated with attributes are compared across datasets, and matching attributes having values that satisfy a predetermined similarity threshold are identified. In addition, linkage points between pairs of datasets are identified. Each linkage point links one or more pairs of data records. Each data record in the pair of data records is contained in one of a given pair of datasets, and each pair of data records is associated with a common entity having matching attributes in the given pair of datasets. Data records associated with the common entities are linked across datasets using the identified linkage points.
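The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not name a similarity measure, so Jaccard similarity over normalized attribute value sets is assumed here, and the function names (`find_linkage_points`, `link_records`), the threshold of 0.5, and the sample datasets are all hypothetical.

```python
from itertools import product


def normalize(value):
    """Lowercase and collapse whitespace so trivially different values compare equal."""
    return " ".join(str(value).lower().split())


def attribute_values(dataset):
    """Map each attribute name to the set of normalized values it takes in the dataset."""
    values = {}
    for record in dataset:
        for attr, value in record.items():
            values.setdefault(attr, set()).add(normalize(value))
    return values


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_linkage_points(left, right, threshold=0.5):
    """Return (left_attr, right_attr, score) for attribute pairs meeting the threshold."""
    left_vals, right_vals = attribute_values(left), attribute_values(right)
    points = []
    for (la, lv), (ra, rv) in product(left_vals.items(), right_vals.items()):
        score = jaccard(lv, rv)
        if score >= threshold:
            points.append((la, ra, score))
    # Highest-scoring attribute pairs first.
    return sorted(points, key=lambda p: -p[2])


def link_records(left, right, left_attr, right_attr):
    """Pair up record indexes whose normalized values agree on the given linkage point."""
    index = {}
    for j, record in enumerate(right):
        if right_attr in record:
            index.setdefault(normalize(record[right_attr]), []).append(j)
    return [
        (i, j)
        for i, record in enumerate(left)
        if left_attr in record
        for j in index.get(normalize(record[left_attr]), [])
    ]


# Hypothetical sample data: two datasets describing the same companies.
companies = [
    {"name": "Acme Corp", "ticker": "ACM"},
    {"name": "Globex", "ticker": "GLX"},
]
filings = [
    {"symbol": "acm", "year": 2019},
    {"symbol": "GLX", "year": 2020},
    {"symbol": "INI", "year": 2020},
]

points = find_linkage_points(companies, filings, threshold=0.5)
links = link_records(companies, filings, points[0][0], points[0][1])
```

With this sample data, the only attribute pair that meets the threshold is (`ticker`, `symbol`), and `links` pairs each company record with the filing record that carries the same ticker. Exact matching on normalized values keeps the sketch short; an approximate string comparison could be substituted in `link_records` without changing the overall flow.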
Published/granted documents:
- US11531717B2 Discovery of linkage points between data sources (publication/grant date: 2022-12-20)