-
Publication No.: US11688494B2
Publication date: 2023-06-27
Application No.: US16139678
Filing date: 2018-09-24
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Ying Xue Li, Wen Sun, Jing Mei, Yi Qin Yu, Bibo Hao, Jian Min Jiang, Guo Tong Xie
CPC classification number: G16H10/60, G06F16/285, G16H15/00
Abstract: The disclosure provides a method for data instance processing. The method includes obtaining a set of data instances collected from a plurality of organizations. Each of the data instances includes at least one record formed in an organization that stores values of a plurality of attributes of the data instance. The method also includes dividing the set of data instances into groups, wherein data instances with conflicting values for the same attribute are divided into different groups. The method further includes subdividing data instances in each of the groups into clusters.
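Illustrative sketch (not taken from the patent): the abstract does not say how the groups or clusters are computed. The Python below is only one possible reading, assuming each data instance is a dict of attribute values, that a "conflict" means two instances both have a value for a checked attribute and the values differ, and that clusters are formed by exact match on one chosen attribute. The names conflicts, divide_into_groups, subdivide, and the sample attributes are hypothetical.

```python
from collections import defaultdict


def conflicts(a, b, attrs):
    """Assumed conflict test: both instances have a value for some checked
    attribute and the two values differ. Missing values never conflict."""
    return any(
        a.get(k) is not None and b.get(k) is not None and a[k] != b[k]
        for k in attrs
    )


def divide_into_groups(instances, attrs):
    """Greedy grouping (an assumption, not the patented procedure): place each
    instance in the first group that holds no conflicting member, otherwise
    start a new group."""
    groups = []
    for inst in instances:
        for group in groups:
            if not any(conflicts(inst, other, attrs) for other in group):
                group.append(inst)
                break
        else:
            groups.append([inst])
    return groups


def subdivide(group, key):
    """Stand-in for the clustering step: bucket a group's instances by the
    value of one attribute."""
    clusters = defaultdict(list)
    for inst in group:
        clusters[inst.get(key)].append(inst)
    return list(clusters.values())


# Hypothetical sample data.
instances = [
    {"sex": "F", "city": "A"},
    {"sex": "M", "city": "A"},
    {"sex": "F", "city": "B"},
]
groups = divide_into_groups(instances, attrs=("sex",))
clusters = [subdivide(g, key="city") for g in groups]
```

With this sample data, the two instances that disagree on "sex" land in different groups, and the first group is then split by "city". A real system would likely use a similarity-based clustering method in place of the exact-match bucketing shown here.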
-
Publication No.: US20190130007A1
Publication date: 2019-05-02
Application No.: US15798600
Filing date: 2017-10-31
Applicant: International Business Machines Corporation
Inventor: Bibo Hao, Jian Min Jiang, Ying Xue Li, Wen Sun, Guo Tong Xie, Yi Qin Yu
Abstract: Techniques are provided that facilitate determining, by a system operatively coupled to a processor, respective performance scores for a first set of candidate transformation scripts based on a performance criterion, wherein the candidate transformation scripts are related to extract, transform, load (ETL) processing of a new data source to a data target. Techniques are also provided that facilitate generating, by the system, a recommendation of one or more of the first set of candidate transformation scripts based on the respective performance scores for performance of the ETL processing.
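Illustrative sketch (not taken from the application): the abstract names a "performance criterion" without defining it. The Python below assumes, purely for illustration, that each candidate transformation script is a callable applied to one row, and that the criterion is the fraction of sample rows it transforms without raising an error. The names success_rate, recommend, and the sample candidates are hypothetical.

```python
def success_rate(script, sample_rows):
    """Assumed performance criterion: share of sample rows the script
    transforms without raising an exception."""
    if not sample_rows:
        return 0.0
    ok = 0
    for row in sample_rows:
        try:
            script(row)
            ok += 1
        except Exception:
            pass  # a failed row counts against the script
    return ok / len(sample_rows)


def recommend(candidates, sample_rows, top_k=1):
    """Score every candidate and return the top_k (name, score) pairs,
    highest score first."""
    scores = {
        name: success_rate(fn, sample_rows) for name, fn in candidates.items()
    }
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]


# Hypothetical candidates and sample rows from a new data source.
candidates = {
    "to_int": lambda r: int(r["age"]),
    "to_float": lambda r: float(r["age"]),
}
rows = [{"age": "3"}, {"age": "4.5"}]
best = recommend(candidates, rows, top_k=1)
```

Here "to_float" handles both sample rows while "to_int" fails on "4.5", so "to_float" is the one recommended. Other criteria, such as run time or output quality against the data target, could be swapped in for success_rate.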
-
Publication No.: US20200098453A1
Publication date: 2020-03-26
Application No.: US16139678
Filing date: 2018-09-24
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Ying Xue Li, Wen Sun, Jing Mei, Yi Qin Yu, Bibo Hao, Jian Min Jiang, Guo Tong Xie
Abstract: The disclosure provides a method for data instance processing. The method includes obtaining a set of data instances collected from a plurality of organizations. Each of the data instances includes at least one record formed in an organization that stores values of a plurality of attributes of the data instance. The method also includes dividing the set of data instances into groups, wherein data instances with conflicting values for the same attribute are divided into different groups. The method further includes subdividing data instances in each of the groups into clusters.
-