-
Publication No.: US20130226879A1
Publication date: 2013-08-29
Application No.: US13434647
Filing date: 2012-03-29
IPC classification: G06F17/30
CPC classification: G06F17/30303, G06F17/30286, G06F21/60, G06F21/6227, G06F21/6245
Abstract: A computer-implemented method for detecting a set of inconsistent data records in a database including multiple records comprises: selecting a data quality rule representing a functional dependency for the database; transforming the data quality rule into at least one rule vector with hashed components; selecting a set of attributes of the database; transforming at least one record of the database, selected on the basis of the selected attributes, into a record vector with hashed components; and computing a dot product of the rule and record vectors to generate a measure representing violation of the data quality rule by the record.
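The abstract does not say how the vectors are built, so the following Python sketch is one possible reading and not the patented construction. It assumes a single constant instance of a functional dependency (zip 10001 implies city New York), hashes each attribute/value pair to an index with SHA-256, and stores vectors sparsely as dictionaries. The names `bucket`, `rule_vector`, `record_vector`, `violation_score`, and the dimension `DIM` are hypothetical. Under this encoding a positive dot product means the record matches the left-hand side of the rule but not the right-hand side.

```python
import hashlib

DIM = 2**32  # assumed size of the hashed vector space (illustrative choice)


def bucket(attr, value, dim=DIM):
    """Map an (attribute, value) pair to a vector index using a stable hash."""
    digest = hashlib.sha256(f"{attr}={value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % dim


def rule_vector(lhs, rhs, dim=DIM):
    """Encode one rule instance lhs -> rhs, each an (attribute, value) pair.

    Assumed encoding: +1 at the hashed determinant value,
    -1 at the hashed value that the determinant should imply.
    """
    vec = {}
    for (attr, value), weight in ((lhs, 1), (rhs, -1)):
        i = bucket(attr, value, dim)
        vec[i] = vec.get(i, 0) + weight
    return vec


def record_vector(record, attrs, dim=DIM):
    """Encode the selected attributes of a record as a sparse count vector."""
    vec = {}
    for attr in attrs:
        i = bucket(attr, record[attr], dim)
        vec[i] = vec.get(i, 0) + 1
    return vec


def dot(a, b):
    """Dot product of two sparse vectors stored as {index: weight} dicts."""
    return sum(weight * b.get(i, 0) for i, weight in a.items())


def violation_score(rule, record, attrs, dim=DIM):
    """Positive score: determinant matches but the implied value does not."""
    return dot(rule, record_vector(record, attrs, dim))


rule = rule_vector(("zip", "10001"), ("city", "New York"))
records = [
    {"zip": "10001", "city": "New York"},  # satisfies the rule
    {"zip": "10001", "city": "Boston"},    # violates the rule
    {"zip": "02108", "city": "Boston"},    # rule does not apply
]
for rec in records:
    print(rec, violation_score(rule, rec, ("zip", "city")))
```

Because components are hashed, two distinct attribute/value pairs can collide on the same index and distort the score; the large `DIM` only makes that unlikely, so a real system would need to decide how to handle collisions.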
-
Publication No.: US09037550B2
Publication date: 2015-05-19
Application No.: US13434647
Filing date: 2012-03-29
CPC classification: G06F17/30303, G06F17/30286, G06F21/60, G06F21/6227, G06F21/6245
Abstract: A computer-implemented method for detecting a set of inconsistent data records in a database including multiple records comprises: selecting a data quality rule representing a functional dependency for the database; transforming the data quality rule into at least one rule vector with hashed components; selecting a set of attributes of the database; transforming at least one record of the database, selected on the basis of the selected attributes, into a record vector with hashed components; and computing a dot product of the rule and record vectors to generate a measure representing violation of the data quality rule by the record.
-