METADATA SEARCH BASED ON SEMANTICS
    1.
    Patent application
    METADATA SEARCH BASED ON SEMANTICS (status: in force)

    Publication number: US20150213021A1

    Publication date: 2015-07-30

    Application number: US14167424

    Filing date: 2014-01-29

    IPC classification: G06F17/30

    Abstract: According to some embodiments, a method and an apparatus of enriching search results with metadata are provided to receive a plurality of metadata associated with an entity and storing the plurality of metadata in a repository. A search request associated with the entity is received and search results that comprise a portion of the plurality of metadata stored in the repository are determined.
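
The flow in this abstract (receive metadata for an entity, store it in a repository, then answer a search request with a portion of the stored metadata) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `MetadataRepository`, its method names, and the in-memory dictionary are hypothetical and do not come from the patent, and the sketch does not attempt the semantic matching that the title refers to.

```python
from collections import defaultdict


class MetadataRepository:
    """Hypothetical in-memory repository keyed by entity name."""

    def __init__(self):
        # entity name -> {metadata key: metadata value}
        self._store = defaultdict(dict)

    def receive(self, entity, metadata):
        """Store (or update) the metadata received for an entity."""
        self._store[entity].update(metadata)

    def search(self, entity, fields=None):
        """Return the stored metadata for an entity.

        If `fields` is given, only that portion of the metadata is returned.
        Unknown entities yield an empty result.
        """
        stored = self._store.get(entity, {})
        if fields is None:
            return dict(stored)
        return {key: value for key, value in stored.items() if key in fields}


repo = MetadataRepository()
repo.receive("customer_table", {"owner": "finance", "quality_score": 0.92})
owner_only = repo.search("customer_table", fields={"owner"})
```

Here `owner_only` holds just the requested portion of the metadata stored for `customer_table`.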

    Metadata search based on semantics
    2.
    Granted patent
    Metadata search based on semantics (status: in force)

    Publication number: US09449117B2

    Publication date: 2016-09-20

    Application number: US14167424

    Filing date: 2014-01-29

    IPC classification: G06F17/30

    Abstract: According to some embodiments, a method and an apparatus of enriching search results with metadata are provided to receive a plurality of metadata associated with an entity and storing the plurality of metadata in a repository. A search request associated with the entity is received and search results that comprise a portion of the plurality of metadata stored in the repository are determined.

    System and method for automatically suggesting rules for data stored in a table

    Publication number: US10332010B2

    Publication date: 2019-06-25

    Application number: US13770666

    Filing date: 2013-02-19

    Abstract: A method and system are presented of automatically suggesting rules for data stored in a table, with the table comprising a plurality of columns. The table is profiled to identify a content type for each of one or more of the plurality of columns. A rule knowledge base is accessed to locate rules specified for identified content types. Then, one or more of the located rules specified for identified content types are presented as suggestions. Acceptance of one or more of the suggested rules is received from a user, and the received validations are stored in the rule knowledge base. The accepted rules are applied to data for quality detection and monitoring. Embodiments are also described where columns are suggested based on a given rule.
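
As a rough illustration of the first steps in this abstract (profile each column to identify a content type, then look up rules for that type in a knowledge base and present them as suggestions), a minimal sketch might look like the following. The regular expressions, the `RULE_KNOWLEDGE_BASE` contents, and the function names are all assumptions made for the example; the patent does not specify them, and the user-acceptance step is left out.

```python
import re

# Hypothetical content-type detectors; real profiling would be far richer.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

# Hypothetical knowledge base: content type -> rule descriptions.
RULE_KNOWLEDGE_BASE = {
    "email": ["value must contain exactly one '@'"],
    "date": ["value must be a valid ISO 8601 date"],
}


def profile_column(values):
    """Guess a content type from the non-empty values of one column."""
    non_empty = [v for v in values if v]
    if non_empty and all(EMAIL_RE.match(v) for v in non_empty):
        return "email"
    if non_empty and all(DATE_RE.match(v) for v in non_empty):
        return "date"
    return "unknown"


def suggest_rules(table):
    """Map each column to the rules located for its profiled content type."""
    suggestions = {}
    for column, values in table.items():
        rules = RULE_KNOWLEDGE_BASE.get(profile_column(values), [])
        if rules:
            suggestions[column] = list(rules)
    return suggestions


table = {
    "contact": ["a@example.com", "b@example.org"],
    "joined": ["2013-02-19", "2014-08-21"],
    "notes": ["hello", "x"],
}
suggestions = suggest_rules(table)
```

Columns whose content type is not recognized (here `notes`) simply receive no suggestion.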

    Automatic rule generation
    4.
    Granted patent
    Automatic rule generation (status: in force)

    Publication number: US09152627B2

    Publication date: 2015-10-06

    Application number: US13623719

    Filing date: 2012-09-20

    IPC classification: G06F17/00, G06F17/30

    Abstract: In an example embodiment, a method of automatically generating data validation rules from data stored in a column of a table is provided. Outliers for the data are determined by analyzing a profiling statistic for the data, the profiling statistic having a type. Then it is determined if a predefined limit is exceeded, based on a quantity of the outliers determined for the data through the analysis of the profiling statistic. A data validation rule is then automatically generated based on non-outliers detected in the data through the analysis of the profiling statistic, the generated data validation rule also being based on the type of the profiling statistic. The data validation rule can then be applied to data subsequently entered for the column, causing at least a portion of the data subsequently entered for the column to be rejected.
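
One way to picture the steps in this abstract is the sketch below. It makes several assumptions that the abstract does not state: the profiling statistic is the numeric value itself, outliers are found with the common 1.5 x IQR fence, the "predefined limit" is a maximum outlier fraction, no rule is generated when that limit is exceeded, and the generated rule is a simple min/max range. The names `generate_range_rule` and `accepts` are hypothetical.

```python
import statistics


def generate_range_rule(values, max_outlier_fraction=0.1):
    """Derive a (low, high) range rule from the non-outliers of a column.

    Returns None when the number of outliers exceeds the predefined limit,
    on the assumption that the data is then too noisy to learn a rule from.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    fence_low, fence_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    inliers = [v for v in values if fence_low <= v <= fence_high]
    outlier_count = len(values) - len(inliers)
    if outlier_count > max_outlier_fraction * len(values):
        return None

    # The rule is built only from the non-outliers.
    return (min(inliers), max(inliers))


def accepts(rule, value):
    """Apply the generated rule to a subsequently entered value."""
    low, high = rule
    return low <= value <= high


column = [10, 11, 12, 13, 14, 15, 16, 17, 18, 500]
rule = generate_range_rule(column, max_outlier_fraction=0.2)
```

With this sample column, the single extreme value is treated as an outlier, so the rule covers only the remaining values and would reject a later entry such as 500.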

    System and Method for Data Quality Business Impact Analysis
    5.
    Patent application
    System and Method for Data Quality Business Impact Analysis (status: under examination, published)

    Publication number: US20140379417A1

    Publication date: 2014-12-25

    Application number: US13924152

    Filing date: 2013-06-21

    IPC classification: G06Q10/06

    CPC classification: G06Q10/0635

    Abstract: A computer implemented method of calculating a cost impact. The method includes associating cost amounts with various rules, using the rules to identify bad data, and calculating an aggregate cost of the bad data. In this manner, the Data Steward can prioritize various data quality improvement projects.
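
The calculation described in this abstract can be sketched as below, assuming a flat cost per failing record for each rule. That cost model, the rule names, and the function `aggregate_cost` are assumptions chosen for illustration; the abstract does not say how cost amounts are defined or combined.

```python
def aggregate_cost(records, rules):
    """Compute the cost of bad data per rule and in total.

    `rules` maps a rule name to (is_valid, cost_per_failure), where
    `is_valid` is a predicate over one record.
    """
    per_rule = {}
    for name, (is_valid, cost_per_failure) in rules.items():
        failures = sum(1 for record in records if not is_valid(record))
        per_rule[name] = failures * cost_per_failure
    return per_rule, sum(per_rule.values())


records = [
    {"email": "a@x.com", "zip": "12345"},
    {"email": "", "zip": "1234"},
    {"email": "b@y.org", "zip": ""},
]
rules = {
    # Hypothetical rules with hypothetical cost amounts.
    "email_present": (lambda r: bool(r["email"]), 5.0),
    "zip_5_digits": (lambda r: len(r["zip"]) == 5 and r["zip"].isdigit(), 2.0),
}
per_rule, total = aggregate_cost(records, rules)
```

Sorting `per_rule` by cost would give a simple ranking of which data quality problems to address first.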

    Automatic Rule Generation
    6.
    Patent application
    Automatic Rule Generation (status: in force)

    Publication number: US20140081931A1

    Publication date: 2014-03-20

    Application number: US13623719

    Filing date: 2012-09-20

    IPC classification: G06F17/30

    Abstract: In an example embodiment, a method of automatically generating data validation rules from data stored in a column of a table is provided. Outliers for the data are determined by analyzing a profiling statistic for the data, the profiling statistic having a type. Then it is determined if a predefined limit is exceeded, based on a quantity of the outliers determined for the data through the analysis of the profiling statistic. A data validation rule is then automatically generated based on non-outliers detected in the data through the analysis of the profiling statistic, the generated data validation rule also being based on the type of the profiling statistic. The data validation rule can then be applied to data subsequently entered for the column, causing at least a portion of the data subsequently entered for the column to be rejected.

    Just-in-time data quality assessment for best record creation

    Publication number: US11093521B2

    Publication date: 2021-08-17

    Application number: US13929475

    Filing date: 2013-06-27

    IPC classification: G06F16/27

    Abstract: Systems and methods for just-in-time data quality assessment of best records created during data migration are disclosed. A data steward includes tools for creating and editing a best record creation strategy that defines how records from multiple systems will be integrated into target systems. At design time, the data steward can generate best record creation and validation rules based on the best record creation strategy. The data steward can apply the best record creation and validation rules to a sample of matched records from multiple data sources to generate a sample set of best records. The efficacy of the best record creation rules can be evaluated by assessing the number of fields in the sample set that fail the validation rules. During review, the validation rules can be applied to edits to the best records received from a human reviewer to ensure compliance with the best record creation strategy.
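
To make the evaluation loop in this abstract more concrete, the sketch below merges a group of matched records into one best record using a per-field strategy, then counts how many fields of the result fail validation. The per-field pickers (`first_non_empty`, `longest`), the validators, and the sample data are hypothetical; the abstract does not describe how a best record creation strategy is actually expressed.

```python
def first_non_empty(values):
    """Pick the first non-empty value, or None if there is none."""
    return next((v for v in values if v), None)


def longest(values):
    """Pick the longest non-empty value, or None if there is none."""
    return max((v for v in values if v), key=len, default=None)


def create_best_record(matched_records, strategy):
    """Build one best record by applying a picker to each field."""
    return {
        field: pick([record.get(field) for record in matched_records])
        for field, pick in strategy.items()
    }


def count_failed_fields(best_records, validators):
    """Count the fields across all best records that fail validation."""
    return sum(
        1
        for record in best_records
        for field, is_valid in validators.items()
        if not is_valid(record.get(field))
    )


strategy = {"name": longest, "phone": first_non_empty}
validators = {
    "name": lambda v: bool(v),
    "phone": lambda v: bool(v) and len(v) == 8,
}

group_a = [{"name": "J. Smith", "phone": ""},
           {"name": "John Smith", "phone": "555-0100"}]
group_b = [{"name": "A. Jones", "phone": ""}]

sample = [create_best_record(g, strategy) for g in (group_a, group_b)]
failed = count_failed_fields(sample, validators)
```

A high `failed` count on the sample would suggest revising the strategy before the full migration runs.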

    Just-in-Time Data Quality Assessment for Best Record Creation
    8.
    Patent application
    Just-in-Time Data Quality Assessment for Best Record Creation (status: under examination, published)

    Publication number: US20150006491A1

    Publication date: 2015-01-01

    Application number: US13929475

    Filing date: 2013-06-27

    IPC classification: G06F17/30

    CPC classification: G06F16/27

    Abstract: Systems and methods for just-in-time data quality assessment of best records created during data migration are disclosed. A data steward includes tools for creating and editing a best record creation strategy that defines how records from multiple systems will be integrated into target systems. At design time, the data steward can generate best record creation and validation rules based on the best record creation strategy. The data steward can apply the best record creation and validation rules to a sample of matched records from multiple data sources to generate a sample set of best records. The efficacy of the best record creation rules can be evaluated by assessing the number of fields in the sample set that fail the validation rules. During review, the validation rules can be applied to edits to the best records received from a human reviewer to ensure compliance with the best record creation strategy.

    SYSTEM AND METHOD FOR AUTOMATICALLY SUGGESTING RULES FOR DATA STORED IN A TABLE
    9.
    Patent application
    SYSTEM AND METHOD FOR AUTOMATICALLY SUGGESTING RULES FOR DATA STORED IN A TABLE (status: under examination, published)

    Publication number: US20140236880A1

    Publication date: 2014-08-21

    Application number: US13770666

    Filing date: 2013-02-19

    IPC classification: G06N5/02

    CPC classification: G06N5/025

    Abstract: A method and system are presented of automatically suggesting rules for data stored in a table, with the table comprising a plurality of columns. The table is profiled to identify a content type for each of one or more of the plurality of columns. A rule knowledge base is accessed to locate rules specified for identified content types. Then, one or more of the located rules specified for identified content types are presented as suggestions. Acceptance of one or more of the suggested rules is received from a user, and the received validations are stored in the rule knowledge base. The accepted rules are applied to data for quality detection and monitoring. Embodiments are also described where columns are suggested based on a given rule.
