Abstract:
According to some embodiments, a method and an apparatus for enriching search results with metadata are provided. A plurality of metadata associated with an entity is received and stored in a repository. A search request associated with the entity is then received, and search results that comprise a portion of the plurality of metadata stored in the repository are determined.
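The flow in this abstract (store metadata for an entity, then return part of it with the search results) can be sketched as follows. This is only an illustration: the `MetadataRepository` class, its method names, and the sample fields are hypothetical and do not come from the abstract.

```python
from collections import defaultdict


class MetadataRepository:
    """Hypothetical in-memory repository keyed by entity name."""

    def __init__(self):
        self._store = defaultdict(dict)

    def store(self, entity, metadata):
        # Receive a plurality of metadata for an entity and keep it.
        self._store[entity].update(metadata)

    def search(self, entity, fields=None):
        # Return search results comprising a portion of the stored metadata.
        stored = self._store.get(entity, {})
        if fields is None:
            return dict(stored)
        return {k: v for k, v in stored.items() if k in fields}


repo = MetadataRepository()
repo.store("customer_table", {"owner": "finance", "row_count": 1200, "quality_score": 0.93})
result = repo.search("customer_table", fields=["owner", "quality_score"])
print(result)
```

Selecting a subset of fields stands in for "a portion of the plurality of metadata"; a real system would presumably choose that portion based on the search request itself.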
Abstract:
A method and system are presented for automatically suggesting rules for data stored in a table, the table comprising a plurality of columns. The table is profiled to identify a content type for each of one or more of the plurality of columns. A rule knowledge base is accessed to locate rules specified for the identified content types. One or more of the located rules are then presented as suggestions. Acceptance of one or more of the suggested rules is received from a user, and the received validations are stored in the rule knowledge base. The accepted rules are applied to data for quality detection and monitoring. Embodiments are also described in which columns are suggested based on a given rule.
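A minimal sketch of the profile-then-suggest step, under stated assumptions: content types are detected with regular expressions, a column is assigned a type when at least 80% of its non-empty values match, and the knowledge base is a plain dictionary. The patterns, the threshold, and the rule texts are all hypothetical.

```python
import re

# Hypothetical content-type detectors used for profiling.
CONTENT_TYPE_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_zip": re.compile(r"^\d{5}(-\d{4})?$"),
}

# Hypothetical rule knowledge base: content type -> rules specified for it.
RULE_KNOWLEDGE_BASE = {
    "email": ["value must contain exactly one '@'"],
    "us_zip": ["value must be 5 digits, optionally followed by -4 digits"],
}


def profile_column(values, threshold=0.8):
    """Return the first content type matched by enough non-empty values."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return None
    for content_type, pattern in CONTENT_TYPE_PATTERNS.items():
        matches = sum(1 for v in non_empty if pattern.match(v))
        if matches / len(non_empty) >= threshold:
            return content_type
    return None


def suggest_rules(table):
    """Map each column with a recognized content type to its suggested rules."""
    suggestions = {}
    for column, values in table.items():
        content_type = profile_column(values)
        if content_type in RULE_KNOWLEDGE_BASE:
            suggestions[column] = RULE_KNOWLEDGE_BASE[content_type]
    return suggestions


table = {
    "contact": ["a@x.com", "b@y.org", "c@z.net", "d@w.io", "bad"],
    "postal": ["12345", "54321-0001", "99999"],
    "notes": ["hello", "n/a"],
}
suggestions = suggest_rules(table)
print(suggestions)
```

The user-acceptance step is omitted here; in this sketch it would amount to recording which suggested rules the user confirmed back into `RULE_KNOWLEDGE_BASE`.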
Abstract:
In an example embodiment, a method of automatically generating data validation rules from data stored in a column of a table is provided. Outliers for the data are determined by analyzing a profiling statistic for the data, the profiling statistic having a type. It is then determined, based on the quantity of outliers found through the analysis of the profiling statistic, whether a predefined limit is exceeded. A data validation rule is then automatically generated based on the non-outliers detected in the data through the analysis of the profiling statistic, the generated rule also being based on the type of the profiling statistic. The data validation rule can then be applied to data subsequently entered for the column, causing at least a portion of that data to be rejected.
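One way to picture this is with a numeric column whose profiling statistic is its value range. The sketch below makes several assumptions the abstract does not state: outliers are found with the 1.5 x IQR fence, the predefined limit is a maximum outlier fraction, and no rule is generated when that limit is exceeded. The function names and the rule format are hypothetical.

```python
import statistics


def generate_range_rule(values, max_outlier_fraction=0.2):
    """Derive a min/max rule from the non-outliers of a numeric column."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    non_outliers = [v for v in values if low_fence <= v <= high_fence]
    outlier_count = len(values) - len(non_outliers)
    # Assumption: too many outliers means the statistic is not a reliable
    # basis for a rule, so none is generated.
    if outlier_count / len(values) > max_outlier_fraction:
        return None
    return {"type": "range", "min": min(non_outliers), "max": max(non_outliers)}


def validate(rule, value):
    """Accept a newly entered value only if it satisfies the generated rule."""
    return rule["min"] <= value <= rule["max"]


column = [10, 12, 11, 13, 12, 11, 10, 500]
rule = generate_range_rule(column)
print(rule)
print(validate(rule, 12), validate(rule, 40))
```

A different statistic type (for example string length or pattern frequency) would lead to a different kind of rule, which is the sense in which the generated rule depends on the type of the profiling statistic.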
Abstract:
A computer-implemented method of calculating a cost impact is provided. The method includes associating cost amounts with various rules, using the rules to identify bad data, and calculating an aggregate cost of the bad data. In this manner, a data steward can prioritize various data quality improvement projects.
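The cost calculation can be illustrated with a small sketch. The rules, their cost amounts, and the sample records are invented for illustration, and charging the cost once per failing record is an assumption.

```python
# Hypothetical rules, each with an associated cost per failing record.
rules = [
    {"name": "missing_email", "cost": 5.0, "check": lambda r: not r.get("email")},
    {"name": "bad_zip", "cost": 2.5, "check": lambda r: len(r.get("zip", "")) != 5},
]

records = [
    {"email": "a@x.com", "zip": "12345"},
    {"email": "", "zip": "1234"},
    {"email": "b@y.org", "zip": ""},
]


def cost_impact(records, rules):
    """Return the cost attributed to each rule and the aggregate cost."""
    per_rule = {rule["name"]: 0.0 for rule in rules}
    for record in records:
        for rule in rules:
            if rule["check"](record):  # the rule flags this record as bad data
                per_rule[rule["name"]] += rule["cost"]
    return per_rule, sum(per_rule.values())


per_rule, total = cost_impact(records, rules)
print(per_rule, total)
```

Keeping the per-rule breakdown alongside the total is what would let a data steward rank improvement projects by the cost each rule exposes.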
Abstract:
Systems and methods for just-in-time data quality assessment of best records created during data migration are disclosed. A data steward includes tools for creating and editing a best record creation strategy that defines how records from multiple systems will be integrated into target systems. At design time, the data steward can generate best record creation and validation rules based on the best record creation strategy. The data steward can apply the best record creation and validation rules to a sample of matched records from multiple data sources to generate a sample set of best records. The efficacy of the best record creation rules can be evaluated by assessing the number of fields in the sample set that fail the validation rules. During review, the validation rules can be applied to edits to the best records received from a human reviewer to ensure compliance with the best record creation strategy.
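As a rough sketch of the design-time evaluation, assume a strategy in which each field of the best record takes the most recently updated non-empty value among the matched records. That strategy, the integer `updated` field, and the validation rules are hypothetical; the abstract leaves them open.

```python
def create_best_record(matched, fields):
    """Build one best record from a group of matched source records."""
    ordered = sorted(matched, key=lambda r: r["updated"], reverse=True)
    best = {}
    for field in fields:
        # Take the newest non-empty value, or None if every source is empty.
        best[field] = next((r[field] for r in ordered if r.get(field)), None)
    return best


# Hypothetical validation rules derived from the same strategy.
VALIDATION_RULES = {
    "name": lambda v: bool(v),
    "phone": lambda v: v is not None and v.isdigit() and len(v) == 10,
}


def count_failures(best_records):
    """Count fields in the sample of best records that fail validation."""
    return sum(
        1
        for record in best_records
        for field, rule in VALIDATION_RULES.items()
        if not rule(record.get(field))
    )


sample_groups = [
    [
        {"name": "Ann Lee", "phone": "", "updated": 2},
        {"name": "A. Lee", "phone": "5551234567", "updated": 1},
    ],
    [{"name": "Bob", "phone": "555-0000", "updated": 1}],
]
best_records = [create_best_record(g, ["name", "phone"]) for g in sample_groups]
failures = count_failures(best_records)
print(best_records, failures)
```

The same `VALIDATION_RULES` could be run against a reviewer's edits to a best record, which corresponds to the review-time check described in the abstract.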
Abstract:
According to particular embodiments, determining paths in a network with asymmetric switches includes receiving a graph representing the network. Each asymmetric switch has defined degree connectivity between one or more pairs of degrees of the asymmetric switch. The graph is transformed to yield a transformed graph that accounts for the asymmetric switches. A routing process is applied to the transformed graph to yield one or more paths through the network.
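The abstract does not say how the graph is transformed. One common technique that fits the description is node splitting: each asymmetric switch is replaced by one node per degree, with internal edges only between degree pairs that are allowed to connect. The sketch below assumes that approach, identifies each degree by the neighbor it faces, and uses breadth-first search as the routing process. All names and the sample topology are hypothetical, and a source or destination that is itself an asymmetric switch is not handled.

```python
from collections import deque


def transform(links, asymmetric):
    """Split each asymmetric switch into per-degree nodes.

    links: undirected (u, v) pairs.
    asymmetric: {switch: set of (neighbor_a, neighbor_b) pairs whose
                 degrees are connected inside the switch}.
    """
    adj = {}

    def add_edge(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    def port(node, neighbor):
        # A degree of an asymmetric switch is named (switch, facing neighbor).
        return (node, neighbor) if node in asymmetric else node

    for u, v in links:
        add_edge(port(u, v), port(v, u))
    for switch, pairs in asymmetric.items():
        for a, b in pairs:
            add_edge((switch, a), (switch, b))
    return adj


def shortest_path(adj, src, dst):
    """Breadth-first search on the transformed graph."""
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in sorted(adj.get(path[-1], ()), key=str):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def collapse(path):
    """Map per-degree nodes back to switch names and drop repeats."""
    names = [n[0] if isinstance(n, tuple) else n for n in path]
    return [n for i, n in enumerate(names) if i == 0 or n != names[i - 1]]


links = [("A", "S"), ("S", "B"), ("S", "C"), ("A", "D"), ("D", "C")]
asymmetric = {"S": {("A", "B")}}  # inside S, only the A-side and B-side degrees connect
adj = transform(links, asymmetric)
path_to_b = collapse(shortest_path(adj, "A", "B"))
path_to_c = collapse(shortest_path(adj, "A", "C"))
print(path_to_b, path_to_c)
```

Because switch S cannot connect its A-facing degree to its C-facing degree, the route from A to C goes through D even though S is adjacent to both endpoints.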