Patent Application
US20130238611A1 Automatically Mining Patterns for Rule Based Data Standardization Systems (under examination, published)

Abstract:
Methods, computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.
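The three steps named in the abstract (find N frequent sub-patterns, extract them, cluster them by a distance value D) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the application: sub-patterns are treated as contiguous runs of whitespace-separated tokens, distance is token-level edit distance, and the grouping is a simple greedy pass in which a sub-pattern joins a group only if it is within D of every member, so the number of groups K falls out of D rather than being fixed in advance. The function names are hypothetical.

```python
from collections import Counter


def sub_patterns(tokens, min_len=2, max_len=3):
    """Yield every contiguous token run of length min_len..max_len (assumed definition)."""
    for n in range(min_len, max_len + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def top_n_sub_patterns(records, n, min_len=2, max_len=3):
    """Count sub-patterns over all records and return the n most frequent."""
    counts = Counter()
    for record in records:
        counts.update(sub_patterns(record.split(), min_len, max_len))
    return [pattern for pattern, _ in counts.most_common(n)]


def edit_distance(a, b):
    """Token-level Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def cluster(patterns, d):
    """Greedy grouping: a pattern joins the first group whose every member is within d."""
    groups = []
    for p in patterns:
        for g in groups:
            if all(edit_distance(p, q) <= d for q in g):
                g.append(p)
                break
        else:
            # No existing group is close enough, so start a new one.
            groups.append([p])
    return groups
```

A call such as `cluster(top_n_sub_patterns(records, n=50), d=1)` would then return a list of groups, each a list of token tuples. A real system would likely operate on classified token patterns and a domain-specific distance rather than raw tokens and plain edit distance.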