Patent Application
- Title: Anonymization of Unstructured Data
- Title (Chinese): 非结构化数据的匿名化
- Application number: US12614554
- Filing date: 2009-11-09
- Publication number: US20110113049A1
- Publication date: 2011-05-12
- Inventors: Matthew A. Davis, Daniel F. Gruhl
- Applicants: Matthew A. Davis, Daniel F. Gruhl
- Applicant address: Armonk, NY, US
- Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Current assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Current assignee address: Armonk, NY, US
- Main classification: G06F17/30
- IPC classification: G06F17/30
Abstract:
A method for anonymization of unstructured data comprises determining structured references in the unstructured data; populating a table with the structured references; anonymizing the structured references in the table using ontological analysis; and rewriting the structured references in the unstructured data with the anonymized structured references from the table to produce anonymized data. A system for anonymizing unstructured data comprises an entity spotting module configured to determine structured references in the unstructured data and populate a table with the determined structured references; an anonymization module configured to anonymize the structured references in the table using ontological analysis; and a replacement module configured to rewrite the structured references in the unstructured data with the anonymized structured references from the table to produce anonymized data.
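
The four steps named in the abstract (spot references, populate a table, anonymize the table, rewrite the text) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the toy `ONTOLOGY` dictionary, the function names, the whole-word regular-expression matching used for entity spotting, and the reading of "ontological analysis" as generalizing each term to a parent concept.

```python
import re

# Hypothetical toy ontology: each term maps to a more general parent concept.
ONTOLOGY = {
    "aspirin": "analgesic",
    "ibuprofen": "analgesic",
    "analgesic": "medication",
    "Boston": "Massachusetts",
    "Massachusetts": "United States",
}


def spot_entities(text, vocabulary):
    """Entity spotting (assumed approach): return the vocabulary terms
    that appear in the text as whole words, case-sensitively."""
    return [
        term for term in vocabulary
        if re.search(r"\b" + re.escape(term) + r"\b", text)
    ]


def anonymize_table(entities, ontology, levels=1):
    """Build the table of references and fill in an anonymized value for each
    by walking up the ontology `levels` steps (assumed generalization rule).
    A term with no parent is left unchanged."""
    table = {}
    for term in entities:
        general = term
        for _ in range(levels):
            general = ontology.get(general, general)
        table[term] = general
    return table


def rewrite(text, table):
    """Replace every spotted reference with its anonymized value in one pass,
    so a replacement is never itself replaced again."""
    if not table:
        return text
    # Longer terms come first so they win over any shorter overlapping term.
    alternatives = "|".join(
        re.escape(t) for t in sorted(table, key=len, reverse=True)
    )
    pattern = re.compile(r"\b(" + alternatives + r")\b")
    return pattern.sub(lambda m: table[m.group(1)], text)


def anonymize(text, ontology, levels=1):
    """Run the full sketch: spot, tabulate, anonymize, rewrite."""
    entities = spot_entities(text, ontology)
    table = anonymize_table(entities, ontology, levels)
    return rewrite(text, table)
```

In this sketch, calling `anonymize("Patient in Boston took aspirin.", ONTOLOGY)` generalizes each spotted term by one level, and a larger `levels` value trades detail for stronger anonymization. Keeping the table as a separate intermediate mirrors the abstract's separation between spotting, anonymizing, and rewriting.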