Invention Grant
- Patent Title: Repairing data through domain knowledge
- Application No.: US15288899
- Application Date: 2016-10-07
- Publication No.: US10127268B2
- Publication Date: 2018-11-13
- Inventor: Kris Kuppuswamy Ganjam, Yeye He, Anja Gruenheid
- Applicant: Microsoft Technology Licensing, LLC
- Applicant Address: US WA Redmond
- Assignee: Microsoft Technology Licensing, LLC
- Current Assignee: Microsoft Technology Licensing, LLC
- Current Assignee Address: US WA Redmond
- Agency: Workman Nydegger
- Main IPC: G06F17/00
- IPC: G06F17/00; G06F17/30

Abstract:
Correcting data in a dataset. A set of data tokens from a tabular data store is grouped into a plurality of different clusters based on similarity of tokens. A reference cluster is selected from among the plurality of different clusters such that the plurality of clusters includes a reference cluster and one or more other clusters. One or more tokens in the one or more other clusters are transformed. Transforming tokens is performed based on a cost of transforming tokens. The effect on the reference cluster of adding the transformed tokens to the reference cluster is determined. Using this information, a correction for a token in the dataset is identified. The data store is updated to correct the token.
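
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented method; every concrete choice here is an assumption made for illustration. Tokens are clustered by a coarse character "shape" (a hypothetical stand-in for token similarity), the largest cluster is taken as the reference cluster, and the hypothetical `RULES` table plays the role of domain knowledge, with each substitution carrying a cost. Rules are applied greedily rather than by searching for a minimum-cost transformation. The effect on the reference cluster is approximated very simply: a transformed token is accepted only if it fits the reference shape and its total cost stays within the hypothetical `max_cost` budget.

```python
from collections import defaultdict

# Hypothetical substitution rules standing in for domain knowledge:
# (old substring, new substring, cost per occurrence).
RULES = [("/", "-", 1.0), ("I", "1", 1.0), ("O", "0", 1.0)]


def shape(token):
    """Map a token to a coarse pattern: digits -> 9, letters -> A, others kept."""
    return "".join(
        "9" if ch.isdigit() else "A" if ch.isalpha() else ch for ch in token
    )


def cluster_by_shape(tokens):
    """Group tokens that share the same shape (assumed similarity measure)."""
    clusters = defaultdict(list)
    for tok in tokens:
        clusters[shape(tok)].append(tok)
    return dict(clusters)


def transform(token, target_shape, rules=RULES):
    """Greedily apply every rule; return (new_token, cost) if the result
    fits the target shape, otherwise None."""
    cost = 0.0
    for old, new, rule_cost in rules:
        occurrences = token.count(old)
        if occurrences:
            token = token.replace(old, new)
            cost += occurrences * rule_cost
    return (token, cost) if shape(token) == target_shape else None


def suggest_repairs(tokens, max_cost=2.0):
    """Return the reference shape and a mapping of original -> corrected token."""
    clusters = cluster_by_shape(tokens)
    # Assumption: the largest cluster serves as the reference cluster.
    ref_shape = max(clusters, key=lambda s: len(clusters[s]))
    repairs = {}
    for s, members in clusters.items():
        if s == ref_shape:
            continue
        for tok in members:
            result = transform(tok, ref_shape)
            # Accept only transformations that fit the reference cluster
            # and stay within the cost budget.
            if result is not None and result[1] <= max_cost:
                repairs[tok] = result[0]
    return ref_shape, repairs


column = ["2016-10-07", "2018-11-13", "2018-04-12", "2016/10/07", "20I8-04-12", "n/a"]
ref_shape, repairs = suggest_repairs(column)
# Writing the corrections back models the final "update the data store" step.
corrected = [repairs.get(tok, tok) for tok in column]
```

In this sketch the token `n/a` is left untouched, because no rule in the table turns it into something that fits the reference shape. A real system would need a richer similarity measure and cost model than the ones assumed here.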
Public/Granted literature
- US20180101561A1 Repairing Data Through Domain Knowledge Public/Granted day: 2018-04-12