Invention Application
- Patent Title: Repairing Data Through Domain Knowledge
- Application No.: US15288899
- Application Date: 2016-10-07
- Publication No.: US20180101561A1
- Publication Date: 2018-04-12
- Inventor: Kris Kuppuswamy Ganjam, Yeye He, Anja Gruenheid
- Applicant: Microsoft Technology Licensing, LLC
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
Correcting data in a dataset. A set of data tokens from a tabular data store is grouped into a plurality of different clusters based on similarity of tokens. A reference cluster is selected from among the plurality of different clusters, such that the plurality of clusters includes the reference cluster and one or more other clusters. One or more tokens in the one or more other clusters are transformed, and the transformation is performed based on a cost of transforming tokens. The effect on the reference cluster of adding the transformed tokens to the reference cluster is determined. Using this information, a correction for a token in the dataset is identified. The data store is updated to correct the token.
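
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed method: the abstract does not specify the similarity measure, the transformations, or the costs, so everything here is an assumption made for illustration. Tokens are clustered by a coarse character pattern (the hypothetical `shape` function), the largest cluster is taken as the reference cluster, and the cost model in `candidate_edits` (separator swap costs 1, character deletion costs 2) is invented. The "effect on the reference cluster" is approximated by a simple check that the transformed token fits the reference pattern, so adding it would not make the cluster less uniform.

```python
from collections import defaultdict


def shape(token):
    """Coarse pattern of a token: digits -> 'd', letters -> 'a', others kept."""
    return "".join(
        "d" if c.isdigit() else "a" if c.isalpha() else c for c in token
    )


def cluster_by_shape(tokens):
    """Group tokens whose patterns are identical (assumed similarity measure)."""
    clusters = defaultdict(list)
    for token in tokens:
        clusters[shape(token)].append(token)
    return dict(clusters)


def candidate_edits(token):
    """Yield (transformed_token, cost) pairs under a hypothetical cost model."""
    separators = "/.-"
    # Swapping one separator style for another is treated as cheap (cost 1).
    for old in separators:
        if old in token:
            for new in separators:
                if new != old:
                    yield token.replace(old, new), 1
    # Deleting a single character is treated as more expensive (cost 2).
    for i in range(len(token)):
        yield token[:i] + token[i + 1:], 2


def suggest_repairs(tokens):
    """Map each outlier token to its cheapest edit that fits the reference cluster."""
    clusters = cluster_by_shape(tokens)
    # Assumption: the largest cluster is the reference cluster.
    ref_shape = max(clusters, key=lambda s: len(clusters[s]))
    repairs = {}
    for pattern, members in clusters.items():
        if pattern == ref_shape:
            continue
        for token in members:
            # Keep only edits whose result would fit the reference cluster.
            fits = [
                (cost, candidate)
                for candidate, cost in candidate_edits(token)
                if shape(candidate) == ref_shape
            ]
            if fits:
                repairs[token] = min(fits)[1]
    return repairs


column = ["2016-10-07", "2018-04-12", "2018-11-13", "2017/01/05", "2015-03-222", "n/a"]
repairs = suggest_repairs(column)
```

In this example column, `2017/01/05` and `2015-03-222` each receive a suggested correction, while `n/a` receives none because no single edit in the assumed cost model brings it into the reference pattern. A real system would then write the accepted corrections back to the data store.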
Public/Granted literature
- US10127268B2: Repairing data through domain knowledge (Public/Granted day: 2018-11-13)