Patent search ap:("Microsoft Technology Licensing Page LLC") AND inv:"Kris Kuppuswamy Ganjam"

1.

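Granted invention patent
Repairing data through domain knowledge (Active)

Publication (announcement) number: US10127268B2

Publication (announcement) date: 2018-11-13

Application number: US15288899

Filing date: 2016-10-07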

Applicant: Microsoft Technology Licensing, LLC

Inventor: Kris Kuppuswamy Ganjam, Yeye He, Anja Gruenheid

IPC: G06F17/00, G06F17/30

Abstract: Correcting data in a dataset. A set of data tokens from a tabular data store are grouped into a plurality of different clusters based on similarity of tokens. A reference cluster is selected from among the plurality of different clusters such that the plurality of clusters includes a reference cluster and one or more other clusters. One or more tokens in the one or more other clusters are transformed. Transforming tokens is performed based on a cost of transforming tokens. The effect on the reference cluster of adding the transformed tokens to the reference cluster is determined. Using this information, a correction for a token in the dataset is identified. The data store is updated to correct the token.
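
The flow described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method. It assumes that "similarity" means sharing a coarse character shape, that the reference cluster is simply the largest one, and that the "effect on the reference cluster" is whether the transformed token fits that cluster's shape. The names shape, TRANSFORMS, edit_cost, and propose_corrections are hypothetical and do not appear in the source.

```python
from collections import defaultdict


def shape(token: str) -> str:
    """Coarse pattern of a token: digits -> '9', letters -> 'a', others kept."""
    return "".join(
        "9" if c.isdigit() else "a" if c.isalpha() else c for c in token
    )


def cluster_by_shape(tokens):
    """Group tokens whose shapes match (an assumed notion of similarity)."""
    clusters = defaultdict(list)
    for tok in tokens:
        clusters[shape(tok)].append(tok)
    return dict(clusters)


# Hypothetical candidate transformations; a real system would need many more.
TRANSFORMS = {
    "slash_to_dash": lambda t: t.replace("/", "-"),
    "dot_to_dash": lambda t: t.replace(".", "-"),
}


def edit_cost(before: str, after: str) -> int:
    """Crude cost: positions that differ plus any difference in length."""
    changed = sum(a != b for a, b in zip(before, after))
    return changed + abs(len(before) - len(after))


def propose_corrections(tokens, max_cost=3):
    clusters = cluster_by_shape(tokens)
    # Assumption: the largest cluster serves as the reference cluster.
    ref_shape = max(clusters, key=lambda s: len(clusters[s]))
    corrections = {}
    for s, members in clusters.items():
        if s == ref_shape:
            continue
        for tok in members:
            best = None
            for fn in TRANSFORMS.values():
                candidate = fn(tok)
                cost = edit_cost(tok, candidate)
                # Assumed "effect" check: the transformed token must fit the
                # reference cluster without introducing a new shape.
                fits = shape(candidate) == ref_shape
                if fits and cost <= max_cost:
                    if best is None or cost < best[0]:
                        best = (cost, candidate)
            if best is not None:
                corrections[tok] = best[1]
    return corrections


tokens = ["2016-10-07", "2018-11-13", "2019/02/05", "2015-05-31", "n/a"]
corrections = propose_corrections(tokens)
# Updating the data store is modeled here as rewriting the token list.
repaired = [corrections.get(t, t) for t in tokens]
```

In this sketch "2019/02/05" is rewritten because a cheap transformation makes it fit the reference cluster, while "n/a" is left alone because no candidate transformation does.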

2.

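Granted invention patent
Joining semantically-related data using big table corpora (Active)

Publication (announcement) number: US10198471B2

Publication (announcement) date: 2019-02-05

Application number: US14726547

Filing date: 2015-05-31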

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Yeye He, Kris Kuppuswamy Ganjam, Xu Chu

IPC: G06F17/30

Abstract: Examples of the disclosure enable performing semantic joins using a big table corpus. Pairs of values from at least two data sets are identified. The pairs of values include one value from a first one of the data sets and one value from a second one of the data sets. Statistical co-occurrence scores for the identified pairs of values are determined based on historical co-occurrence data. The determined statistical co-occurrence scores are used for predicting a semantic relationship between the at least two data sets. The predicted semantic relationship is used for joining the at least two data sets.
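
As a rough illustration of joining by co-occurrence, the following Python sketch scores value pairs against a small stand-in corpus and keeps the best-scoring partner for each left-hand value. It is a simplification under stated assumptions: the corpus is modeled as a list of row sets, the score is a Jaccard ratio over rows (the abstract does not specify the statistic), and the functions cooccurrence_score and semantic_join as well as the min_score threshold are hypothetical.

```python
def cooccurrence_score(a, b, corpus):
    """Share of corpus rows containing a or b that contain both (Jaccard)."""
    both = sum(1 for row in corpus if a in row and b in row)
    either = sum(1 for row in corpus if a in row or b in row)
    return both / either if either else 0.0


def semantic_join(left, right, corpus, min_score=0.5):
    """Pair each left value with its best-scoring right value, if any."""
    joined = []
    for a in left:
        scored = [(cooccurrence_score(a, b, corpus), b) for b in right]
        score, best = max(scored, default=(0.0, None))
        if best is not None and score >= min_score:
            joined.append((a, best))
    return joined


# Stand-in for historical co-occurrence data: each set is one table row.
corpus = [
    {"Washington", "WA"},
    {"Washington", "WA"},
    {"Oregon", "OR"},
    {"Washington", "OR"},
    {"Oregon", "OR"},
]
pairs = semantic_join(["Washington", "Oregon"], ["WA", "OR"], corpus)
```

Here "Washington" pairs with "WA" even though it also appears once alongside "OR", because the first pairing co-occurs in a larger share of the rows that mention either value.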

3.

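Granted invention patent
Repairing data through domain knowledge (Active)

Publication (announcement) number: US10970271B2

Publication (announcement) date: 2021-04-06

Application number: US16161695

Filing date: 2018-10-16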

Applicant: Microsoft Technology Licensing, LLC

Inventor: Kris Kuppuswamy Ganjam, Yeye He, Anja Gruenheid

IPC: G06F16/23, G06F16/215, G06F16/28, G06F16/35, G06F16/2457

Abstract: Correcting data in a dataset. A set of data tokens from a tabular data store are grouped into a plurality of different clusters based on similarity of tokens. A reference cluster is selected from among the plurality of different clusters such that the plurality of clusters includes a reference cluster and one or more other clusters. One or more tokens in the one or more other clusters are transformed. The effect on the reference cluster of adding the transformed tokens to the reference cluster is determined. Using this information, a correction for a token in the dataset is identified. The data store is updated to correct the token using the identified correction.

4.

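Invention patent application
Repairing Data Through Domain Knowledge (Under examination, published)

Publication (announcement) number: US20180101561A1

Publication (announcement) date: 2018-04-12

Application number: US15288899

Filing date: 2016-10-07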

Applicant: Microsoft Technology Licensing, LLC

Inventor: Kris Kuppuswamy Ganjam, Yeye He, Anja Gruenheid

IPC: G06F17/30

CPC classification number: G06F17/30371, G06F17/30303, G06F17/3053, G06F17/30598, G06F17/3071

Abstract: Correcting data in a dataset. A set of data tokens from a tabular data store are grouped into a plurality of different clusters based on similarity of tokens. A reference cluster is selected from among the plurality of different clusters such that the plurality of clusters includes a reference cluster and one or more other clusters. One or more tokens in the one or more other clusters are transformed. Transforming tokens is performed based on a cost of transforming tokens. The effect on the reference cluster of adding the transformed tokens to the reference cluster is determined. Using this information, a correction for a token in the dataset is identified. The data store is updated to correct the token.
