Patent search: ap:("MICROSOFT TECHNOLOGY LICENSING LLC") AND inv:"Yeye He"

1.

Patent application
MACHINE-LEARNED PREDICTIVE MODELS AND SYSTEMS FOR DATA PREPARATION RECOMMENDATIONS (in force)

Publication (announcement) number: US20230043015A1

Publication (announcement) date: 2023-02-09

Application number: US17969377

Application date: 2022-10-19

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Yeye He, Cong Yan

IPC: G06N20/00, G06N5/02

Abstract: Systems are provided for facilitating the building and use of models used to make data preparation recommendations. The systems identify ground truth from a plurality of notebooks and utilize the ground truth to generate the corresponding data preparation recommendation models. The data preparation recommendation models are used to predict accurate (e.g., useful and relevant) data preparation steps based on user input and user notebook data. The data preparation computing system generates a recommendation prompt based on output from the data preparation recommendation model that can be viewed and/or selected by the user to be applied to the user's notebook data.

2.

Granted patent
Facilitating data type detection using existing code (in force)

Publication (announcement) number: US10795667B2

Publication (announcement) date: 2020-10-06

Application number: US15850283

Application date: 2017-12-21

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Yeye He, Cong Yan

IPC: G06F8/70, G06F16/23, G06F16/2457, G06F8/30

Abstract: Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data type detection, according to embodiments of the present invention. In one embodiment, existing code is searched to identify a set of functions related to a target data type. Such functions can be executed using positive example values and negative example values. For each executed function, a logical explanation is generated that represents a distinction in execution of the positive example values from the negative example values. The executed functions can then be ranked based on the extent to which the corresponding logical explanations distinguish execution of the positive example values from the negative example values. A function suggestion corresponding with at least a highest ranked function can then be provided, for example to a user, to indicate a function for use in detecting the target data type.
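
Illustrative sketch (not part of the patent record): the ranking step described in this abstract can be pictured as scoring each candidate function by how cleanly its behaviour separates the positive example values from the negative ones. Everything below is an assumption made for illustration: the `outcome` helper stands in for the patent's "logical explanation" with a much coarser signal (truthiness of the return value, or the type of exception raised), and the candidate functions and scoring rule are hypothetical.

```python
from typing import Callable, Dict, Hashable, List, Tuple


def outcome(func: Callable[[str], object], value: str) -> Hashable:
    """Reduce one execution to a coarse outcome: result truthiness or exception type."""
    try:
        return ("returned", bool(func(value)))
    except Exception as exc:  # a raised exception also distinguishes inputs
        return ("raised", type(exc).__name__)


def separation_score(func, positives: List[str], negatives: List[str]) -> float:
    """Fraction of examples labelled correctly by the best rule keyed on the outcome alone."""
    counts: Dict[Hashable, List[int]] = {}
    for value in positives:
        counts.setdefault(outcome(func, value), [0, 0])[0] += 1
    for value in negatives:
        counts.setdefault(outcome(func, value), [0, 0])[1] += 1
    correct = sum(max(pos, neg) for pos, neg in counts.values())
    return correct / (len(positives) + len(negatives))


def rank_functions(candidates, positives, negatives) -> List[Tuple[float, str]]:
    """Sort candidates by descending score, breaking ties by name."""
    scored = [(separation_score(f, positives, negatives), name)
              for name, f in candidates.items()]
    return sorted(scored, key=lambda item: (-item[0], item[1]))


# Hypothetical candidates for a target type of "string of digits".
candidates = {
    "is_digit": str.isdigit,
    "is_alpha": str.isalpha,
    "non_empty": lambda s: len(s) > 0,
}
ranked = rank_functions(candidates, ["2017", "1999", "42"], ["abc", "12a", ""])
print(ranked[0])
```

With these toy examples, `is_digit` ranks first because its outcome differs for every positive/negative pair, while the other two candidates confuse some of the values.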

3.

Granted patent
Determining a hierarchical concept tree using a large corpus of table values (in force)

Publication (announcement) number: US10789229B2

Publication (announcement) date: 2020-09-29

Application number: US15621767

Application date: 2017-06-13

Applicant: Microsoft Technology Licensing, LLC

Inventor: Yeye He, Kris K. Ganjam, Keqian Li

IPC: G06F16/00, G06F16/22, G06F16/28

Abstract: A table corpus processing server identifies concepts within enterprise domain data. The table corpus processing server is configured to iteratively group values in a table corpus based on co-occurrence statistics to produce a candidate hierarchical tree. The candidate hierarchical tree is then summarized by selecting nodes that can best “describe” the original corpus, which leads to a small tree that often corresponds to desired concept hierarchies. The table corpus processing server employs a parallel dynamic programming approach that allows the disclosed embodiments to scale with the amount of enterprise domain data being analyzed.
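
Illustrative sketch (not part of the patent record): the idea of iteratively grouping values by co-occurrence can be pictured as greedy agglomerative merging, where two groups are merged when their values tend to appear in the same table columns. The Jaccard similarity, the greedy merge order, and all function names below are assumptions; the summarization step and the parallel dynamic programming approach mentioned in the abstract are not reproduced.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

Merge = Tuple[FrozenSet[str], FrozenSet[str], float]


def column_index(columns: List[List[str]]) -> Dict[str, Set[int]]:
    """Map each value to the set of column ids in which it occurs."""
    index: Dict[str, Set[int]] = {}
    for col_id, column in enumerate(columns):
        for value in column:
            index.setdefault(value, set()).add(col_id)
    return index


def jaccard(a: Set[int], b: Set[int]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_candidate_tree(columns: List[List[str]]) -> List[Merge]:
    """Greedily merge the two most co-occurring groups; return the merges in order."""
    clusters: Dict[FrozenSet[str], Set[int]] = {
        frozenset([value]): cols for value, cols in column_index(columns).items()
    }
    merges: List[Merge] = []
    while len(clusters) > 1:
        # Sort keys so the result does not depend on set iteration order.
        a, b = max(
            combinations(sorted(clusters, key=sorted), 2),
            key=lambda pair: jaccard(clusters[pair[0]], clusters[pair[1]]),
        )
        score = jaccard(clusters[a], clusters[b])
        if score == 0.0:  # nothing left that co-occurs at all
            break
        clusters[a | b] = clusters.pop(a) | clusters.pop(b)
        merges.append((a, b, score))
    return merges


# Hypothetical toy corpus: two city columns and two state columns.
columns = [
    ["Seattle", "Portland", "Boston"],
    ["Seattle", "Portland"],
    ["Washington", "Oregon"],
    ["Washington", "Oregon", "Texas"],
]
merges = build_candidate_tree(columns)
for left, right, score in merges:
    print(sorted(left), "+", sorted(right), round(score, 2))
```

Each merge is an internal node of a candidate tree; in this toy corpus the cities and the states end up in two separate subtrees that are never joined, because they never share a column.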

4.

Granted patent
Concept expansion using tables (in force)

Publication (announcement) number: US10769140B2

Publication (announcement) date: 2020-09-08

Application number: US14754318

Application date: 2015-06-29

Applicant: Microsoft Technology Licensing, LLC

Inventor: Philip A. Bernstein, Kaushik Chakrabarti, Zhimin Chen, Yeye He, Chi Wang, Kris K. Ganjam

IPC: G06F16/245, G06F16/901

Abstract: Concept expansion using tables, such as web tables, can return entities belonging to a concept based on an input of the concept and at least one seed entity that belongs to the concept. A concept expansion frontend can receive the concept and seed entity and provide them to a concept expansion framework. The concept expansion framework can expand the coverage of entities for concepts, including tail concepts, using tables by leveraging rich content signals corresponding to concept names. Such content signals can include content matching the concept that appears in captions, early headings, page titles, surrounding text, anchor text, and queries for which the page has been clicked. The concept expansion framework can use the structured entities in tables to infer exclusive tables. Such inference differs from previous label propagation methods and involves modeling a table-entity relationship. The table-entity relationship reduces semantic drift without using a reference ontology.

5.

Granted patent
Extensible data transformations (in force)

Publication (announcement) number: US10706066B2

Publication (announcement) date: 2020-07-07

Application number: US15295858

Application date: 2016-10-17

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Kris Ganjam, Yeye He, Vivek Ravindranath Narasayya, Surajit Chaudhuri

IPC: G06F16/25, G06F21/60

Abstract: Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data transformations, according to embodiments of the present invention. In one embodiment, a set of example values is received. A repository of transformation tools is searched to identify a new transformation tool as relevant to a data transformation associated with the received set of example values. The repository includes annotations associated with the new transformation tool. The new transformation tool is used to generate a transformation program that produces transformed output values. Additional annotations are generated for the new transformation tool based on the transformed output values.

6.

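Patent application
DISCOVERING SCHEMA USING ANCHOR ATTRIBUTES (under examination, published)

Publication (announcement) number: US20190325046A1

Publication (announcement) date: 2019-10-24

Application number: US15957378

Application date: 2018-04-19

Applicant: Microsoft Technology Licensing, LLC

Inventor: Lev Novik, Surajit Chaudhuri, Yeye He

IPC: G06F17/30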

Abstract: Systems, methods, and computer-executable instructions for partitioning a data set include receiving anchor attributes of the data set. The data set includes records, with each record including attributes. A set of filter attributes that are not mutually exclusive with any of the anchor attributes is determined. A set of candidate attributes that includes each unique attribute from the data set excluding the anchor attributes and the filter attributes is determined. For each of the anchor attributes and the candidate attributes, an attribute context is determined. For each of the candidate attributes, a context similarity with each of the anchor attributes is determined. A new anchor attribute is selected from the set of candidate attributes based on the context similarity.
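
Illustrative sketch (not part of the patent record): one way to read the selection step is that a candidate attribute becomes a new anchor when the attributes it appears alongside resemble those seen alongside the existing anchors. The interpretation below is an assumption throughout: records are reduced to sets of attribute names, a "filter attribute" is taken to be any attribute that shares a record with an anchor, an attribute's context is the set of attributes it co-occurs with, and similarity is Jaccard overlap averaged over the anchors. All names are hypothetical.

```python
from typing import List, Optional, Set


def attribute_context(attr: str, records: List[Set[str]]) -> Set[str]:
    """All attributes that appear in at least one record together with `attr`."""
    context: Set[str] = set()
    for record in records:
        if attr in record:
            context |= record - {attr}
    return context


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_new_anchor(records: List[Set[str]], anchors: Set[str]) -> Optional[str]:
    # Assumed reading: filters are attributes that co-occur with some anchor.
    filters: Set[str] = set()
    for record in records:
        if record & anchors:
            filters |= record - anchors
    candidates = set().union(*records) - anchors - filters
    if not anchors or not candidates:
        return None
    anchor_contexts = [attribute_context(a, records) for a in sorted(anchors)]

    def mean_similarity(candidate: str) -> float:
        context = attribute_context(candidate, records)
        return sum(jaccard(context, ac) for ac in anchor_contexts) / len(anchor_contexts)

    # sorted() makes tie-breaking deterministic (alphabetical).
    return max(sorted(candidates), key=mean_similarity)


# Hypothetical data set mixing three record shapes.
records = [
    {"order_id", "customer", "total"},
    {"invoice_id", "customer", "total"},
    {"user_id", "name"},
]
new_anchor = select_new_anchor(records, {"order_id"})
print(new_anchor)
```

In this toy data set `invoice_id` is chosen because it never shares a record with `order_id` yet appears with the same neighbouring attributes, which is the kind of signal that would suggest a parallel record type.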

7.

Patent application
FACILITATING DATA TRANSFORMATIONS (under examination, published)

Publication (announcement) number: US20180081954A1

Publication (announcement) date: 2018-03-22

Application number: US15271154

Application date: 2016-09-20

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Yeye He, Kris Ganjam, Vivek Ravindranath Narasayya, Surajit Chaudhuri

IPC: G06F17/30

CPC classification number: G06F16/258, G06F16/245

Abstract: Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data transformations, according to embodiments of the present invention. In one embodiment, a set of example values is received, including example input values that indicate data values to be transformed and example output values that indicate a desired form in which to transform data. Based on the set of example values, a data transformation function that is relevant to the set of example values is identified. The data transformation function is used to generate a transformation program to transform the example input values to the desired form in which to transform data. A suggestion of the transformation program can be provided to a user device, wherein selection of the transformation program suggestion results in a data transformation.
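
Illustrative sketch (not part of the patent record): the transform-by-example idea in this abstract can be pictured as searching a repository of functions for those that reproduce every example output from its example input. The repository contents, the function names, and the exact-match criterion below are assumptions; the patent describes generating a transformation program from the identified function, which this sketch simplifies to returning the matching function names.

```python
from datetime import datetime
from typing import Callable, Dict, List, Tuple


def us_date_to_iso(text: str) -> str:
    """Convert MM/DD/YYYY to YYYY-MM-DD."""
    return datetime.strptime(text, "%m/%d/%Y").strftime("%Y-%m-%d")


# Hypothetical repository of transformation functions.
REPOSITORY: Dict[str, Callable[[str], str]] = {
    "upper": str.upper,
    "strip": str.strip,
    "us_date_to_iso": us_date_to_iso,
}


def find_transformations(
    examples: List[Tuple[str, str]],
    repository: Dict[str, Callable[[str], str]],
) -> List[str]:
    """Return the names of functions that map every example input to its output."""
    matches: List[str] = []
    for name, func in repository.items():
        try:
            if all(func(source) == target for source, target in examples):
                matches.append(name)
        except Exception:
            # A function that cannot process an example input is not a match.
            continue
    return sorted(matches)


examples = [("03/22/2018", "2018-03-22"), ("09/20/2016", "2016-09-20")]
suggestions = find_transformations(examples, REPOSITORY)
print(suggestions)

# Applying the suggested function to a new value performs the transformation.
converted = REPOSITORY[suggestions[0]]("10/19/2022")
print(converted)
```

Only the date-conversion function reproduces both example pairs, so it is the single suggestion; a user accepting it would then apply it to the remaining values.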

8.

Granted patent
Machine-learned predictive models and systems for data preparation recommendations (in force)

Publication (announcement) number: US11928564B2

Publication (announcement) date: 2024-03-12

Application number: US17969377

Application date: 2022-10-19

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Yeye He, Cong Yan

IPC: G06F15/16, G06F9/54, G06N5/02, G06N20/00, H04L29/06

CPC classification number: G06N20/00, G06N5/02

Abstract: Systems are provided for facilitating the building and use of models used to make data preparation recommendations. The systems identify ground truth from a plurality of notebooks and utilize the ground truth to generate the corresponding data preparation recommendation models. The data preparation recommendation models are used to predict accurate (e.g., useful and relevant) data preparation steps based on user input and user notebook data. The data preparation computing system generates a recommendation prompt based on output from the data preparation recommendation model that can be viewed and/or selected by the user to be applied to the user's notebook data.

9.

Granted patent
Leveraging a collection of training tables to accurately predict errors within a variety of tables (in force)

Publication (announcement) number: US11698892B2

Publication (announcement) date: 2023-07-11

Application number: US17510327

Application date: 2021-10-25

Applicant: Microsoft Technology Licensing, LLC

Inventor: Yeye He, Pei Wang

IPC: G06F16/00, G06F16/22, G06F16/215, G06N20/00, G06F17/18

CPC classification number: G06F16/2282, G06F16/215, G06F17/18, G06N20/00

Abstract: The present disclosure relates to systems, methods, and computer-readable media for using a variety of hypothesis tests to identify errors within tables and other structured datasets. For example, systems disclosed herein can generate a modified table from an input table by removing one or more entries from the input table. The systems disclosed herein can further leverage a collection of training tables to determine probabilities associated with whether the input table and modified table are drawn from the collection of training tables. The systems disclosed herein can additionally compare the probabilities to accurately determine whether the one or more entries include errors therein. The systems disclosed herein may apply to a variety of different sizes and types of tables to identify different types of common errors within input tables.

10.

Granted patent
Repairing data through domain knowledge (in force)

Publication (announcement) number: US10970271B2

Publication (announcement) date: 2021-04-06

Application number: US16161695

Application date: 2018-10-16

Applicant: Microsoft Technology Licensing, LLC

Inventor: Kris Kuppuswamy Ganjam, Yeye He, Anja Gruenheid

IPC: G06F16/23, G06F16/215, G06F16/28, G06F16/35, G06F16/2457

Abstract: Correcting data in a dataset. A set of data tokens from a tabular data store are grouped into a plurality of different clusters based on similarity of tokens. A reference cluster is selected from among the plurality of different clusters such that the plurality of clusters includes a reference cluster and one or more other clusters. One or more tokens in the one or more other clusters are transformed. The effect on the reference cluster of adding the transformed tokens to the reference cluster is determined. Using this information, a correction for a token in the dataset is identified. The data store is updated to correct the token using the identified correction.
