Patent search: ap:("MICROSOFT TECHNOLOGY LICENSING LLC") AND inv:"Yeye He"

31.

Granted invention patent
Collecting and annotating transformation tools for use in generating transformation programs (in force)

Publication (announcement) No.: US11809223B2

Publication (announcement) date: 2023-11-07

Application No.: US17520926

Application date: 2021-11-08

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Yeye He, Kris Ganjam, Vivek Ravindranath Narasayya, Surajit Chaudhuri, Xu Chu

IPC: G06F16/00, G06F16/25

CPC classification number: G06F16/258

Abstract: Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data transformations, according to embodiments of the present invention. In one embodiment, a plurality of remote sources is searched to identify candidate transformation tools relevant for performing data transformations. The candidate transformation tools are analyzed to identify tool examples corresponding with each of the candidate transformation tools. For each of the candidate transformation tools, the tool examples are stored in association with the corresponding candidate transformation tool. Based on a comparison of tool examples with example values, a transformation tool is identified as relevant to facilitate transforming example input values to the desired form in which to transform data.
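As a rough illustration of the "analyze candidate tools to identify tool examples, then store them with the tool" step, here is a minimal Python sketch. It assumes the candidate tools are local Python functions whose docstrings carry doctest-style examples; the patent itself speaks of remote sources and does not prescribe this mechanism. The names `to_upper`, `harvest_examples`, and `catalog` are hypothetical.

```python
import doctest


def to_upper(text):
    """Upper-case a string.

    >>> to_upper("abc")
    'ABC'
    """
    return text.upper()


def harvest_examples(tool):
    """Return (call, expected output) pairs found in the tool's docstring."""
    parser = doctest.DocTestParser()
    examples = parser.get_examples(tool.__doc__ or "")
    return [(ex.source.strip(), ex.want.strip()) for ex in examples]


# Store the harvested examples in association with each candidate tool.
# Assumption: the candidate list is hard-coded here instead of being searched for.
catalog = {tool.__name__: harvest_examples(tool) for tool in [to_upper]}
```

A later lookup step could compare user-supplied example values against the stored pairs in `catalog` to decide which tool is relevant.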

32.

Granted invention patent
Data unification (in force)

Publication (announcement) No.: US11714790B2

Publication (announcement) date: 2023-08-01

Application No.: US17490908

Application date: 2021-09-30

Applicant: Microsoft Technology Licensing, LLC

Inventor: Meiyalagan Balasubramanian, Lengning Liu, Aditya Kuppa, Kirk Hartmann Freiheit, Kalen Wong, Paula Budig Greve, Patrick Clinton Little, Lucas Pritz, Yue Wang, Vivek Ravindranath Narasayya, Katchaguy Areekijseree, Yeye He, Surajit Chaudhuri, Gaurav Ghosh

IPC: G06F16/21, G06F16/215, G06F16/2455

CPC classification number: G06F16/215, G06F16/24556

Abstract: Solutions for data unification include: receiving a data record, the data record comprising a plurality of data fields; selecting, from among the plurality of data fields, a subset of the data fields, the subset of the data fields being fewer in number than the plurality of data fields, wherein selecting the subset of the data fields comprises: applying a first rule to select at least a first one of the data fields within the data record for inclusion in the subset of the data fields; using content of the subset of the data fields, generating a stable identifier (stableID) for the data record; and inserting the stableID into a primary key data field of the data record.
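The stableID flow described above (select a subset of fields by rule, derive an identifier from their content, write it into the primary key field) might look like the following Python sketch. The SHA-256 hash, the separator, the `primary_key` field name, and the sample rule are all assumptions made for illustration; the abstract does not say how the identifier is computed.

```python
import hashlib


def select_fields(record, rule):
    """Return the subset of fields whose names satisfy `rule`."""
    return {name: value for name, value in record.items() if rule(name)}


def stable_id(record, rule):
    """Derive an identifier from the content of the selected fields only."""
    subset = select_fields(record, rule)
    # Sort by field name so the ID does not depend on field order.
    canonical = "\x1f".join(f"{name}={subset[name]}" for name in sorted(subset))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def with_stable_id(record, rule, key_field="primary_key"):
    """Return a copy of the record with the stable ID in its key field."""
    updated = dict(record)
    updated[key_field] = stable_id(record, rule)
    return updated


record = {"name": "Ada Lovelace", "email": "ada@example.com", "last_login": "2021-09-30"}
identity_rule = lambda field: field in {"name", "email"}  # hypothetical first rule
unified = with_stable_id(record, identity_rule)
```

Because `last_login` is excluded by the rule, two versions of the record that differ only in that field receive the same identifier.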

33.

Granted invention patent
Machine-learned predictive models and systems for data preparation recommendations (in force)

Publication (announcement) No.: US11488068B2

Publication (announcement) date: 2022-11-01

Application No.: US16886155

Application date: 2020-05-28

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Yeye He, Cong Yan

IPC: G06F15/16, G06F9/54, H04L29/06, G06N20/00, G06N5/02

Abstract: Systems are provided for facilitating the building and use of models used to make data preparation recommendations. The systems identify ground truth from a plurality of notebooks and utilize the ground truth to generate the corresponding data preparation recommendation models. The data preparation recommendation models are used to predict accurate (e.g., useful and relevant) data preparation steps based on user input and user notebook data. The data preparation computing system generates a recommendation prompt based on output from the data preparation recommendation model that can be viewed and/or selected by the user to be applied to the user's notebook data.

34.

Granted invention patent
Collecting and annotating transformation tools for use in generating transformation programs (in force)

Publication (announcement) No.: US11170020B2

Publication (announcement) date: 2021-11-09

Application No.: US15343720

Application date: 2016-11-04

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Yeye He, Kris Ganjam, Vivek Ravindranath Narasayya, Surajit Chaudhuri, Xu Chu

IPC: G06F16/00, G06F16/25

Abstract: Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data transformations, according to embodiments of the present invention. In one embodiment, a plurality of remote sources is searched to identify candidate transformation tools relevant for performing data transformations. The candidate transformation tools are analyzed to identify tool examples corresponding with each of the candidate transformation tools. For each of the candidate transformation tools, the tool examples are stored in association with the corresponding candidate transformation tool. Based on a comparison of tool examples with example values, a transformation tool is identified as relevant to facilitate transforming example input values to the desired form in which to transform data.

35.

Granted invention patent
Generating and ranking transformation programs (in force)

Publication (announcement) No.: US11163788B2

Publication (announcement) date: 2021-11-02

Application No.: US15343704

Application date: 2016-11-04

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Yeye He, Kris Ganjam, Vivek Ravindranath Narasayya, Surajit Chaudhuri, Xu Chu

IPC: G06F16/00, G06F16/25, G06F8/30

Abstract: Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data transformations, according to embodiments of the present invention. In one embodiment, a set of example values is received. An index to identify a plurality of data transformation tools that are relevant to the set of example values is referenced, wherein each of the data transformation tools correspond with one or more tool examples. The data transformation tools are ranked based on an extent of similarity between the set of example values and the tool examples. For data transformation tools associated with the extent of similarity that exceeds a similarity threshold, a transformation program is generated that uses the data transformation tool and a supplemental transformation tool to transform the one or more example input values to the desired form in which to transform data.
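The ranking step in this abstract (score each indexed tool by how similar its tool examples are to the user's example values, then keep those above a threshold) can be sketched as follows. The similarity measure used here, Jaccard overlap of coarse character "shapes", is an assumption chosen for brevity, as are the tool names and the 0.5 threshold; the generation of a transformation program with a supplemental tool is not shown.

```python
import re


def shape(value):
    """Collapse a string to a coarse pattern: digits -> 9, letters -> a."""
    return re.sub(r"[A-Za-z]", "a", re.sub(r"\d", "9", value))


def similarity(example_values, tool_examples):
    """Jaccard overlap between the shape sets of the two example lists."""
    a = {shape(v) for v in example_values}
    b = {shape(v) for v in tool_examples}
    return len(a & b) / len(a | b) if a | b else 0.0


def rank_tools(example_values, tool_index, threshold=0.5):
    """Return (tool name, score) pairs above the threshold, best first."""
    scored = [(similarity(example_values, ex), name) for name, ex in tool_index.items()]
    return [(name, score) for score, name in sorted(scored, reverse=True) if score >= threshold]


# Hypothetical index mapping each tool to its stored tool examples.
tool_index = {
    "parse_date": ["2016-11-04", "2021-11-02"],
    "parse_phone": ["425-555-0100"],
    "parse_ip": ["192.168.0.1"],
}
ranked = rank_tools(["2017-04-06", "2015-06-18"], tool_index)
```

With these inputs only `parse_date` shares a shape with the example values, so it is the sole tool that clears the threshold.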

36.

Invention application
MACHINE-LEARNED PREDICTIVE MODELS AND SYSTEMS FOR DATA PREPARATION RECOMMENDATIONS (in force)

Publication (announcement) No.: US20210319357A1

Publication (announcement) date: 2021-10-14

Application No.: US16886155

Application date: 2020-05-28

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Yeye He, Cong Yan

IPC: G06N20/00, G06N5/02

Abstract: Systems are provided for facilitating the building and use of models used to make data preparation recommendations. The systems identify ground truth from a plurality of notebooks and utilize the ground truth to generate the corresponding data preparation recommendation models. The data preparation recommendation models are used to predict accurate (e.g., useful and relevant) data preparation steps based on user input and user notebook data. The data preparation computing system generates a recommendation prompt based on output from the data preparation recommendation model that can be viewed and/or selected by the user to be applied to the user's notebook data.

37.

Granted invention patent
Synthesizing mapping relationships using table corpus (in force)

Publication (announcement) No.: US10650050B2

Publication (announcement) date: 2020-05-12

Application No.: US15480926

Application date: 2017-04-06

Applicant: Microsoft Technology Licensing, LLC

Inventor: Yeye He, Yue Wang

IPC: G06F16/21, G06F16/901, G06F16/28, G06F16/22

Abstract: Methods and systems for synthesizing mapping tables using a table corpus are provided. A functional dependency between at least two items of an input table is determined. A plurality of two-column tables are extracted from the table corpus. The extracted plurality of two-column tables are synthesized to determine at least one mapping table having a first column having the functional dependency with a second column. A next item of the input table is provided from the determined at least one mapping table.
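One simple way to picture the synthesis step is to merge two-column tables into a single mapping as long as the merged result still satisfies the functional dependency (each left value maps to exactly one right value). The Python sketch below does this greedily, starting from the pairs already present in the input table. The greedy merge order, the overlap requirement, and the sample data are assumptions for illustration, not the patented procedure.

```python
def is_functional(pairs):
    """True if every left value maps to exactly one right value."""
    seen = {}
    for left, right in pairs:
        if seen.setdefault(left, right) != right:
            return False
    return True


def synthesize(tables, seed_pairs):
    """Merge two-column tables that overlap with, and do not contradict, the mapping."""
    mapping = dict(seed_pairs)
    for table in tables:
        candidate = list(mapping.items()) + list(table)
        overlaps = any(left in mapping for left, _ in table)
        if overlaps and is_functional(candidate):
            mapping.update(table)
    return mapping


# Hypothetical two-column tables extracted from a corpus.
tables = [
    [("Washington", "WA"), ("Oregon", "OR")],
    [("Washington", "Olympia"), ("Oregon", "Salem")],  # a different relationship
    [("Oregon", "OR"), ("Idaho", "ID")],
]
mapping = synthesize(tables, seed_pairs=[("Washington", "WA")])
next_item = mapping.get("Idaho")  # fills in the next item of the input table
```

The state-to-capital table is rejected because merging it would give "Washington" two different right-hand values.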

38.

Granted invention patent
Automated database schema annotation (in force)

Publication (announcement) No.: US10452661B2

Publication (announcement) date: 2019-10-22

Application No.: US14743510

Application date: 2015-06-18

Applicant: Microsoft Technology Licensing, LLC

Inventor: Philip A. Bernstein, Yeye He, Eli Cortez Custodio Vilarinho, Lev Novik

IPC: G06F16/00, G06F16/2457, G06F17/24, G06F16/20

Abstract: Techniques and constructs that improve annotating target columns of a target database by performing automated annotation of the target columns using sources. The techniques include calculating a similarity score between a target column and columns extracted from a table that is included in a source. The similarity score is calculated based at least in part on a similarity between a value in the target column of the target database and a column value of the extracted column from the table and on a similarity between an identity of the target column of the target database and column identities of the extracted columns from the table. In some examples, the techniques calculate similarity scores for one or more extracted columns and annotate the target column based on the similarity scores.
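The score described here combines two signals: similarity between column values and similarity between column identities (names). A minimal Python sketch, assuming Jaccard overlap for values, `difflib.SequenceMatcher` for names, and an equal weighting of the two, is shown below; none of these specific choices come from the abstract, and the column data is made up.

```python
from difflib import SequenceMatcher


def value_similarity(target_values, source_values):
    """Jaccard overlap between the two columns' value sets."""
    a, b = set(target_values), set(source_values)
    return len(a & b) / len(a | b) if a | b else 0.0


def name_similarity(target_name, source_name):
    """Case-insensitive string similarity between column names."""
    return SequenceMatcher(None, target_name.lower(), source_name.lower()).ratio()


def column_score(target, source, value_weight=0.5):
    """Weighted combination of value similarity and name similarity."""
    return (value_weight * value_similarity(target["values"], source["values"])
            + (1 - value_weight) * name_similarity(target["name"], source["name"]))


def annotate(target, source_columns):
    """Annotate the target column with the best-scoring source column's label."""
    best = max(source_columns, key=lambda col: column_score(target, col))
    return best["annotation"]


target = {"name": "cust_ctry", "values": ["US", "DE", "JP"]}
sources = [
    {"name": "Country", "values": ["US", "DE", "FR"], "annotation": "ISO country code"},
    {"name": "Customer", "values": ["Ada", "Bob"], "annotation": "Customer name"},
]
label = annotate(target, sources)
```

Here the `Country` column wins because it shares values with the target, even though `Customer` has a somewhat similar name.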

39.

Granted invention patent
Joining semantically-related data using big table corpora (in force)

Publication (announcement) No.: US10198471B2

Publication (announcement) date: 2019-02-05

Application No.: US14726547

Application date: 2015-05-31

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Yeye He, Kris Kuppuswamy Ganjam, Xu Chu

IPC: G06F17/30

Abstract: Examples of the disclosure enable performing semantic joins using a big table corpus. Pairs of values from at least two data sets are identified. The pairs of values include one value from a first one of the data sets and one value from a second one of the data sets. Statistical co-occurrence scores for the identified pairs of values are determined based on historical co-occurrence data. The determined statistical co-occurrence scores are used for predicting a semantic relationship between the at least two data sets. The predicted semantic relationship is used for joining the at least two data sets.
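A toy version of the semantic join can be sketched in Python: count how often pairs of values co-occur in rows of a historical corpus, then pair each value in the first data set with the value in the second that it co-occurs with most. Raw co-occurrence counts stand in for the "statistical co-occurrence scores" here, and the corpus rows and `min_score` cutoff are invented for the example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(corpus_rows):
    """Count how often each unordered pair of values shares a row."""
    counts = Counter()
    for row in corpus_rows:
        for a, b in combinations(sorted(set(row)), 2):
            counts[(a, b)] += 1
    return counts


def score(counts, a, b):
    """Co-occurrence count for a pair, independent of argument order."""
    return counts[tuple(sorted((a, b)))]


def semantic_join(left_values, right_values, counts, min_score=1):
    """Pair each left value with its best-scoring right value, if any."""
    joined = []
    for left in left_values:
        best = max(right_values, key=lambda right: score(counts, left, right))
        if score(counts, left, best) >= min_score:
            joined.append((left, best))
    return joined


corpus = [
    ["Washington", "WA", "Seattle"],
    ["Washington", "WA"],
    ["Oregon", "OR", "Portland"],
    ["Oregon", "OR"],
    ["Oregon", "WA"],  # noise row
]
counts = cooccurrence(corpus)
joined = semantic_join(["Washington", "Oregon", "Idaho"], ["WA", "OR"], counts)
```

"Idaho" never co-occurs with either candidate, so it is left out of the join rather than matched arbitrarily.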

40.

Invention application
DETERMINING A HIERARCHICAL CONCEPT TREE USING A LARGE CORPUS OF TABLE VALUES (under examination, published)

Publication (announcement) No.: US20180357262A1

Publication (announcement) date: 2018-12-13

Application No.: US15621767

Application date: 2017-06-13

Applicant: Microsoft Technology Licensing, LLC

Inventor: Yeye He, Kris K. Ganjam, Li Keqian

IPC: G06F17/30

CPC classification number: G06F16/2246, G06F16/2282, G06F16/282, G06F16/285, G06F2216/03

Abstract: This disclosure provides for a system, method, and computer-readable medium for implementing a table corpus processing server that identifies concepts within enterprise domain data. The table corpus processing server is configured to iteratively group values in a table corpus based on co-occurrence statistics to produce a candidate hierarchical tree. The candidate hierarchical tree is then summarized by selecting nodes that can best “describe” the original corpus, which leads to a small tree that often corresponds to desired concept hierarchies. The table corpus processing server employs a parallel dynamic programming approach that allows the disclosed embodiments to scale with amount of enterprise domain data being analyzed.
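The iterative grouping step can be illustrated with a small agglomerative sketch in Python: each value is described by the set of table columns it appears in, and the two most similar clusters are merged repeatedly until one tree remains. Jaccard similarity over column sets is an assumed stand-in for the co-occurrence statistics, and the later summarization and parallel dynamic programming steps are not shown.

```python
def jaccard(a, b):
    """Overlap between two sets of column ids."""
    return len(a & b) / len(a | b) if a | b else 0.0


def build_tree(value_columns):
    """Merge the most similar clusters until a single nested tuple remains.

    `value_columns` maps each value to the set of column ids it appears in.
    """
    clusters = [(value, frozenset(cols)) for value, cols in value_columns.items()]
    while len(clusters) > 1:
        i, j = max(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda p: jaccard(clusters[p[0]][1], clusters[p[1]][1]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] | clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Hypothetical corpus statistics: value -> ids of columns containing it.
value_columns = {
    "Seattle": {1, 2},
    "Portland": {1, 2},
    "WA": {3},
    "OR": {3, 4},
}
tree = build_tree(value_columns)
```

In this example the city names group together first, the state codes second, and the two groups are joined at the root of the candidate tree.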
