Patent search ap:("Microsoft Technology Licensing Page LLC") AND inv:"Tianhao Lu"

1.

Granted invention patent
Secure storage and processing of data for generating training data [In force]

Publication No.: US12197539B2

Publication date: 2025-01-14

Application No.: US17169161

Filing date: 2021-02-05

Applicant: Microsoft Technology Licensing, LLC

Inventor: Yunpeng Xu, Tianhao Lu, Xiaoqiang Luo, Jiashuo Wang, Chencheng Wu

IPC: G06F18/21 , G06F18/214 , G06N20/00

Abstract: Techniques for securely storing and processing data for training data generation are provided. In one technique, multiple encrypted records are retrieved from a first persistent storage. For each encrypted record, that record is decrypted in memory to generate a decrypted record that comprises multiple attribute values. Then, based on the attribute values and a definition of multiple features of a machine-learned model, multiple feature values are generated and stored, along with a label, in a training instance, which is then stored in a second persistent storage. One or more machine learning techniques are used to train the machine-learned model based on training data that includes the training instances that are stored in the second persistent storage.
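The flow in this abstract (retrieve encrypted records, decrypt each one in memory, derive feature values from a feature definition, store the resulting training instance elsewhere) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: `toy_xor` is a deliberately insecure placeholder cipher, `FEATURE_DEFS` and its feature names are hypothetical, plain lists stand in for the two persistent stores, and the label is assumed to travel inside the record, which the abstract does not specify.

```python
import json
from itertools import cycle
from typing import Callable, Dict, List

# Hypothetical feature definition: feature name -> function of decrypted attributes.
FEATURE_DEFS: Dict[str, Callable[[dict], float]] = {
    "title_length": lambda attrs: float(len(attrs["title"])),
    "has_email": lambda attrs: 1.0 if attrs.get("email") else 0.0,
}


def toy_xor(data: bytes, key: bytes) -> bytes:
    """Placeholder cipher for illustration only; NOT secure.

    XOR with a repeating key is its own inverse, so the same function
    both encrypts and decrypts. A real system would use a vetted cipher.
    """
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def build_training_instances(
    first_store: List[bytes],
    key: bytes,
    decrypt: Callable[[bytes, bytes], bytes] = toy_xor,
) -> List[dict]:
    """Decrypt each record in memory and turn it into a training instance."""
    second_store: List[dict] = []  # stand-in for the second persistent storage
    for blob in first_store:
        # The plaintext attributes exist only in this local variable.
        attrs = json.loads(decrypt(blob, key))
        features = {name: fn(attrs) for name, fn in FEATURE_DEFS.items()}
        # Assumption: the label is one of the record's attributes.
        second_store.append({"features": features, "label": attrs["label"]})
    return second_store
```

Only the derived feature values and the label reach the second store in this sketch; the decrypted attributes themselves are never written out.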

2.

Granted invention patent
Identifying duplicate entities [In force]

Publication No.: US11436532B2

Publication date: 2022-09-06

Application No.: US16703386

Filing date: 2019-12-04

Applicant: Microsoft Technology Licensing, LLC

Inventor: Tianhao Lu, Junzhe Miao, Yunpeng Xu, Dan Shacham, Hong H. Tam, Tao Xiong

IPC: G06N20/00 , G06F16/174 , G06F16/23 , G06F16/953

Abstract: The disclosed embodiments provide a system that identifies duplicate entities. During operation, the system selects training data for a first machine learning model based on confidence scores representing likelihoods that pairs of entities in an online system are duplicates. Next, the system updates parameters of the first machine learning model based on features and labels in the training data. The system then identifies a first subset of additional pairs of the entities as duplicate entities based on scores generated by the first machine learning model from values of the features for the additional pairs and a first threshold associated with the scores. The system also determines a canonical entity in each of the duplicate entities based on additional features. Finally, the system updates content outputted in a user interface of the online system based on the identified first subset of the additional pairs.
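Three of the steps in this abstract (selecting training pairs by confidence score, thresholding model scores to flag duplicates, and choosing a canonical entity) might look something like the sketch below. Everything here is an assumption made for illustration: the `low` and `high` confidence cutoffs, the `score` callable standing in for the trained model, and the use of a hypothetical "completeness" value as the additional feature for picking the canonical entity. Model training itself is left out.

```python
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]


def select_training_pairs(
    confidence: Dict[Pair, float], low: float = 0.1, high: float = 0.9
) -> Dict[Pair, int]:
    """Keep only pairs with a decisive confidence score.

    Pairs at or above `high` are labeled 1 (duplicate), pairs at or below
    `low` are labeled 0, and ambiguous pairs in between are dropped.
    """
    labeled: Dict[Pair, int] = {}
    for pair, c in confidence.items():
        if c >= high:
            labeled[pair] = 1
        elif c <= low:
            labeled[pair] = 0
    return labeled


def find_duplicates(
    pairs: List[Pair], score: Callable[[Pair], float], threshold: float
) -> List[Pair]:
    """Return the pairs whose model score meets the threshold."""
    return [p for p in pairs if score(p) >= threshold]


def pick_canonical(pair: Pair, completeness: Dict[str, float]) -> str:
    """Choose the entity with the higher (hypothetical) completeness value."""
    return max(pair, key=lambda entity: completeness[entity])
```

For example, `find_duplicates(pairs, model_score, 0.7)` would return the subset of `pairs` to treat as duplicates, where `model_score` is whatever scoring function the trained model exposes.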

3.

Invention patent application
IDENTIFYING DUPLICATE ENTITIES [In force]

Publication No.: US20210173825A1

Publication date: 2021-06-10

Application No.: US16703386

Filing date: 2019-12-04

Applicant: Microsoft Technology Licensing, LLC

Inventor: Tianhao Lu, Junzhe Miao, Yunpeng Xu, Dan Shacham, Hong H. Tam, Tao Xiong

IPC: G06F16/23 , G06N20/00

Abstract: The disclosed embodiments provide a system that identifies duplicate entities. During operation, the system selects training data for a first machine learning model based on confidence scores representing likelihoods that pairs of entities in an online system are duplicates. Next, the system updates parameters of the first machine learning model based on features and labels in the training data. The system then identifies a first subset of additional pairs of the entities as duplicate entities based on scores generated by the first machine learning model from values of the features for the additional pairs and a first threshold associated with the scores. The system also determines a canonical entity in each of the duplicate entities based on additional features. Finally, the system updates content outputted in a user interface of the online system based on the identified first subset of the additional pairs.
