-
Publication number: US11080272B2
Publication date: 2021-08-03
Application number: US16457666
Filing date: 2019-06-28
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yang Chen, Liang Zhang, Haifeng Zhao, Jiashuo Wang, Aparna Krishnan, Anand Kishore, Chencheng Wu, John P. Moore
IPC: G06F16/242, G06F16/25, G06F16/2457
Abstract: Entity resolution techniques for matching entity records from different data sources are provided. In one technique, an entity record from a source database is identified along with multiple data items included therein. Each data item corresponds to an attribute of multiple source attributes. For one of the data items that corresponds to a first source attribute, multiple target attributes are identified. A first query is generated that includes the data items and associates the data item with each of the multiple target attributes. A second query that is different than the first query is also generated. Two searches are performed of a target database: one based on the first query and the other based on the second query. A scoring model generates multiple scores, one for each search result. It is determined whether the entity record matches an entity record in the target database based on the set of scores.
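The two-query flow in this abstract can be illustrated with a minimal Python sketch. Everything named here (`ATTRIBUTE_MAP`, `TARGET_DB`, the exact-match `search`, the overlap-based `score`, and the 0.75 threshold) is a hypothetical stand-in chosen for illustration; the abstract does not specify the search engine, the scoring model, or the decision rule.

```python
# Hypothetical mapping: one source attribute may map to several target attributes.
ATTRIBUTE_MAP = {
    "name": ["company_name", "alias"],
    "city": ["city"],
}

# Hypothetical in-memory stand-in for the target database.
TARGET_DB = [
    {"id": 1, "company_name": "Acme Corp", "alias": "Acme", "city": "Austin"},
    {"id": 2, "company_name": "Apex Ltd", "alias": "Acme", "city": "Boston"},
    {"id": 3, "company_name": "Zenith", "alias": "Zen", "city": "Denver"},
]


def build_expanded_query(record):
    """First query: every data item, each tied to all of its mapped target attributes."""
    return [(value, ATTRIBUTE_MAP[attr]) for attr, value in record.items()]


def build_narrow_query(record):
    """Second, different query: the name only, against its first target attribute."""
    return [(record["name"], ATTRIBUTE_MAP["name"][:1])]


def search(query, db):
    """Return rows where any query value equals one of its associated target attributes."""
    return [
        row for row in db
        if any(row.get(t) == value for value, targets in query for t in targets)
    ]


def score(record, row):
    """Toy scoring model: fraction of source data items found among the row's values."""
    values = set(row.values())
    return sum(v in values for v in record.values()) / len(record)


def resolve(record, db, threshold=0.75):
    """Run both searches, score each result, and return the matching id or None."""
    candidates = {}
    for query in (build_expanded_query(record), build_narrow_query(record)):
        for row in search(query, db):
            candidates[row["id"]] = row
    scored = [(score(record, row), row["id"]) for row in candidates.values()]
    best_score, best_id = max(scored, default=(0.0, None))
    return best_id if best_score >= threshold else None


match_id = resolve({"name": "Acme", "city": "Austin"}, TARGET_DB)
no_match = resolve({"name": "Nobody", "city": "Nowhere"}, TARGET_DB)
```

Here the expanded query surfaces both rows whose alias is "Acme", and the scoring step separates them by how many source data items each row contains.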
-
Publication number: US12197539B2
Publication date: 2025-01-14
Application number: US17169161
Filing date: 2021-02-05
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yunpeng Xu, Tianhao Lu, Xiaoqiang Luo, Jiashuo Wang, Chencheng Wu
IPC: G06F18/21, G06F18/214, G06N20/00
Abstract: Techniques for securely storing and processing data for training data generation are provided. In one technique, multiple encrypted records are retrieved from a first persistent storage. For each encrypted record, that record is decrypted in memory to generate a decrypted record that comprises multiple attribute values. Then, based on the attribute values and a definition of multiple features of a machine-learned model, multiple feature values are generated and stored, along with a label, in a training instance, which is then stored in a second persistent storage. One or more machine learning techniques are used to train the machine-learned model based on training data that includes the training instances that are stored in the second persistent storage.
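The decrypt-in-memory pipeline in this abstract might look roughly like the Python sketch below. It is an illustration under stated assumptions: `toy_cipher` is an insecure XOR placeholder standing in for whatever real cipher the system uses, `FEATURE_DEFS` is a hypothetical feature definition, and plain Python containers stand in for the two persistent stores. The final model-training step is omitted.

```python
import json
from itertools import cycle


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """XOR placeholder for a real cipher (NOT secure); applying it twice restores the input."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Hypothetical feature definition: feature name -> function of the decrypted attribute values.
FEATURE_DEFS = {
    "title_length": lambda attrs: len(attrs["title"]),
    "is_senior": lambda attrs: int(attrs["years"] >= 5),
}


def build_training_instances(encrypted_store, key, labels):
    """Decrypt each record in memory only, derive feature values, and attach its label."""
    second_store = []  # stands in for the second persistent storage
    for record_id, blob in encrypted_store.items():
        attrs = json.loads(toy_cipher(blob, key))  # plaintext exists only in memory
        features = {name: fn(attrs) for name, fn in FEATURE_DEFS.items()}
        second_store.append({"features": features, "label": labels[record_id]})
    return second_store


key = b"demo-key"
first_store = {  # stands in for the first persistent storage (encrypted records)
    "r1": toy_cipher(json.dumps({"title": "Engineer", "years": 7}).encode(), key),
    "r2": toy_cipher(json.dumps({"title": "Intern", "years": 0}).encode(), key),
}
instances = build_training_instances(first_store, key, {"r1": 1, "r2": 0})
```

Only derived feature values and labels reach the second store in this sketch; the decrypted attribute values themselves are never written back.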
-
Publication number: US11792167B2
Publication date: 2023-10-17
Application number: US17219482
Filing date: 2021-03-31
Applicant: Microsoft Technology Licensing, LLC
Inventor: Haifeng Zhao, Yang Chen, Jiashuo Wang, Xiaojing Chen, Chencheng Wu, Souvik Ghosh, Ankit Gupta, Jing Wang, John Patrick Moore, Henry Heyburn Pistell, Mira Thambireddy, Haowen Cao, Keyi Yu
CPC classification number: H04L63/0428, G06N20/00
Abstract: Techniques for a flexible data security and machine learning system for merging third-party data are provided. In one technique, the system receives a data set from a third-party entity and receives selection data that indicates that the third-party entity selected a set of data security policies that includes an encryption option and a data mixing option from among multiple data mixing options. In response to receiving the selection data, the system stores data that associates the set of data security policies with the data set, encrypts the data set according to the encryption option, and persistently stores the encrypted data set. Later, the system decrypts the encrypted data set in volatile memory, generates, based on the data mixing option, training data based on the decrypted version of the data set, trains a machine-learned model based on the training data, and stores the machine-learned model in association with the data set.
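As a rough illustration of the policy-driven flow in this abstract, the Python sketch below associates a selected policy with a data set, encrypts it before storage, and later decrypts it in memory to build training data according to the mixing option. All names are hypothetical: the two `MixingOption` values, the `PolicyStore` class, the insecure XOR `toy_cipher` placeholder, and the mean-of-labels "model" are assumptions made for the example, not details taken from the patent.

```python
import json
from dataclasses import dataclass
from enum import Enum
from itertools import cycle


class MixingOption(Enum):
    ISOLATED = "isolated"  # hypothetical: train only on the third party's data
    MIXED = "mixed"        # hypothetical: combine with first-party data


@dataclass(frozen=True)
class SecurityPolicy:
    encrypt: bool
    mixing: MixingOption


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """XOR placeholder for a real cipher (NOT secure); applying it twice restores the input."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


class PolicyStore:
    """Dictionaries stand in for persistent storage of policies, data sets, and models."""

    def __init__(self, key: bytes):
        self.key = key
        self.policies, self.blobs, self.models = {}, {}, {}

    def ingest(self, dataset_id, rows, policy):
        """Associate the selected policy with the data set, then store it (encrypted if chosen)."""
        self.policies[dataset_id] = policy
        raw = json.dumps(rows).encode()
        self.blobs[dataset_id] = toy_cipher(raw, self.key) if policy.encrypt else raw

    def train(self, dataset_id, first_party_rows):
        """Decrypt in memory, build training data per the mixing option, and store a toy model."""
        policy = self.policies[dataset_id]
        blob = self.blobs[dataset_id]
        rows = json.loads(toy_cipher(blob, self.key) if policy.encrypt else blob)
        if policy.mixing is MixingOption.MIXED:
            rows = rows + first_party_rows
        model = {"mean_label": sum(r["label"] for r in rows) / len(rows)}  # toy "model"
        self.models[dataset_id] = model  # model stored in association with the data set
        return model


store = PolicyStore(key=b"demo-key")
third_party_rows = [{"label": 1}, {"label": 0}]
first_party_rows = [{"label": 1}, {"label": 1}]
store.ingest("ds_isolated", third_party_rows, SecurityPolicy(True, MixingOption.ISOLATED))
store.ingest("ds_mixed", third_party_rows, SecurityPolicy(True, MixingOption.MIXED))
isolated_model = store.train("ds_isolated", first_party_rows)
mixed_model = store.train("ds_mixed", first_party_rows)
```

The same third-party rows yield different models depending on the selected mixing option, which is the behavior the policy association is meant to control.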
-