-
Publication No.: US20200097605A1
Publication Date: 2020-03-26
Application No.: US16141853
Filing Date: 2018-09-25
Applicant: Microsoft Technology Licensing, LLC
Inventor: Jingyuan Liu, Xiaoqiang Luo, Tzu Ming Kuo, Marcello Oliva, Yunpeng Xu
Abstract: A system and method are provided for automatic identification, extraction, and validation of data pertaining to receiving entity events (REE). Feature (or attribute) values associated with web content are identified. The web content may contain news and features on current/past affairs. The identified feature values are considered by a rule-based or a machine-learned model and, based upon output of the model, a determination as to whether the set of data comprises a REE is made. If the determination is positive, then multiple data items are extracted from the set of data and, optionally, from other data from the source.
-
Publication No.: US20250077792A1
Publication Date: 2025-03-06
Application No.: US18459290
Filing Date: 2023-08-31
Applicant: Microsoft Technology Licensing, LLC
Inventor: Xilun Chen, Tzu Ming Kuo, Xiaoqiang Luo, Ilya Dan Melamed, Ji Yan, Peide Zhong
Abstract: Embodiments of the disclosed technologies are capable of a training pipeline to fine-tune a machine learning model given a limited set of domain-specific data. The embodiments describe using a first machine learning model to generate a pseudo label associated with a domain-specific training document. The pseudo label comprises a machine-generated text of a content type extracted from the domain-specific training document. The embodiments further describe fine-tuning a second machine learning model using the pseudo label, the domain-specific training document, a first low-rank weight matrix, and a second low-rank weight matrix. The fine-tuned second machine learning model generates text of the content type from a domain-specific document.
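One common way to fine-tune with two low-rank weight matrices is low-rank adaptation (LoRA). Assuming that is what the abstract's first and second low-rank weight matrices refer to (the abstract does not say so explicitly), the adapted layer can be written as below. The symbols W_0, A, B, x, h, d, k, and r are notation introduced here for illustration, not taken from the patent.

```latex
h = W_0 x + B A x, \qquad
W_0 \in \mathbb{R}^{d \times k},\quad
B \in \mathbb{R}^{d \times r},\quad
A \in \mathbb{R}^{r \times k},\quad
r \ll \min(d, k)
```

Under this reading, the pretrained weight W_0 stays frozen and only A and B are updated on the pseudo-labeled documents, so each adapted layer trains r(d + k) parameters instead of dk.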
-
Publication No.: US12197539B2
Publication Date: 2025-01-14
Application No.: US17169161
Filing Date: 2021-02-05
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yunpeng Xu, Tianhao Lu, Xiaoqiang Luo, Jiashuo Wang, Chencheng Wu
IPC: G06F18/21, G06F18/214, G06N20/00
Abstract: Techniques for securely storing and processing data for training data generation are provided. In one technique, multiple encrypted records are retrieved from a first persistent storage. For each encrypted record, that record is decrypted in memory to generate a decrypted record that comprises multiple attribute values. Then, based on the attribute values and a definition of multiple features of a machine-learned model, multiple feature values are generated and stored, along with a label, in a training instance, which is then stored in a second persistent storage. One or more machine learning techniques are used to train the machine-learned model based on training data that includes the training instances that are stored in the second persistent storage.
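The flow described in this abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the XOR "cipher" is a toy stand-in for real encryption, the two lists stand in for the first and second persistent storage, and the record fields, feature definitions, and function names are all hypothetical.

```python
import json

KEY = b"demo-key"  # hypothetical key; a real system would use a proper cipher and key management


def xor_bytes(data: bytes, key: bytes = KEY) -> bytes:
    """Toy symmetric 'cipher' (XOR); applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Hypothetical feature definition: feature name -> function of the decrypted attribute values.
FEATURE_DEFS = {
    "title_length": lambda rec: len(rec["title"]),
    "years_experience": lambda rec: rec["years_experience"],
}


def build_training_instances(encrypted_store, label_key="label"):
    """Decrypt each record in memory, derive feature values, and pair them with the label."""
    instances = []
    for blob in encrypted_store:
        # The plaintext record exists only in memory and is never written back.
        record = json.loads(xor_bytes(blob).decode("utf-8"))
        features = {name: fn(record) for name, fn in FEATURE_DEFS.items()}
        instances.append({"features": features, "label": record[label_key]})
    return instances


# Stand-in for the first persistent storage: it holds encrypted records only.
first_store = [
    xor_bytes(json.dumps(r).encode("utf-8"))
    for r in (
        {"title": "Engineer", "years_experience": 5, "label": 1},
        {"title": "Intern", "years_experience": 0, "label": 0},
    )
]

# Stand-in for the second persistent storage: training instances (feature values plus label).
second_store = build_training_instances(first_store)
```

The training step itself is omitted; any learner that accepts feature/label pairs could consume `second_store`.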
-
Publication No.: US20190197176A1
Publication Date: 2019-06-27
Application No.: US15851142
Filing Date: 2017-12-21
Applicant: Microsoft Technology Licensing, LLC
Inventor: Xiaoqiang Luo, Yunpeng Xu, Marcello Oliva
CPC classification number: G06F16/285, G06N5/022, G06N20/00, H04L63/102
Abstract: Techniques for identifying relationships between entities using machine learning are disclosed herein. In some embodiments, a computer-implemented method comprises: ingesting natural language text comprising a first target entity and a second target entity; identifying a relationship between the first target entity and the second target entity using at least one model; and performing a function using the identified relationship between the first target entity and the second target entity based on the identifying of the relationship, the function comprising a database modification operation or a relationship verification operation, the database modification operation comprising modifying at least one of a graph, a corresponding profile of the first target entity, and a corresponding profile of the second target entity stored in the database of the online service to indicate the identified relationship, and the relationship verification operation comprising causing the identified relationship to be displayed on a computing device.
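The ingest, identify, and modify steps in this abstract can be illustrated with a small Python sketch. It rests on loud assumptions: a handful of regular expressions stands in for the "at least one model", an in-memory dictionary stands in for the graph stored in the database, and the entity names, relationship labels, and function names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for the model: regex patterns mapped to relationship labels.
PATTERNS = {
    "acquired": re.compile(r"\b(acquired|bought)\b", re.IGNORECASE),
    "partnered_with": re.compile(r"\b(partnered with|teamed up with)\b", re.IGNORECASE),
}


def identify_relationship(text, first, second):
    """Return a relationship label if a pattern occurs between the two entity mentions, else None."""
    i, j = text.find(first), text.find(second)
    if i < 0 or j < 0:
        return None  # one of the target entities is not mentioned
    start, end = sorted((i, j))
    between = text[start:end]
    for label, pattern in PATTERNS.items():
        if pattern.search(between):
            return label
    return None


def update_graph(graph, first, second, label):
    """Database modification step: record the relationship as an edge from first to second."""
    # Simplification: the edge direction follows argument order, not word order in the text.
    graph[first].add((label, second))


graph = defaultdict(set)
text = "Contoso acquired Fabrikam last year."
label = identify_relationship(text, "Contoso", "Fabrikam")
if label is not None:
    update_graph(graph, "Contoso", "Fabrikam", label)
```

The alternative relationship verification operation would display `label` to a user for confirmation instead of writing it to the graph.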