REDUCING BIASES OF GENERATIVE LANGUAGE MODELS

    Publication No.: US20220392434A1

    Publication Date: 2022-12-08

    Application No.: US17342490

    Filing Date: 2021-06-08

    IPC Classes: G10L15/06 G06N20/00

    Abstract: The disclosure herein describes reducing training bias in outputs generated by a generative language model. A communication segment associated with a communication is obtained by at least one processor of a generative language model. An output value associated with the communication segment is generated by the generative language model. The output value is mapped to a set of training bias values associated with the generative language model, and, based on the mapping of the output value to a training bias value of that set, an alternative output value is generated. The alternative output value is used in a generated segment output for the communication segment. The accuracy of segment outputs generated by the generative language model is improved by reducing or eliminating its training biases.
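
    The abstract describes a map-and-substitute loop: generate an output value, check it against the model's known training bias values, and re-generate when a match is found. The Python sketch below illustrates that flow under the assumption that the bias values can be held in a simple lookup set; the callables generate and resample are hypothetical stand-ins for the model interface, not the claimed implementation.

        from typing import Callable, Set

        def generate_segment_output(
            generate: Callable[[str], str],            # produces an output value for a segment
            resample: Callable[[str, Set[str]], str],  # produces an alternative value, avoiding bias values
            bias_values: Set[str],                     # training bias values known for the model
            segment: str,                              # communication segment for a communication
        ) -> str:
            output_value = generate(segment)
            # Map the output value to the set of training bias values; on a match,
            # substitute an alternative value into the generated segment output.
            if output_value in bias_values:
                output_value = resample(segment, bias_values)
            return output_value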

    Interacting with a Language Model using External Knowledge and Feedback

    Publication No.: US20240362418A1

    Publication Date: 2024-10-31

    Application No.: US18140658

    Filing Date: 2023-04-28

    IPC Classes: G06F40/40 G06F16/332

    CPC Classes: G06F40/40 G06F16/3325

    Abstract: A technique supplements a language model with knowledge information retrieved from external sources. The technique operates by: receiving a query; receiving knowledge information based on the query; generating original model-input information that includes the query and the knowledge information; and presenting the original model-input information to the language model. The technique further includes: receiving an original response from the language model; generating a usefulness measure that identifies the usefulness of the original response; and determining whether the usefulness measure satisfies a prescribed test. Upon determining that the usefulness measure does not satisfy the test, the technique includes: generating revised model-input information that includes feedback information; presenting the revised model-input information to the language model; and receiving a revised response from the language model. According to some implementations, the technique eliminates or reduces artificial hallucination exhibited by the language model.
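
    The abstract outlines a retrieve-generate-verify loop: attach retrieved knowledge to the query, score the model's response for usefulness, and, when the score fails a prescribed test, re-prompt with feedback. The sketch below illustrates one such loop; the prompt layout, the 0.5 threshold, and the fixed round limit are assumptions for illustration, not the patented interface.

        from typing import Callable

        def answer_with_knowledge(
            query: str,
            retrieve: Callable[[str], str],           # fetches knowledge information from external sources
            llm: Callable[[str], str],                # the language model
            usefulness: Callable[[str, str], float],  # scores a response against the query
            threshold: float = 0.5,
            max_rounds: int = 3,
        ) -> str:
            knowledge = retrieve(query)
            model_input = f"Knowledge: {knowledge}\nQuery: {query}"  # original model-input information
            response = llm(model_input)
            for _ in range(max_rounds):
                if usefulness(query, response) >= threshold:         # prescribed test
                    break
                # Revised model-input information carries feedback about the prior response.
                feedback = "The previous answer was not supported by the knowledge; please revise it."
                model_input = (
                    f"Knowledge: {knowledge}\nQuery: {query}\n"
                    f"Previous answer: {response}\nFeedback: {feedback}"
                )
                response = llm(model_input)
            return response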

    GENERATION OF DATA MODELS FOR PREDICTING DATA

    Publication No.: US20240046037A1

    Publication Date: 2024-02-08

    Application No.: US18268699

    Filing Date: 2020-12-25

    IPC Classes: G06F40/284 G06F40/40

    CPC Classes: G06F40/284 G06F40/40

    Abstract: Systems and methods are provided for training a data model based on training data. The training includes pre-training and fine-tuning the data model based on a combination of an autoregressive (AR) model and a non-autoregressive (NAR) model. Training data may be received and encoded into streams of tokens. During decoding, a pre-trainer generates a continuum of data structures for the combined AR and NAR model, including a main stream and a series of predicting streams. Masked tokens in the predicting streams reference, or attend to, one or more preceding tokens in the main stream or the preceding predicting streams. A fine-tuner selects streams to generate a trained model according to a target data model. The target data model is determined by balancing an accuracy constraint against an efficiency constraint for predicting tokens. The decoder acts as a bridge between the AR and NAR models in generating a trained data model.
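
    One way to picture the main stream and predicting streams is as a block attention mask in which each predicting stream attends only to earlier main-stream positions, becoming more non-autoregressive the further ahead it must predict. The NumPy sketch below builds such a mask; the exact stream layout and visibility rule are assumptions for illustration and are not specified by the abstract.

        import numpy as np

        def build_stream_attention_mask(seq_len: int, num_predicting_streams: int) -> np.ndarray:
            """Visibility mask over [main stream | predicting stream 1 | ... | stream K].

            Main-stream position t attends to main-stream positions <= t (AR-style).
            A masked token at position t in predicting stream k attends only to
            main-stream positions < t - k + 1, so larger k behaves more NAR-like.
            """
            total = seq_len * (1 + num_predicting_streams)
            mask = np.zeros((total, total), dtype=bool)
            # Main stream: causal self-attention.
            for t in range(seq_len):
                mask[t, : t + 1] = True
            # Predicting streams: attend to preceding main-stream tokens only.
            for k in range(1, num_predicting_streams + 1):
                for t in range(seq_len):
                    row = k * seq_len + t
                    visible = max(t - k + 1, 0)
                    mask[row, :visible] = True
            return mask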

    LANGUAGE-MODEL PRETRAINING WITH GRADIENT-DISENTANGLED EMBEDDING SHARING

    Publication No.: US20230153532A1

    Publication Date: 2023-05-18

    Application No.: US17664031

    Filing Date: 2022-05-18

    Abstract: A method for training a language model comprises (a) receiving vectorized training data as input to a multitask pretraining problem; (b) generating modified vectorized training data based on the vectorized training data, according to an upstream data embedding; (c) emitting pretraining output based on the modified vectorized training data, according to a downstream data embedding equivalent to the upstream data embedding; and (d) adjusting the upstream data embedding and the downstream data embedding by computing, based on the pretraining output, a gradient of the upstream data embedding disentangled from a gradient of the downstream data embedding, thereby advancing the multitask pretraining problem toward a pretrained state.
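
    One common reading of step (d) is a stop-gradient on a shared embedding table plus a downstream-only residual, so the downstream task can adapt its embedding without pushing gradients back into the upstream one. The PyTorch sketch below is a minimal illustration of that reading; the class name and the residual ("delta") formulation are assumptions, not the claimed method.

        import torch
        import torch.nn as nn

        class GradientDisentangledEmbedding(nn.Module):
            def __init__(self, vocab_size: int, dim: int):
                super().__init__()
                self.upstream = nn.Embedding(vocab_size, dim)  # upstream data embedding
                self.delta = nn.Embedding(vocab_size, dim)     # downstream-only correction
                nn.init.zeros_(self.delta.weight)              # start equivalent to the upstream embedding

            def upstream_embed(self, ids: torch.Tensor) -> torch.Tensor:
                return self.upstream(ids)

            def downstream_embed(self, ids: torch.Tensor) -> torch.Tensor:
                # detach() keeps the downstream gradient out of the shared upstream
                # table, so the two gradients stay disentangled.
                return self.upstream(ids).detach() + self.delta(ids)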

    ADVERSARIAL TRAINING OF MACHINE LEARNING MODELS

    Publication No.: US20210142181A1

    Publication Date: 2021-05-13

    Application No.: US16775635

    Filing Date: 2020-01-29

    IPC Classes: G06N3/08 G06N3/04

    Abstract: This document relates to the training of machine learning models such as neural networks. One example method involves providing a machine learning model having one or more layers and associated parameters and performing a pretraining stage on the parameters of the machine learning model to obtain pretrained parameters. The example method also involves performing a tuning stage on the machine learning model by using labeled training examples to tune the pretrained parameters. The tuning stage can include performing noise adjustment of the labeled training examples to obtain noise-adjusted training examples. The tuning stage can also include adjusting the pretrained parameters based at least on the labeled training examples and the noise-adjusted training examples to obtain adapted parameters. The example method can also include outputting a tuned machine learning model having the adapted parameters.
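
    A rough PyTorch sketch of the tuning stage described above: each labeled training example is noise-adjusted in embedding space, and the pretrained parameters are adjusted using both the original and the noise-adjusted view. The Gaussian perturbation and the KL consistency term are illustrative assumptions, not the claimed procedure.

        import torch
        import torch.nn.functional as F

        def tuning_step(model, embed, inputs, labels, optimizer, epsilon=1e-3):
            embeddings = embed(inputs)              # embed the labeled training examples
            logits = model(embeddings)
            loss = F.cross_entropy(logits, labels)  # supervised loss on the labeled examples

            # Noise adjustment: perturb the examples in embedding space.
            noise = epsilon * torch.randn_like(embeddings)
            noisy_logits = model(embeddings + noise)
            # Keep predictions consistent on the noise-adjusted examples.
            adv_loss = F.kl_div(
                F.log_softmax(noisy_logits, dim=-1),
                F.softmax(logits.detach(), dim=-1),
                reduction="batchmean",
            )

            optimizer.zero_grad()
            (loss + adv_loss).backward()
            optimizer.step()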