Invention Publication
- Patent Title: TRAINING LANGUAGE MODELS AND PRESERVING PRIVACY
- Application No.: US18173199
- Application Date: 2023-02-23
- Publication No.: US20240135103A1
- Publication Date: 2024-04-25
- Inventor: Franck Dernoncourt, Tong Sun, Thi kim phung Lai, Rajiv Bhawanji Jain, Nikolaos Barmpalios, Jiuxiang Gu
- Applicant: Adobe Inc.
- Applicant Address: US CA San Jose
- Assignee: Adobe Inc.
- Current Assignee: Adobe Inc.
- Current Assignee Address: US CA San Jose
- Main IPC: G06F40/295
- IPC: G06F40/295; G06F40/274

Abstract:
In implementations of systems for training language models and preserving privacy, a computing device implements a privacy system to predict a next word after a last word in a sequence of words by processing input data using a machine learning model trained on training data to predict next words after last words in sequences of words. The training data describes a corpus of text associated with clients and including sensitive samples and non-sensitive samples. The machine learning model is trained by sampling a client of the clients and using a subset of the sensitive samples associated with the client and a subset of the non-sensitive samples associated with the client to update parameters of the machine learning model. The privacy system generates an indication of the next word after the last word in the sequence of words for display in a user interface.
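
The training loop described in the abstract (sample a client, then update the model with a subset of that client's sensitive samples and a subset of its non-sensitive samples) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the model architecture, the subset sizes, or any additional privacy mechanism, so a toy bigram count table stands in for the machine learning model, and the names `train`, `predict_next`, `k_sensitive`, and `k_non_sensitive` are hypothetical.

```python
import random
from collections import defaultdict


def train(corpus, rounds=50, k_sensitive=1, k_non_sensitive=1, seed=0):
    """Toy client-level training loop (illustrative stand-in, not the patented method).

    corpus maps client_id -> {"sensitive": [str, ...], "non_sensitive": [str, ...]}.
    The "parameters" here are bigram counts: counts[prev_word][next_word].
    """
    rng = random.Random(seed)
    counts = defaultdict(lambda: defaultdict(float))
    clients = sorted(corpus)
    for _ in range(rounds):
        # Step 1: sample one client from the set of clients.
        data = corpus[rng.choice(clients)]
        # Step 2: take a subset of that client's sensitive and non-sensitive samples.
        sens = rng.sample(data["sensitive"],
                          min(k_sensitive, len(data["sensitive"])))
        non_sens = rng.sample(data["non_sensitive"],
                              min(k_non_sensitive, len(data["non_sensitive"])))
        # Step 3: update the parameters using both subsets.
        for sentence in sens + non_sens:
            words = sentence.lower().split()
            for prev, nxt in zip(words, words[1:]):
                counts[prev][nxt] += 1.0
    return counts


def predict_next(counts, sequence):
    """Return the most frequent follower of the last word, or None if unseen."""
    words = sequence.lower().split()
    if not words:
        return None
    followers = counts.get(words[-1])
    if not followers:
        return None
    return max(followers, key=followers.get)


# Hypothetical two-client corpus, invented for this example.
corpus = {
    "client_a": {"sensitive": ["my pin is 1234"],
                 "non_sensitive": ["the cat sat down"]},
    "client_b": {"sensitive": ["my pin is 9876"],
                 "non_sensitive": ["the dog sat down"]},
}
model = train(corpus)
print(predict_next(model, "the bird sat"))  # prints "down"
```

In this toy model the sensitive sentences are memorized directly in the count table. The sketch shows only the client-level sampling structure of the training loop, not how privacy is preserved.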