TRAINING LANGUAGE MODELS AND PRESERVING PRIVACY
Abstract:
In implementations of systems for training language models and preserving privacy, a computing device implements a privacy system to predict a next word after a last word in a sequence of words by processing input data using a machine learning model trained on training data to predict next words after last words in sequences of words. The training data describes a corpus of text associated with clients and including sensitive samples and non-sensitive samples. The machine learning model is trained by sampling a client of the clients and using a subset of the sensitive samples associated with the client and a subset of the non-sensitive samples associated with the client to update parameters of the machine learning model. The privacy system generates an indication of the next word after the last word in the sequence of words for display in a user interface.
Information query
Patent Agency Ranking
0/0