Publication number: US20210303725A1
Publication date: 2021-09-30
Application number: US17216891
Filing date: 2021-03-30
Applicant: Google LLC
Inventors: Tzvika Hartman, Itay Laish, Jiyang Chen, Kaveh Ketabchi, Gavin Bee, Rohit Talreja, Yossi Matias, Andrew Max
Abstract: Apparatus and methods related to de-identifying data are provided. An example method includes receiving, by a computing device, input data comprising text. The method further includes applying a neural network to a tokenized representation of the input text, to generate an embedding based on contextual information associated with an entity. The method also includes predicting, by the neural network and based on the embedding, whether the input data comprises protected data in the text, wherein the neural network has been trained on a training dataset that has been partially customized based on the entity. The method further includes de-identifying the protected data in the text upon a determination that the input data comprises protected data in the text.
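The flow described in the abstract (receive text, tokenize it, predict which tokens are protected, de-identify only when something is flagged) can be illustrated with a minimal sketch. This is not the claimed method: the trained neural network and its contextual embedding are replaced here by a hypothetical stand-in, `toy_predictor`, which merely pattern-matches dates and phone-like numbers. The names `tokenize`, `toy_predictor`, `de_identify`, and the `[REDACTED]` mask are all assumptions made for illustration.

```python
import re
from typing import Callable, List, Tuple

# A token is (text, start offset, end offset) in the original string.
Token = Tuple[str, int, int]


def tokenize(text: str) -> List[Token]:
    """Split on whitespace, keeping character offsets for later masking."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


# ASSUMPTION: a regex stand-in for the neural network in the abstract.
# A real system would score each token from a contextual embedding.
_PROTECTED = re.compile(r"\d{4}-\d{2}-\d{2}|\d{3}-\d{3}-\d{4}")


def toy_predictor(tokens: List[Token]) -> List[bool]:
    """Return one flag per token: True if the token looks protected."""
    return [_PROTECTED.fullmatch(tok) is not None for tok, _, _ in tokens]


def de_identify(
    text: str,
    predict: Callable[[List[Token]], List[bool]] = toy_predictor,
    mask: str = "[REDACTED]",
) -> str:
    """Mask flagged tokens; return the text unchanged if none are flagged."""
    tokens = tokenize(text)
    flags = predict(tokens)
    if not any(flags):
        return text  # no protected data predicted, nothing to de-identify

    pieces: List[str] = []
    cursor = 0
    for (_, start, end), flagged in zip(tokens, flags):
        if flagged:
            pieces.append(text[cursor:start])  # keep text before the token
            pieces.append(mask)                # replace the token itself
            cursor = end
    pieces.append(text[cursor:])               # keep the remaining tail
    return "".join(pieces)
```

Passing a different `predict` callable is where a trained model would plug in under this sketch; the masking step does not depend on how the per-token flags were produced.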