Data anonymization for data labeling and development purposes
Abstract:
A method and system are disclosed for anonymizing data for labeling and development purposes. A data storage backend has a database of non-anonymous data received from a data source. An anonymization engine of the data storage backend generates anonymized data by removing personally identifiable information from the non-anonymous data. The anonymized data are made available to human labelers, who manually provide labels based on the anonymized data using a data labeling tool. The labels are then stored in association with the corresponding non-anonymous data, and the labeled non-anonymous data can be used to train one or more machine learning models. In this way, non-anonymous data containing personally identifiable information can be manually labeled for development purposes without exposing the personally identifiable information to the human labelers.
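The data flow described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the disclosed implementation: the class name Backend, its methods, and the two regular expressions are hypothetical, and the patterns cover only e-mail addresses and phone-like digit runs, whereas a real anonymization engine would need to handle names, addresses, faces, voices, and other identifiers.

```python
import re
from dataclasses import dataclass, field

# Hypothetical PII patterns (assumption): e-mail addresses and phone-like numbers only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def anonymize(text: str) -> str:
    """Replace the PII this sketch knows about with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


@dataclass
class Backend:
    """Toy stand-in for the data storage backend (hypothetical name)."""
    records: dict = field(default_factory=dict)  # record id -> non-anonymous data
    labels: dict = field(default_factory=dict)   # record id -> label from a human labeler

    def ingest(self, record_id: int, raw: str) -> None:
        # Non-anonymous data arrives from the data source.
        self.records[record_id] = raw

    def view_for_labeler(self, record_id: int) -> str:
        # Labelers only ever receive the anonymized form.
        return anonymize(self.records[record_id])

    def store_label(self, record_id: int, label: str) -> None:
        # The label is keyed by record id, so it is associated with
        # the original non-anonymous record rather than the anonymized view.
        if record_id not in self.records:
            raise KeyError(record_id)
        self.labels[record_id] = label

    def training_pairs(self) -> list:
        # Labeled non-anonymous data, ready for model training.
        return [(self.records[i], label) for i, label in self.labels.items()]


backend = Backend()
backend.ingest(1, "Call me at +1 555-123-4567 or me@example.com about the refund")
shown = backend.view_for_labeler(1)   # what the labeling tool displays
backend.store_label(1, "refund_request")
```

The point of keying labels by record identifier is that the labeler's judgment, made on the anonymized view, attaches to the original record, so training can use the full non-anonymous data while the labeler never sees it.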