Privacy-preserving training corpus selection
Abstract:
The present disclosure relates to training a speech recognition system. A system that includes an automated speech recognizer and receives data from a client device. The system determines that at least a portion of the received data is likely sensitive data. Before the at least a portion of the received data is deleted, the system provides the at least a portion of the received data to a model training engine that trains recognition models for the automated speech recognizer. After the at least a portion of the received data is provided, the system deletes the at least a portion of the received data.
Public/Granted literature
Information query
Patent Agency Ranking
0/0