KNOWLEDGE DISTILLATION USING DEEP CLUSTERING
Abstract:
Methods and systems for training a neural network include clustering a full set of training data samples into specialized training clusters. Specialized teacher neural networks are trained, each using a respective one of the specialized training clusters. Soft labels are generated for the full set of training data samples using the specialized teacher neural networks. A student model is trained using the full set of training data samples, the specialized training clusters, and the soft labels.
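The four steps of the abstract (cluster, train teachers, generate soft labels, train student) can be illustrated with a minimal NumPy sketch. It is not the claimed implementation. Several choices are assumptions made only for illustration: plain k-means stands in for deep clustering, softmax regression stands in for the teacher and student neural networks, each sample takes its soft label from the teacher of its own cluster, and the student target mixes hard and soft labels with a hypothetical weight `alpha`. All function names are likewise hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means; an assumed stand-in for the deep clustering step."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):  # leave a center in place if its cluster is empty
                centers[j] = X[assign == j].mean(axis=0)
    return assign

def train_linear(X, targets, n_classes, lr=0.1, epochs=300):
    """Softmax regression fitted by gradient descent to hard or soft targets."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    W = np.zeros((Xb.shape[1], n_classes))
    for _ in range(epochs):
        W -= lr * Xb.T @ (softmax(Xb @ W) - targets) / len(X)
    return W

def predict_proba(W, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return softmax(Xb @ W)

def distill(X, y, n_classes, k=2, alpha=0.5):
    one_hot = np.eye(n_classes)[y]
    # Step 1: cluster the full training set into specialized clusters.
    clusters = kmeans(X, k)
    # Step 2: train one specialized teacher per cluster (clusters assumed non-empty).
    teachers = [train_linear(X[clusters == j], one_hot[clusters == j], n_classes)
                for j in range(k)]
    # Step 3: every teacher labels the full set; shape (k, n_samples, n_classes).
    all_soft = np.stack([predict_proba(W, X) for W in teachers])
    # Assumption: each sample keeps the soft label from its own cluster's teacher.
    soft = all_soft[clusters, np.arange(len(X))]
    # Step 4: train the student on the full set against mixed hard/soft targets.
    student = train_linear(X, alpha * one_hot + (1 - alpha) * soft, n_classes)
    return student, clusters, soft

def make_toy_data(n_per_blob=50, seed=0):
    """Four 2-D Gaussian blobs; the class depends on the second coordinate."""
    rng = np.random.default_rng(seed)
    centers = np.array([[-2.0, -1.0], [-2.0, 1.0], [2.0, -1.0], [2.0, 1.0]])
    X = np.vstack([c + 0.3 * rng.standard_normal((n_per_blob, 2)) for c in centers])
    y = np.repeat([0, 1, 0, 1], n_per_blob)
    return X, y
```

Calling `distill(*make_toy_data(), n_classes=2)` returns the student weights, the cluster assignment, and the per-sample soft labels. Other ways of combining the clusters and soft labels in the student's training are equally consistent with the abstract, which does not fix these details.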