TRAINING DISTILLED MACHINE LEARNING MODELS USING A PRE-TRAINED FEATURE EXTRACTOR
Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a student machine learning model using a teacher machine learning model that has a pre-trained feature extractor. In one aspect, a method includes obtaining data specifying the teacher machine learning model that is configured to perform a machine learning task; obtaining first training data; training the teacher machine learning model on the first training data to obtain a trained teacher machine learning model; generating second, automatically labeled training data by using the trained teacher machine learning model to process unlabeled training data; and training a student machine learning model to perform the machine learning task using at least the second, automatically labeled training data, wherein the student machine learning model does not include the pre-trained feature extractor and instead includes a different feature extractor having fewer parameters than the pre-trained feature extractor.
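The steps recited above can be illustrated with a minimal sketch on toy data, written in plain Python. It is not the claimed implementation. Every name in it (`Model`, `fit`, `sample`, `true_label`, `first_x`, `second_y`, and so on) is hypothetical. The sketch also assumes several things the abstract does not state: the pre-trained feature extractor is stood in for by a frozen, randomly initialized tanh layer; the task is binary classification of 2-D points; the automatic labels are hard 0/1 predictions; and the student is trained on the automatically labeled data only.

```python
import math
import random

random.seed(0)  # fixed seed so the toy run is repeatable


class Model:
    """One tanh layer (the feature extractor) plus a logistic-regression head."""

    def __init__(self, in_dim, hidden, train_extractor):
        self.W = [[random.gauss(0, 1) for _ in range(in_dim)] for _ in range(hidden)]
        self.v = [0.0] * hidden          # head weights
        self.b = 0.0                     # head bias
        self.train_extractor = train_extractor

    def extractor_params(self):
        return sum(len(row) for row in self.W)

    def features(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in self.W]

    def predict_proba(self, x, h=None):
        h = self.features(x) if h is None else h
        z = sum(vj * hj for vj, hj in zip(self.v, h)) + self.b
        z = max(-30.0, min(30.0, z))     # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def predict(self, x):
        return 1 if self.predict_proba(x) >= 0.5 else 0

    def fit(self, xs, ys, epochs=30, lr=0.1):
        """Plain SGD on binary cross-entropy."""
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                h = self.features(x)
                g = self.predict_proba(x, h) - y   # dLoss/dz
                if self.train_extractor:
                    # Backpropagate into the extractor using the current head weights.
                    for j, row in enumerate(self.W):
                        gj = g * self.v[j] * (1.0 - h[j] ** 2)
                        for i, xi in enumerate(x):
                            row[i] -= lr * gj * xi
                for j, hj in enumerate(h):
                    self.v[j] -= lr * g * hj
                self.b -= lr * g


def sample(n):
    return [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(n)]


def true_label(x):
    # Hypothetical ground truth for the toy task.
    return 1 if x[0] + x[1] > 0 else 0


# First training data (labeled), plus unlabeled data and a held-out set.
first_x = sample(100)
first_y = [true_label(x) for x in first_x]
unlabeled_x = sample(400)
test_x = sample(200)

# Teacher: large extractor kept frozen (stand-in for "pre-trained"); only the head is trained.
teacher = Model(in_dim=2, hidden=32, train_extractor=False)
teacher.fit(first_x, first_y)

# Second, automatically labeled training data produced by the trained teacher.
second_y = [teacher.predict(x) for x in unlabeled_x]

# Student: a different, smaller extractor, trained end to end on the teacher's labels.
student = Model(in_dim=2, hidden=4, train_extractor=True)
student.fit(unlabeled_x, second_y)

agreement = sum(student.predict(x) == teacher.predict(x) for x in test_x) / len(test_x)
print("teacher extractor parameters:", teacher.extractor_params())
print("student extractor parameters:", student.extractor_params())
print("student/teacher agreement on held-out points:", agreement)
```

In this sketch the student never sees the teacher's extractor weights. It only sees the teacher's outputs on the unlabeled points, which is what lets it use a different extractor with fewer parameters.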