TRUST-REGION AWARE NEURAL NETWORK ARCHITECTURE SEARCH FOR KNOWLEDGE DISTILLATION
Abstract:
A processor-implemented method of searching for a neural network architecture includes defining a search space of student neural network architectures for knowledge distillation. The search space includes multiple convolutional operators and multiple transformer operators. A trust-region Bayesian optimization is performed to select a student neural network architecture from the search space based on a pre-defined teacher model.
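The search described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation, and everything concrete in it is an assumption made for illustration: the operator names, the six-layer student depth, the Hamming-ball trust region, the exponential Hamming kernel, the upper-confidence-bound acquisition rule, and the radius update. The function `toy_distillation_score` is a hypothetical placeholder. A real system would instead train each candidate student against the pre-defined teacher model and return a validation metric.

```python
import math
import random

# Hypothetical operator names; the abstract only states that the search
# space mixes convolutional operators and transformer operators.
CONV_OPS = ["conv3x3", "conv5x5", "dwconv3x3"]
TRANSFORMER_OPS = ["attn_4head", "attn_8head"]
OPS = CONV_OPS + TRANSFORMER_OPS
NUM_LAYERS = 6  # assumed student depth

def hamming(a, b):
    """Number of layers at which two architectures differ."""
    return sum(x != y for x, y in zip(a, b))

def kernel(a, b, length=3.0):
    """Exponential Hamming kernel over operator sequences (assumed choice)."""
    return math.exp(-hamming(a, b) / length)

def cholesky(A):
    """Lower-triangular L with L L^T = A, for a positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def chol_solve(L, b):
    """Solve (L L^T) x = b by forward, then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit_gp(X, y, noise=1e-3):
    """Fit a Gaussian-process surrogate to the evaluated architectures."""
    K = [[kernel(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    L = cholesky(K)
    mean = sum(y) / len(y)
    return L, chol_solve(L, [v - mean for v in y]), mean

def ucb(model, X, cand, beta=2.0):
    """Upper confidence bound: posterior mean plus beta * posterior std."""
    L, alpha, mean = model
    ks = [kernel(cand, a) for a in X]
    mu = mean + sum(k * a for k, a in zip(ks, alpha))
    var = 1.0 - sum(k * w for k, w in zip(ks, chol_solve(L, ks)))
    return mu + beta * math.sqrt(max(var, 0.0))

def sample_in_trust_region(center, radius, rng):
    """Resample at most `radius` layers, so Hamming distance <= radius."""
    arch = list(center)
    for i in rng.sample(range(NUM_LAYERS), rng.randint(1, radius)):
        arch[i] = rng.choice(OPS)
    return tuple(arch)

def toy_distillation_score(arch):
    """Placeholder objective: favors conv early and attention late."""
    hits = sum((op in CONV_OPS) == (i < NUM_LAYERS // 2)
               for i, op in enumerate(arch))
    return hits / NUM_LAYERS

def trust_region_bo(evaluate, n_init=5, n_iter=20, seed=0):
    rng = random.Random(seed)
    X = [tuple(rng.choice(OPS) for _ in range(NUM_LAYERS))
         for _ in range(n_init)]
    y = [evaluate(a) for a in X]
    radius = NUM_LAYERS // 2
    for _ in range(n_iter):
        best = max(y)
        center = X[y.index(best)]  # trust region is centered on the incumbent
        model = fit_gp(X, y)
        cands = sorted({sample_in_trust_region(center, radius, rng)
                        for _ in range(40)} - set(X))
        if not cands:  # region exhausted: widen it and retry
            radius = min(NUM_LAYERS, radius + 1)
            continue
        nxt = max(cands, key=lambda a: ucb(model, X, a))
        score = evaluate(nxt)
        X.append(nxt)
        y.append(score)
        # Expand the region after an improvement, shrink it otherwise.
        radius = min(NUM_LAYERS, radius + 1) if score > best else max(1, radius - 1)
    best = max(y)
    return X[y.index(best)], best
```

Calling `trust_region_bo(toy_distillation_score)` returns the best architecture found, as a tuple of operator names, together with its score. Restricting candidates to a neighborhood of the incumbent keeps the surrogate's predictions local, which is the usual motivation for trust-region variants of Bayesian optimization in large discrete spaces.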