Knowledge Distillation Via Learning to Predict Principal Components Coefficients

    Publication number: US20250005453A1

    Publication date: 2025-01-02

    Application number: US18710814

    Application date: 2022-12-12

    Applicant: Google LLC

    Abstract: Provided is an approach to knowledge distillation based on exporting Principal Component approximations (e.g., Bregman representations) of one or more layer-wise representations of the teacher model. In particular, the present disclosure extends the original Bregman PCA formulation by incorporating a mean vector and orthonormalizing the principal directions with respect to the geometry of the local convex function around the mean. This extended formulation allows viewing the learned representation as a dense layer, thus casting the problem as having the student network learn the linear coefficients of the compressed examples, which serve as the input to this layer. Example empirical data indicates that example implementations of the approach improve performance when compared to typical teacher-student training using soft labels.
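A minimal sketch of the core idea may help: fit a rank-k PCA approximation (mean vector plus orthonormal principal directions) to teacher-layer representations, and train the student to predict the linear coefficients of examples in that basis. This sketch uses ordinary Euclidean PCA via SVD rather than the patent's Bregman PCA formulation (squared loss is the special case of a Bregman divergence); all function and variable names here are illustrative, not from the patent.

```python
import numpy as np

def pca_basis(X, k):
    """Fit a rank-k approximation (mean + orthonormal directions)
    to teacher-layer representations X of shape (n, d).
    Euclidean stand-in for the patent's Bregman PCA."""
    mu = X.mean(axis=0)
    # SVD of the centered data yields orthonormal principal directions.
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]                      # shapes: (d,), (k, d)

def coefficients(X, mu, V):
    """Project examples onto the principal directions: the k linear
    coefficients the student network is trained to predict."""
    return (X - mu) @ V.T                  # shape: (n, k)

def reconstruct(C, mu, V):
    """Dense-layer view: coefficients -> approximate representation."""
    return C @ V + mu

def distillation_loss(student_coeffs, teacher_coeffs):
    """MSE between student-predicted and teacher PCA coefficients."""
    return float(np.mean((student_coeffs - teacher_coeffs) ** 2))

rng = np.random.default_rng(0)
teacher_reps = rng.normal(size=(256, 32))  # stand-in teacher activations
mu, V = pca_basis(teacher_reps, k=8)
targets = coefficients(teacher_reps, mu, V)          # distillation targets
loss = distillation_loss(targets + 0.1, targets)     # toy student output
```

In a full training loop, `targets` would be precomputed (or computed on the fly) from the frozen teacher, and `distillation_loss` would be minimized with respect to the student's parameters, replacing or supplementing the usual soft-label objective.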
