Publication Number: US20250005453A1
Publication Date: 2025-01-02
Application Number: US18710814
Filing Date: 2022-12-12
Applicant: Google LLC
Inventor: Ehsan Amid , Christopher James Fifty , Manfred Klaus Warmuth , Rohan Anil
IPC: G06N20/00
Abstract: Provided is an approach for knowledge distillation based on exporting Principal Component approximations (e.g., Bregman representations) of one or more layer-wise representations of the teacher model. In particular, the present disclosure extends the original Bregman PCA formulation by incorporating a mean vector and orthonormalizing the principal directions with respect to the geometry of the local convex function around the mean. This extended formulation allows the learned representation to be viewed as a dense layer, casting the problem as the student network learning the linear coefficients of the compressed examples that serve as the input to this layer. Example empirical data indicates that example implementations of the approach improve performance compared to typical teacher-student training using soft labels.
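
To illustrate the dense-layer view described in the abstract, the following is a minimal sketch using ordinary squared-loss PCA, i.e., the Euclidean special case of the Bregman formulation rather than the general geometry claimed in the disclosure. The function names (fit_pca_dense_layer, project_to_coefficients, distillation_loss) and the toy data are hypothetical illustrations, not part of the patent: a mean vector and orthonormal principal directions are fitted to teacher-layer features, each example is compressed to linear coefficients, and the student is scored on how well it matches those coefficients.

import numpy as np

def fit_pca_dense_layer(teacher_feats, k):
    """Fit a rank-k PCA approximation of teacher layer features.

    Returns (mean, directions) so that a feature x is approximated by
    mean + coeffs @ directions, i.e., a dense layer whose weights are the
    principal directions and whose input is the coefficient vector.
    """
    mean = teacher_feats.mean(axis=0)            # mean vector of the extended formulation
    centered = teacher_feats - mean
    # Orthonormal principal directions (Euclidean special case of the
    # Bregman geometry described in the abstract).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    directions = vt[:k]                          # shape (k, d), rows are orthonormal
    return mean, directions

def project_to_coefficients(feats, mean, directions):
    """Linear coefficients of the compressed examples (the dense layer's input)."""
    return (feats - mean) @ directions.T         # shape (n, k)

def reconstruct(coeffs, mean, directions):
    """Dense-layer view: reconstruction = mean + coeffs @ directions."""
    return mean + coeffs @ directions

def distillation_loss(student_coeffs, teacher_coeffs):
    """Squared error between student-predicted and teacher coefficients."""
    return np.mean((student_coeffs - teacher_coeffs) ** 2)

if __name__ == "__main__":
    # Toy usage: the student is trained to predict the teacher's coefficients.
    rng = np.random.default_rng(0)
    teacher_feats = rng.normal(size=(256, 64))   # stand-in for a teacher layer's outputs
    mean, dirs = fit_pca_dense_layer(teacher_feats, k=8)
    target_coeffs = project_to_coefficients(teacher_feats, mean, dirs)
    student_coeffs = target_coeffs + 0.1 * rng.normal(size=target_coeffs.shape)
    print("distillation loss:", distillation_loss(student_coeffs, target_coeffs))

In the general case described by the disclosure, the squared error above would be replaced by the Bregman divergence induced by the layer's local convex function, and the principal directions would be orthonormalized with respect to that geometry around the mean rather than the Euclidean one.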