KNOWLEDGE DISTILLATION METHOD FOR COMPRESSING TRANSFORMER NEURAL NETWORK AND APPARATUS THEREOF

    Publication Number: US20240330648A1

    Publication Date: 2024-10-03

    Application Number: US18596994

    Application Date: 2024-03-06

    CPC classification number: G06N3/042; G06N3/082; G06N3/096

    Abstract: A method is disclosed for training a student network that includes at least one transformer neural network, using knowledge distillation from a teacher network that also includes at least one transformer neural network. The method includes: pre-training the teacher network using training data and fine-tuning the trained teacher network; copying weight parameters of the bottom layers of the teacher network to the student network; and performing knowledge distillation to the student network through the fine-tuned teacher network. Performing the knowledge distillation includes: extracting a feature structure from the result value of a layer of the fine-tuned teacher network; extracting a feature structure from the result value of a layer of the student network; and adjusting the extracted feature structure of the student network based on the extracted feature structure of the teacher network.
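    The claimed flow lends itself to a short sketch. Below is a minimal PyTorch illustration of the steps the abstract names: copying the teacher's bottom-layer weights into the student, extracting a "feature structure" from each layer's output, and adjusting the student's structure toward the teacher's. All function names, the choice of a pairwise token-similarity matrix as the feature structure, the uniform layer mapping, and the MSE objective are assumptions made for illustration, not the patent's specified implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def copy_bottom_layers(teacher_layers: nn.ModuleList,
                       student_layers: nn.ModuleList) -> None:
    """Initialize the (shallower) student by copying the teacher's
    bottom-layer weights; zip stops at the student's depth."""
    for s_layer, t_layer in zip(student_layers, teacher_layers):
        s_layer.load_state_dict(t_layer.state_dict())


def feature_structure(hidden: torch.Tensor) -> torch.Tensor:
    """Summarize a layer's output (batch, seq, dim) as a pairwise
    token-similarity matrix (batch, seq, seq) -- one plausible reading
    of the 'feature structure' extracted from a layer's result value."""
    h = F.normalize(hidden, dim=-1)
    return h @ h.transpose(-1, -2)


def distillation_loss(teacher_hiddens: list, student_hiddens: list) -> torch.Tensor:
    """Adjust each student layer's feature structure toward that of a
    corresponding teacher layer (uniform layer mapping assumed)."""
    ratio = len(teacher_hiddens) // len(student_hiddens)
    loss = 0.0
    for i, s_h in enumerate(student_hiddens):
        t_h = teacher_hiddens[(i + 1) * ratio - 1]
        loss = loss + F.mse_loss(feature_structure(s_h),
                                 feature_structure(t_h).detach())
    return loss


# Toy check: hidden states from a 12-layer teacher and a 4-layer student.
teacher_hiddens = [torch.randn(2, 16, 64) for _ in range(12)]
student_hiddens = [torch.randn(2, 16, 64) for _ in range(4)]
print(distillation_loss(teacher_hiddens, student_hiddens))
```

    One appeal of matching similarity matrices rather than raw hidden states is that the resulting loss compares (seq, seq) structures, so the teacher and student hidden dimensions need not be equal.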
