Invention Publication
- Patent Title: MODEL DISTILLATION METHOD AND RELATED DEVICE
- Application No.: US18443052
- Application Date: 2024-02-15
- Publication No.: US20240185086A1
- Publication Date: 2024-06-06
- Inventor: Lu HOU, Haoli Bai, Lifeng Shang, Xin Jiang, Li Qian
- Applicant: HUAWEI TECHNOLOGIES CO., LTD.
- Applicant Address: CN Shenzhen
- Assignee: HUAWEI TECHNOLOGIES CO., LTD.
- Current Assignee: HUAWEI TECHNOLOGIES CO., LTD.
- Current Assignee Address: CN Shenzhen
- Priority: CN 2110962700.9 2021.08.20
- Main IPC: G06N3/096
- IPC: G06N3/096; G06N3/045

Abstract:
This disclosure relates to the field of artificial intelligence and provides model distillation methods and apparatuses. In an implementation, a method includes: obtaining first input data and second input data from a second computing node, wherein the first input data is output data of a third sub-model and the second input data is output data processed by a fourth sub-model; processing the first input data by using a first sub-model to obtain a first intermediate output; processing the second input data by using a second sub-model to obtain a second intermediate output, wherein the first intermediate output and the second intermediate output are used to determine a first gradient; and distilling the first sub-model based on the first gradient to obtain an updated first sub-model.
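
The steps in the abstract can be illustrated with a minimal sketch of one distillation step on a single computing node. It is not the claimed implementation and rests on several assumptions: the first and third sub-models are taken to be parts of a student model and the second and fourth parts of a teacher model, each sub-model is reduced to a bias-free linear layer, and the first gradient is derived from a mean squared error between the two intermediate outputs. The names `linear`, `mse`, and `distill_step`, and all numeric values, are hypothetical.

```python
# Hypothetical sketch: every sub-model is a bias-free linear layer y = W x.

def linear(weights, x):
    """Apply a linear layer: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def mse(a, b):
    """Mean squared error between two equal-length vectors (assumed loss)."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)

def distill_step(student_w, teacher_w, first_input, second_input, lr=0.1):
    """Run one local distillation step and return (updated weights, loss)."""
    first_out = linear(student_w, first_input)    # first intermediate output
    second_out = linear(teacher_w, second_input)  # second intermediate output
    n = len(first_out)
    # First gradient: dL/dW[i][j] = (2 / n) * (s_i - t_i) * x_j for the MSE loss.
    grad = [[2.0 / n * (s - t) * xj for xj in first_input]
            for s, t in zip(first_out, second_out)]
    # Only the first (student) sub-model is updated; the teacher stays fixed.
    updated = [[w - lr * g for w, g in zip(w_row, g_row)]
               for w_row, g_row in zip(student_w, grad)]
    return updated, mse(first_out, second_out)

# Data assumed to have arrived from the second computing node (made-up values).
first_input = [1.0, 2.0]   # output of the third sub-model
second_input = [1.0, 2.0]  # output data processed by the fourth sub-model

student_w = [[0.0, 0.0], [0.0, 0.0]]  # first sub-model
teacher_w = [[1.0, 0.0], [0.0, 1.0]]  # second sub-model

losses = []
for _ in range(20):
    student_w, loss = distill_step(student_w, teacher_w, first_input, second_input)
    losses.append(loss)
```

In this sketch the update needs only the two inputs received from the other node and the locally held sub-models, so the node can refine its part of the student without back-propagating through the rest of the model.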