Publication No.: US20240185086A1
Publication Date: 2024-06-06
Application No.: US18443052
Filing Date: 2024-02-15
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Lu HOU, Haoli Bai, Lifeng Shang, Xin Jiang, Li Qian
Abstract: This disclosure relates to the field of artificial intelligence, and provides model distillation methods and apparatuses. In an implementation, a method includes: obtaining first input data and second input data from a second computing node, wherein the first input data is output data of the third sub-model, and the second input data is output data processed by the fourth sub-model; processing the first input data by using the first sub-model, to obtain a first intermediate output; processing the second input data by using the second sub-model, to obtain a second intermediate output, wherein the first intermediate output and the second intermediate output are used to determine a first gradient; and distilling the first sub-model based on the first gradient, to obtain an updated first sub-model.
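
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the applicant's implementation, and everything in it is an assumption made for illustration: the sub-models are toy scalar linear maps (the hypothetical `LinearSubModel`), the gradient is taken from a mean squared error between the two intermediate outputs (the abstract does not name a loss), and the data "obtained from a second computing node" is simulated by plain lists rather than an actual transfer between nodes.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LinearSubModel:
    """Hypothetical toy sub-model: y = weight * x + bias."""
    weight: float
    bias: float

    def forward(self, xs: List[float]) -> List[float]:
        return [self.weight * x + self.bias for x in xs]


def mse(a: List[float], b: List[float]) -> float:
    """Mean squared error (an assumed loss; the abstract does not specify one)."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def distill_step(
    first_sub_model: LinearSubModel,   # sub-model being distilled
    second_sub_model: LinearSubModel,  # reference sub-model, kept fixed
    first_input: List[float],          # stands in for the third sub-model's output
    second_input: List[float],         # stands in for the fourth sub-model's output
    lr: float = 0.1,
) -> Tuple[LinearSubModel, float]:
    """One local distillation step on the first computing node."""
    first_out = first_sub_model.forward(first_input)     # first intermediate output
    second_out = second_sub_model.forward(second_input)  # second intermediate output

    # "First gradient": derivative of the assumed MSE loss with respect to
    # the first sub-model's parameters only.
    n = len(first_out)
    errors = [s - t for s, t in zip(first_out, second_out)]
    grad_w = sum(2.0 * e * x for e, x in zip(errors, first_input)) / n
    grad_b = sum(2.0 * e for e in errors) / n

    updated = LinearSubModel(
        weight=first_sub_model.weight - lr * grad_w,
        bias=first_sub_model.bias - lr * grad_b,
    )
    return updated, mse(first_out, second_out)


# Simulated data "received" from the second computing node (made-up values).
first_input = [0.5, 1.0, 1.5]
second_input = [0.6, 1.1, 1.4]

student = LinearSubModel(weight=0.0, bias=0.0)
teacher = LinearSubModel(weight=2.0, bias=-1.0)

_, initial_loss = distill_step(student, teacher, first_input, second_input)
for _ in range(200):
    student, _ = distill_step(student, teacher, first_input, second_input)
_, final_loss = distill_step(student, teacher, first_input, second_input)
```

In this sketch the update touches only the first sub-model's parameters and needs nothing from the second computing node beyond the two input batches, which is the property the abstract's step ordering suggests.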