Compression method and platform of pre-training language model based on knowledge distillation

    Publication No.: US11341326B2

    Publication Date: 2022-05-24

    Application No.: US17483805

    Application Date: 2021-09-24

    Applicant: ZHEJIANG LAB

    Abstract: Provided are a method and a platform for compressing a pre-trained language model based on knowledge distillation. The method first designs a universal feature-transfer knowledge distillation strategy: during distillation from the teacher model to the student model, the feature map of each student layer is driven toward the corresponding teacher features, exploiting the feature representations that the teacher model's intermediate layers express on small samples to guide the student model. Next, a knowledge distillation method based on self-attention cross is constructed. Finally, a linear transfer strategy based on the Bernoulli probability distribution is designed to gradually complete the transfer of feature maps and self-attention distributions from teacher to student.
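The Bernoulli-based linear transfer strategy described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the schedule and loss functions, their names, and the flat per-layer feature lists are all assumptions made for clarity.

```python
import random

def linear_bernoulli_schedule(step, total_steps, p_start=0.0, p_end=1.0):
    """Linearly increase the probability that a layer's knowledge
    (feature map / self-attention distribution) is transferred from
    teacher to student at the current training step."""
    frac = step / max(total_steps - 1, 1)
    return p_start + (p_end - p_start) * frac

def feature_distillation_loss(student_feats, teacher_feats,
                              step, total_steps, rng=random):
    """Mean-squared error between student and teacher per-layer
    features, where each layer is included via a Bernoulli-sampled
    gate whose probability grows linearly over training."""
    p = linear_bernoulli_schedule(step, total_steps)
    total, count = 0.0, 0
    for s_layer, t_layer in zip(student_feats, teacher_feats):
        if rng.random() < p:  # Bernoulli gate: transfer this layer?
            total += sum((s - t) ** 2
                         for s, t in zip(s_layer, t_layer)) / len(s_layer)
            count += 1
    return total / count if count else 0.0
```

Early in training the gate rarely fires, so the student is lightly guided; by the final steps the probability reaches 1 and every layer's feature map is matched against the teacher's.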

    Method for adapting deep learning framework to hardware device based on unified backend engine

    Publication No.: US11941532B2

    Publication Date: 2024-03-26

    Application No.: US17726563

    Application Date: 2022-04-22

    Applicant: ZHEJIANG LAB

    CPC classification number: G06N3/10 G06N3/04

    Abstract: Disclosed is a method for adapting a deep learning framework to a hardware device based on a unified backend engine, comprising the following steps: S1, adding the unified backend engine to the deep learning framework; S2, adding the unified backend engine to the hardware device; S3, converting a computational graph, wherein the computational graph compiled and generated by the deep learning framework is converted into an intermediate representation of the unified backend engine; S4, compiling the intermediate representation, wherein the unified backend engine compiles the intermediate representation on the hardware device to generate an executable object; S5, running the executable object, wherein the deep learning framework runs the executable object on the hardware device; and S6, managing memory of the unified backend engine.
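The core flow of steps S3 through S5 can be sketched as a toy engine. This is an illustrative sketch only: the class name, method names, and the representation of a computational graph as a list of callables are assumptions, not the patent's actual API.

```python
class UnifiedBackendEngine:
    """Toy unified backend engine: lower a framework's computational
    graph to an intermediate representation (IR), compile the IR for a
    target device, and run the resulting executable object."""

    def to_ir(self, graph):
        # S3: convert the framework's computational graph into the
        # engine's IR (here simply a flat list of operations).
        return list(graph)

    def compile(self, ir, device):
        # S4: compile the IR into an executable object for the device.
        def executable(inputs):
            x = inputs
            for op in ir:  # apply each lowered operation in order
                x = op(x)
            return x
        return executable

    def run(self, executable, inputs):
        # S5: the framework runs the executable on the hardware device.
        return executable(inputs)

engine = UnifiedBackendEngine()
graph = [lambda x: x + 1, lambda x: x * 2]   # toy computational graph
exe = engine.compile(engine.to_ir(graph), device="accelerator0")
result = engine.run(exe, 3)   # (3 + 1) * 2 = 8
```

The key design point the abstract describes is that the framework only ever talks to the engine's IR and executable interfaces, so a new hardware device is supported by implementing the engine's backend once rather than porting every framework operator.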
