Invention Grant
- Patent Title: Method and platform for pre-trained language model automatic compression based on multilevel knowledge distillation
- Application No.: US17555535
- Application Date: 2021-12-20
- Publication No.: US11501171B2
- Publication Date: 2022-11-15
- Inventor: Hongsheng Wang, Enping Wang, Zailiang Yu
- Applicant: ZHEJIANG LAB
- Applicant Address: Hangzhou, CN
- Assignee: ZHEJIANG LAB
- Current Assignee: ZHEJIANG LAB
- Current Assignee Address: Hangzhou, CN
- Agency: W&G Law Group
- Main IPC: G06N 3/08
- IPC: G06N 3/08; G06F 40/40

Abstract:
Disclosed are an automatic compression method and platform for pre-trained language models based on multilevel knowledge distillation. The method includes the following steps: step 1, constructing multilevel knowledge distillation and distilling the knowledge structure of a large model at three levels: the self-attention units, the hidden-layer states, and the embedding layer; step 2, training a meta-learning knowledge distillation network to generate a general compression architecture for a plurality of pre-trained language models; and step 3, searching for an optimal compression structure based on an evolutionary algorithm. First, meta-learning-based knowledge distillation is used to generate the general compression architecture for the plurality of pre-trained language models; second, on the basis of the trained meta-learning network, the evolutionary algorithm searches for the optimal compression structure, so as to obtain an optimal, task-independent general compression architecture for pre-trained language models.
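To make the three-level distillation of step 1 concrete, below is a minimal sketch of a combined distillation loss over self-attention maps, hidden-layer states, and the embedding-layer output. It assumes BERT-style teacher and student models that expose `attentions` and `hidden_states` (as Hugging Face models do with `output_attentions=True` and `output_hidden_states=True`); the uniform layer mapping, the `proj` dimension-matching layer, and the equal loss weighting are illustrative assumptions, not the patented method itself.

```python
# Sketch of a multilevel (attention / hidden-state / embedding) distillation loss.
# Assumes teacher_out.attentions is a tuple of [B, H, L, L] tensors,
# teacher_out.hidden_states is a tuple of [B, L, D] tensors (index 0 = embedding
# output), and likewise for the student; proj maps student hidden size -> teacher's.
import torch.nn as nn

mse = nn.MSELoss()

def multilevel_distillation_loss(teacher_out, student_out, proj):
    t_att, s_att = teacher_out.attentions, student_out.attentions
    t_hid, s_hid = teacher_out.hidden_states, student_out.hidden_states

    # Map each student layer to a uniformly spaced teacher layer (assumed mapping).
    stride = len(t_att) // len(s_att)

    # Level 1: self-attention unit distillation.
    att_loss = sum(mse(s, t_att[(i + 1) * stride - 1])
                   for i, s in enumerate(s_att))

    # Level 2: hidden-layer state distillation (skip index 0, the embedding output).
    hid_loss = sum(mse(proj(s), t_hid[(i + 1) * stride])
                   for i, s in enumerate(s_hid[1:]))

    # Level 3: embedding-layer distillation.
    emb_loss = mse(proj(s_hid[0]), t_hid[0])

    # Equal weighting is an assumption; the actual weighting is not specified here.
    return att_loss + hid_loss + emb_loss
```

In practice this loss would be minimized over the student (or, in step 2, over candidate compression architectures sampled by the meta-learning network), with step 3's evolutionary search then selecting the best-performing structure.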