Graph optimization method and apparatus for neural network computation

    Publication No.: US11915135B2

    Publication Date: 2024-02-27

    Application No.: US17950028

    Filing Date: 2022-09-21

    Applicant: ZHEJIANG LAB

    CPC classification number: G06N3/08 G06F18/29 G06N20/00

    Abstract: The disclosure provides a graph optimization method and apparatus for neural network computation. The graph optimization method includes the following steps: S1: converting a computation graph; S2: allocating registers; S3: defining a route selector for a redefined variable; S4: solving the route selector for the redefined variable; S5: defining a criterion for inserting the route selector for the redefined variable into a node; S6: analyzing the dominating edge set of the node for the redefined variable; S7: inserting the route selector for the redefined variable; and S8: renaming the redefined variable. The disclosure solves the problem of selecting the correct definition of a redefined variable when a node containing that variable in the compile-time computation graph is reached through multiple computation paths, reduces memory cost, and promotes the practical application of deep neural network models.
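
    The route selector in steps S3-S7 plays the same role as a phi node in SSA construction: wherever paths carrying different definitions of a variable merge, a selector must pick the reaching definition. The minimal Python sketch below illustrates that idea with the standard dominance-frontier placement algorithm; the graph shape, the precomputed immediate dominators and all names are illustrative assumptions, not the patented implementation.

```python
# Hypothetical sketch: insert route selectors (phi-like nodes) for a redefined
# variable at the dominance frontiers of its defining nodes in a computation graph.
# Graph shape, immediate dominators and variable names are illustrative only.

def dominance_frontiers(preds, idom):
    """Cooper-Harvey-Kennedy dominance-frontier computation."""
    df = {n: set() for n in idom}
    for n, ps in preds.items():
        if len(ps) < 2:
            continue
        for p in ps:
            runner = p
            while runner != idom[n]:
                df[runner].add(n)
                runner = idom[runner]
    return df

def insert_route_selectors(defs, preds, idom):
    """Place a route selector at every dominance-frontier node reachable from
    the variable's definition sites (the iterated dominance frontier)."""
    df = dominance_frontiers(preds, idom)
    selectors, work = set(), list(defs)
    while work:
        node = work.pop()
        for y in df[node]:
            if y not in selectors:
                selectors.add(y)
                work.append(y)      # a selector is itself a new definition
    return selectors

# Diamond-shaped computation graph: entry -> a, b -> merge
preds = {"entry": [], "a": ["entry"], "b": ["entry"], "merge": ["a", "b"]}
idom  = {"entry": "entry", "a": "entry", "b": "entry", "merge": "entry"}
print(insert_route_selectors({"a", "b"}, preds, idom))   # {'merge'}
```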

    Method and platform for meta-knowledge fine-tuning based on domain-invariant features

    Publication No.: US11669741B2

    Publication Date: 2023-06-06

    Application No.: US17674859

    Filing Date: 2022-02-18

    Applicant: ZHEJIANG LAB

    CPC classification number: G06N3/08 G06F40/20

    Abstract: Disclosed are a method and platform for meta-knowledge fine-tuning based on domain-invariant features. The method learns highly transferable common knowledge, i.e., domain-invariant features, across different data sets of the same kind of task, and fine-tunes the common domain features learned in the network set across the domains corresponding to those data sets so that the model adapts quickly to any new domain. The present application improves the parameter initialization ability and generalization ability of a universal language model for the same kind of task, and finally obtains, through fine-tuning, a common compression framework for the universal language model of the same kind of downstream task. In the meta-knowledge fine-tuning network, the present application designs a loss function for the domain-invariant features so that domain-independent universal knowledge is learned.
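
    The abstract mentions a loss function designed for the domain-invariant features but does not spell it out. The sketch below shows one plausible form under simple assumptions: the per-domain task loss is combined with a penalty that pulls the mean feature vectors of different domains together. The encoder and classifier shapes, the squared-mean-distance penalty and the weight `lam` are hypothetical, not the loss defined in the patent.

```python
# Hypothetical domain-invariance penalty for meta-knowledge fine-tuning:
# per-domain task loss plus a term that aligns per-domain mean features.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
classifier = nn.Linear(16, 3)

def meta_fine_tune_loss(domain_batches, lam=0.1):
    task_loss, means = 0.0, []
    for x, y in domain_batches:              # one (inputs, labels) batch per domain
        feats = encoder(x)
        task_loss = task_loss + F.cross_entropy(classifier(feats), y)
        means.append(feats.mean(dim=0))
    # penalize pairwise distances between per-domain feature means
    invariance = sum((a - b).pow(2).sum()
                     for i, a in enumerate(means) for b in means[i + 1:])
    return task_loss / len(domain_batches) + lam * invariance

batches = [(torch.randn(8, 32), torch.randint(0, 3, (8,))) for _ in range(2)]
print(meta_fine_tune_loss(batches).item())
```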

    Meta-knowledge fine tuning method and platform for multi-task language model

    Publication No.: US11354499B2

    Publication Date: 2022-06-07

    Application No.: US17531813

    Filing Date: 2021-11-22

    Applicant: ZHEJIANG LAB

    Abstract: Disclosed are a meta-knowledge fine tuning method and platform for a multi-task language model. The method obtains highly transferable shared knowledge, that is, meta-knowledge, from different data sets of tasks of the same category, and makes the learning processes of those tasks, which correspond to different data sets in different domains, interrelate and mutually reinforce one another. This improves the fine tuning effect of downstream tasks of the same category on data sets of different domains when the language model is applied, and improves the parameter initialization ability and generalization ability of a general language model for tasks of that category.
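
    The abstract does not name a particular meta-learning rule. As one way to make the "interrelation and mutual reinforcement" across data sets concrete, the sketch below uses a Reptile-style inner/outer loop: adapt a copy of the shared parameters on each data set, then move the shared parameters toward the average of the adapted copies. The model, learning rates and update rule are assumptions for illustration only.

```python
# Hypothetical Reptile-style sketch of meta-knowledge fine-tuning across
# several data sets of the same task category. Not the patented procedure.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(16, 2)                       # stand-in for a language-model head

def adapt(shared, batch, lr=0.05, steps=3):
    """Inner loop: fine-tune a copy of the shared parameters on one data set."""
    local = copy.deepcopy(shared)
    opt = torch.optim.SGD(local.parameters(), lr=lr)
    x, y = batch
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(local(x), y).backward()
        opt.step()
    return local

def meta_step(shared, dataset_batches, meta_lr=0.5):
    """Outer loop: move shared parameters toward the average adapted copy."""
    adapted = [adapt(shared, b) for b in dataset_batches]
    with torch.no_grad():
        for name, p in shared.named_parameters():
            avg = torch.stack([dict(m.named_parameters())[name] for m in adapted]).mean(0)
            p.add_(meta_lr * (avg - p))

datasets = [(torch.randn(8, 16), torch.randint(0, 2, (8,))) for _ in range(3)]
meta_step(model, datasets)
```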

    Neural network computing-oriented modeling method and apparatus for distributed data routing

    Publication No.: US11805025B1

    Publication Date: 2023-10-31

    Application No.: US17848048

    Filing Date: 2022-06-23

    Applicant: ZHEJIANG LAB

    CPC classification number: H04L41/145 H04L41/16 H04L45/44

    Abstract: The present disclosure provides a neural network computing-oriented modeling method and apparatus for distributed data routing. The method includes the following steps: S1, designing the distributed attributes of a physical tensor: abstracting the mapping relationship between a logical tensor and the physical tensor into three distributed attributes, namely a broadcast attribute, a scatter attribute and a local reduction attribute; S2, deducing the distributed attribute of an output tensor: specifying the distributed attributes of the input tensors, and then deducing the legal distributed attribute of the output tensor from the known input attributes; and S3, judging, according to the distributed attributes, whether an intermediate communication primitive needs to be inserted to obtain the distributed attribute of the local physical tensor. The method lowers the difficulty of distributed design and development and promotes the application of large deep neural network models.
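
    Steps S2 and S3 can be pictured as a small rule table plus a routing decision. The sketch below deduces the output attribute of a matrix multiplication from the input attributes and then chooses a collective primitive when the produced attribute differs from what the consumer expects; the specific rule table and primitive names (all_reduce, all_gather) are illustrative assumptions rather than the patented deduction rules.

```python
# Hypothetical sketch of distributed-attribute deduction and routing decisions.
# Attribute names mirror the abstract (broadcast / scatter / partial reduction).

def matmul_output_attr(a_attr, b_attr):
    """Deduction rules for out = A @ B on one device group (illustrative)."""
    if a_attr == "scatter(rows)" and b_attr == "broadcast":
        return "scatter(rows)"          # row-split A keeps the split on the output
    if a_attr == "broadcast" and b_attr == "scatter(cols)":
        return "scatter(cols)"          # column-split B splits the output columns
    if a_attr == "scatter(cols)" and b_attr == "scatter(rows)":
        return "partial_reduce"         # each device holds a partial sum
    raise ValueError("no legal output attribute for these inputs")

def routing_primitive(produced, expected):
    """Pick the collective needed to convert `produced` into `expected`."""
    if produced == expected:
        return None
    if produced == "partial_reduce" and expected == "broadcast":
        return "all_reduce"
    if produced.startswith("scatter") and expected == "broadcast":
        return "all_gather"
    return "generic_reshard"

out = matmul_output_attr("scatter(cols)", "scatter(rows)")
print(out, "->", routing_primitive(out, "broadcast"))   # partial_reduce -> all_reduce
```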

    Method for distributed type training adaptation and apparatus in deep learning framework and AI accelerator card

    Publication No.: US11714995B2

    Publication Date: 2023-08-01

    Application No.: US17739205

    Filing Date: 2022-05-09

    Applicant: ZHEJIANG LAB

    CPC classification number: G06N3/0454 G06F8/36 G06F9/4881 G06F9/545

    Abstract: Disclosed are a method and apparatus for distributed training adaptation in a deep learning framework and an AI accelerator card. The method includes the following steps: S1: the deep learning framework supports single-card configuration of the newly added AI accelerator card, with the following sub-steps: S11: the deep learning framework supports the new hardware; S12: the deep learning framework supports a device thread for the new hardware; S13: the deep learning framework supports memory operations of the new hardware; and S14: the deep learning framework supports operator kernel functions of the new hardware; S2: the deep learning framework supports multi-card configuration of the newly added AI accelerator card; S3: the deep learning framework supports tensor segmentation and multi-card distribution; and S4: the deep learning framework supports multi-card collective communication on the newly added AI accelerator card.
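
    The single-card sub-steps S11-S14 amount to giving the framework a backend object for the new card: a registration entry, a device thread, memory operations and operator kernels. The Python sketch below simulates such a backend in-process; the registry, class and method names are hypothetical and do not correspond to any framework interface referenced by the patent.

```python
# Hypothetical in-process simulation of a new accelerator backend (S11-S14).
import threading
import queue

class NewCardBackend:
    def __init__(self, device_id):
        self.device_id = device_id
        self._tasks = queue.Queue()
        self._thread = threading.Thread(target=self._worker, daemon=True)  # S12: device thread
        self._thread.start()
        self._kernels = {"relu": lambda xs: [max(x, 0.0) for x in xs]}     # S14: operator kernels

    def _worker(self):
        while True:
            fn, args, done = self._tasks.get()
            done.append(fn(*args))
            self._tasks.task_done()

    def malloc(self, n):                       # S13: memory operation (simulated)
        return [0.0] * n

    def launch(self, op, *args):               # submit a kernel to the device thread
        done = []
        self._tasks.put((self._kernels[op], args, done))
        self._tasks.join()
        return done[0]

DEVICE_REGISTRY = {}                            # S11: framework-level device registration
DEVICE_REGISTRY["new_card"] = NewCardBackend

dev = DEVICE_REGISTRY["new_card"](device_id=0)
print(dev.launch("relu", [-1.0, 2.0, -3.0]))    # [0.0, 2.0, 0.0]
```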

    Method and platform for pre-trained language model automatic compression based on multilevel knowledge distillation

    Publication No.: US11501171B2

    Publication Date: 2022-11-15

    Application No.: US17555535

    Filing Date: 2021-12-20

    Applicant: ZHEJIANG LAB

    Abstract: Disclosed are an automatic compression method and platform for pre-trained language models based on multilevel knowledge distillation. The method includes the following steps: step 1, constructing multilevel knowledge distillation and distilling the knowledge structure of a large model at three different levels: the self-attention unit, the hidden layer state and the embedding layer; step 2, training a meta-learning knowledge distillation network to generate a general compression architecture for a plurality of pre-trained language models; and step 3, searching for the optimal compression structure based on an evolutionary algorithm. First, knowledge distillation based on meta-learning is studied to generate the general compression architecture of the plurality of pre-trained language models; second, on the basis of the trained meta-learning network, the optimal compression structure is searched for via the evolutionary algorithm, so as to obtain a task-independent optimal general compression architecture for the pre-trained language model.
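
    Step 1 names three distillation levels: the self-attention unit, the hidden layer state and the embedding layer. A minimal sketch of a combined loss over those three levels is shown below, assuming a KL term for attention distributions, MSE terms for hidden states and embeddings, a linear projection to bridge mismatched widths, and equal weights; none of these choices is specified by the patent.

```python
# Hypothetical three-level distillation loss: attention, hidden states, embeddings.
import torch
import torch.nn.functional as F

def multilevel_distill_loss(teacher, student, proj):
    attn = F.kl_div(student["attn"].log(), teacher["attn"], reduction="batchmean")
    hidden = F.mse_loss(proj(student["hidden"]), teacher["hidden"])
    embed = F.mse_loss(proj(student["embed"]), teacher["embed"])
    return attn + hidden + embed

proj = torch.nn.Linear(128, 256)                    # student width 128 -> teacher width 256
teacher = {"attn": F.softmax(torch.randn(4, 8, 8), -1),
           "hidden": torch.randn(4, 8, 256),
           "embed": torch.randn(4, 8, 256)}
student = {"attn": F.softmax(torch.randn(4, 8, 8), -1),
           "hidden": torch.randn(4, 8, 128),
           "embed": torch.randn(4, 8, 128)}
print(multilevel_distill_loss(teacher, student, proj).item())
```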

    Compression method and platform of pre-training language model based on knowledge distillation

    Publication No.: US11341326B2

    Publication Date: 2022-05-24

    Application No.: US17483805

    Filing Date: 2021-09-24

    Applicant: ZHEJIANG LAB

    Abstract: Provided are a method and platform for compressing a pre-training language model based on knowledge distillation. The method first designs a universal knowledge distillation strategy of feature migration: in the process of knowledge distillation from the teacher model to the student model, the feature map of each layer of the student model is driven to approximate the teacher's features, focusing on the ability of small samples to express features in the intermediate layers of the teacher model and using these features to guide the student model. Then, a knowledge distillation method based on self-attention cross is constructed. Finally, a linear transfer strategy based on the Bernoulli probability distribution is designed to gradually complete the transfer of feature maps and self-attention distributions from teacher to student.
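
    One reading of the "linear transfer strategy based on Bernoulli probability distribution" is that at each training step every transfer term is switched on with a probability that grows linearly over training, so the transfer of feature maps and self-attention completes gradually. The sketch below implements that reading; the schedule, loss terms and tensor shapes are assumptions, not the patented strategy.

```python
# Hypothetical Bernoulli-gated, linearly scheduled distillation step.
import torch
import torch.nn.functional as F

def transfer_prob(step, total_steps, p_min=0.1, p_max=1.0):
    """Transfer probability rises linearly from p_min to p_max over training."""
    return p_min + (p_max - p_min) * min(step / total_steps, 1.0)

def distill_step_loss(t_feat, s_feat, t_attn, s_attn, step, total_steps):
    loss = torch.zeros(())
    p = transfer_prob(step, total_steps)
    if torch.bernoulli(torch.tensor(p)) == 1:          # transfer feature mapping
        loss = loss + F.mse_loss(s_feat, t_feat)
    if torch.bernoulli(torch.tensor(p)) == 1:          # transfer self-attention
        loss = loss + F.kl_div(s_attn.log(), t_attn, reduction="batchmean")
    return loss

t_feat, s_feat = torch.randn(4, 16), torch.randn(4, 16)
t_attn = F.softmax(torch.randn(4, 8, 8), -1)
s_attn = F.softmax(torch.randn(4, 8, 8), -1)
print(distill_step_loss(t_feat, s_feat, t_attn, s_attn, step=300, total_steps=1000))
```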

    Methods and apparatuses for executing tasks, storage mediums, and electronic devices

    Publication No.: US12039361B1

    Publication Date: 2024-07-16

    Application No.: US18494002

    Filing Date: 2023-10-25

    Applicant: ZHEJIANG LAB

    CPC classification number: G06F9/48

    Abstract: The present disclosure provides a method for executing a task. In the method, a master computing device node in a computing cluster system receives the task code of a to-be-executed task and divides the task into subtasks; for each subtask, the master node determines, based on the task code, the operators required to execute it. The master node then distributes the subtasks to the computing nodes in the cluster, and each computing node generates an executable task subgraph from the operators of its assigned subtask and the data transmission relationships between those operators, and runs the subgraph to execute the to-be-executed task.
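
    The flow in the abstract is: the master node splits the task, and each computing node builds an executable subgraph from its operators and their data transmission relationships and runs it. The sketch below mimics that flow in-process with a round-robin split and a topological execution order; the cluster API, operator set and splitting rule are illustrative assumptions only.

```python
# Hypothetical in-process sketch of master dispatch and subgraph execution.
from graphlib import TopologicalSorter

OPS = {"load": lambda: 3.0, "square": lambda x: x * x, "add1": lambda x: x + 1.0}

def master_dispatch(task_code, nodes):
    """Split the operator list round-robin across compute nodes."""
    subtasks = {n: [] for n in nodes}
    for i, (op, deps) in enumerate(task_code):
        subtasks[nodes[i % len(nodes)]].append((op, deps))
    return subtasks

def run_subgraph(subtask, results):
    """Build an executable subgraph from operator dependencies and run it.
    Dependencies produced on other nodes are assumed to be in `results`."""
    local = dict(subtask)
    graph = {op: set(d for d in deps if d in local) for op, deps in subtask}
    for op in TopologicalSorter(graph).static_order():
        results[op] = OPS[op](*(results[d] for d in local[op]))
    return results

task_code = [("load", []), ("square", ["load"]), ("add1", ["square"])]
results = {}
for node, sub in master_dispatch(task_code, ["node0"]).items():
    run_subgraph(sub, results)
print(results)   # {'load': 3.0, 'square': 9.0, 'add1': 10.0}
```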
