-
Publication number: US11915135B2
Publication date: 2024-02-27
Application number: US17950028
Filing date: 2022-09-21
Applicant: ZHEJIANG LAB
Inventor: Hongsheng Wang , Guang Chen
Abstract: The disclosure discloses a graph optimization method and apparatus for neural network computation. The graph optimization method includes the following steps: S1: converting a computation graph; S2: allocating a register; S3: defining a route selector for a redefined variable; S4: solving the route selector for the redefined variable; S5: defining a criterion for inserting the route selector for the redefined variable into a node; S6: analyzing the dominating edge set of the node for the redefined variable; S7: inserting the route selector for the redefined variable; and S8: renaming the redefined variable. The disclosure solves the problem of selecting the correct definition of a redefined variable when a node containing the redefined variable in a compile-time computation graph is reached through multiple computation-flow paths, reduces memory cost, and promotes the practical application of deep neural network models.
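The route selector can be read as playing the role of a phi function in static single-assignment (SSA) form, with the dominating edge set of steps S5-S6 corresponding to the (iterated) dominance frontier of the variable's definition sites. Below is a minimal sketch of that placement logic, assuming a standard dominator-tree representation; the graph encoding and all names are illustrative, not the patented implementation:

```python
from collections import defaultdict

def dominance_frontiers(preds, idom, nodes):
    """Cooper-Harvey-Kennedy dominance-frontier computation.
    preds: node -> list of predecessors; idom: node -> immediate dominator."""
    df = defaultdict(set)
    for n in nodes:
        if len(preds[n]) >= 2:                # selectors are only needed at join points
            for p in preds[n]:
                runner = p
                while runner != idom[n]:      # walk up the dominator tree
                    df[runner].add(n)
                    runner = idom[runner]
    return df

def insert_route_selectors(defs, df):
    """Place a route selector for each redefined variable at every node in the
    iterated dominance frontier of its definition sites (cf. S6-S7)."""
    selectors = defaultdict(set)              # node -> variables needing a selector
    for var, def_nodes in defs.items():
        work = list(def_nodes)
        while work:
            d = work.pop()
            for y in df[d]:
                if var not in selectors[y]:
                    selectors[y].add(var)     # a selector is itself a new definition,
                    if y not in def_nodes:    # so propagate from y as well
                        work.append(y)
    return selectors
```

After insertion, renaming the redefined variable (S8) gives each definition a unique version, so every use refers unambiguously to a single reaching definition.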
-
Publication number: US11699290B1
Publication date: 2023-07-11
Application number: US17954129
Filing date: 2022-09-27
Applicant: ZHEJIANG LAB
Inventor: Hongsheng Wang , Guang Chen
CPC classification number: G06V20/53 , G06T5/009 , G06T7/73 , G06T7/90 , G06V10/774 , G06V10/82 , G06V20/41 , G06V20/46 , G06V40/10 , G06T2207/10016 , G06T2207/10024 , G06T2207/20081 , G06T2207/20084 , G06T2207/30196 , G06T2207/30232
Abstract: Disclosed are a pedestrian re-identification method and apparatus based on local feature attention. The method includes the following steps: S1: obtaining an original surveillance video image data set, and dividing the original surveillance video image data set into a training set and a test set in proportion; and S2: performing image enhancement on the original surveillance video image training set to obtain enhanced images, and converting the enhanced images into sequence data. The pedestrian re-identification technology based on local feature attention uses a multi-head attention neural network to capture and extract video image feature sequences, replacing the convolution kernels of a convolutional neural network; uses fully connected layers and an activation function to combine local pedestrian feature sequences into complete pedestrian feature sequences through a weight matrix; performs prediction on the obtained pedestrian feature sequences; outputs position coordinates of pedestrians in the images; and selects the pedestrians to realize pedestrian re-identification.
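As a rough illustration of the extraction-and-combination pipeline described above, the sketch below uses PyTorch's stock multi-head attention over patch sequences in place of convolution kernels; the module names, dimensions, and the coordinate head are illustrative assumptions, not the patented architecture:

```python
import torch
import torch.nn as nn

class LocalFeatureAttention(nn.Module):
    """Multi-head attention over patch sequences from an enhanced video frame."""
    def __init__(self, patch_dim=768, num_heads=8, num_coords=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(patch_dim, num_heads, batch_first=True)
        # Fully connected layer + activation combine local pedestrian feature
        # sequences into complete ones through a learned weight matrix.
        self.combine = nn.Sequential(nn.Linear(patch_dim, patch_dim), nn.GELU())
        self.head = nn.Linear(patch_dim, num_coords)   # pedestrian position output

    def forward(self, patch_seq):                      # (batch, patches, patch_dim)
        local, _ = self.attn(patch_seq, patch_seq, patch_seq)
        fused = self.combine(local).mean(dim=1)        # pool local -> whole pedestrian
        return self.head(fused)                        # predicted position coordinates
```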
-
Publication number: US11669741B2
Publication date: 2023-06-06
Application number: US17674859
Filing date: 2022-02-18
Applicant: ZHEJIANG LAB
Inventor: Hongsheng Wang , Haijun Shan , Shengjian Hu
Abstract: Disclosed are a meta-knowledge fine-tuning method and platform based on domain-invariant features. According to the method, highly transferable common knowledge, i.e., domain-invariant features, is learnt from different data sets of the same kind of tasks, and the domain-invariant features learnt across the different domains corresponding to these data sets are fine-tuned so that the model quickly adapts to any new domain. The present application improves the parameter initialization ability and generalization ability of the universal language model for the same kind of tasks, and finally a common compression framework of the universal language model for the same kind of downstream tasks is obtained through fine-tuning. In the meta-knowledge fine-tuning network, the present application designs a loss function for the domain-invariant features, through which domain-independent universal knowledge is learned.
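The abstract does not spell out the domain-invariant loss; one common way to encourage domain invariance is to penalize the statistical gap between features of the same task drawn from two domains. The sketch below is a minimal illustration under that assumption (a moment-matching penalty), not the patent's exact loss function:

```python
import torch

def domain_invariant_loss(feats_a: torch.Tensor, feats_b: torch.Tensor):
    """Match first and second feature moments across two domains."""
    mean_gap = (feats_a.mean(0) - feats_b.mean(0)).pow(2).sum()
    var_gap = (feats_a.var(0) - feats_b.var(0)).pow(2).sum()
    return mean_gap + var_gap

def total_loss(task_loss_a, task_loss_b, feats_a, feats_b, lam=0.1):
    # Task losses keep per-domain accuracy; the penalty keeps features transferable.
    return task_loss_a + task_loss_b + lam * domain_invariant_loss(feats_a, feats_b)
```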
-
Publication number: US11354499B2
Publication date: 2022-06-07
Application number: US17531813
Filing date: 2021-11-22
Applicant: ZHEJIANG LAB
Inventor: Hongsheng Wang , Haijun Shan , Shengjian Hu
Abstract: Disclosed are a meta-knowledge fine-tuning method and platform for a multi-task language model. The method obtains highly transferable shared knowledge, that is, meta-knowledge, from different data sets of tasks of the same category, and makes the learning processes of the tasks of the same category, which correspond to different data sets in different domains, interrelate and mutually reinforce one another, so as to improve the fine-tuning effect of downstream tasks of the same category on data sets of different domains when the language model is applied, and to improve the parameter initialization ability and the generalization ability of a general language model for tasks of the same category.
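One simple way to realize this interrelation and mutual reinforcement across data sets is to fine-tune a single shared model with alternating gradient steps over the domains, so every update benefits all of them. The round-robin loop below is an illustrative assumption about such a training scheme, not the platform's actual scheduler:

```python
import torch

def meta_fine_tune(model, domain_loaders, loss_fn, steps, lr=2e-5):
    """Alternate gradient steps over same-category data sets from different domains."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    iters = [iter(dl) for dl in domain_loaders]
    for step in range(steps):
        i = step % len(iters)               # round-robin over domains
        try:
            batch, labels = next(iters[i])
        except StopIteration:               # restart an exhausted domain loader
            iters[i] = iter(domain_loaders[i])
            batch, labels = next(iters[i])
        opt.zero_grad()
        loss_fn(model(batch), labels).backward()
        opt.step()
    return model
```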
-
Publication number: US11805025B1
Publication date: 2023-10-31
Application number: US17848048
Filing date: 2022-06-23
Applicant: ZHEJIANG LAB
Inventor: Hongsheng Wang , Shuibing He , Hujun Bao , Guang Chen
CPC classification number: H04L41/145 , H04L41/16 , H04L45/44
Abstract: The present disclosure provides a neural network computing-oriented modeling method and apparatus for distributed data routing. The method includes the following steps: S1, designing the distributed attributes of a physical tensor: abstracting the mapping relationship between a logic tensor and the physical tensor into three distributed attributes, namely a broadcast attribute, a scatter attribute and a local reduction attribute; S2, deducing the distributed attribute of an output tensor: specifying the distributed attribute of an input tensor, and then deducing the legal distributed attribute of the output tensor from the known distributed attribute of the input tensor; and S3, judging, according to the distributed attributes, whether an intermediate communication primitive needs to be inserted to obtain the distributed attribute of a local physical tensor. The method lowers the difficulty of distributed design and development and promotes the application of large deep neural network models.
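The three attributes resemble SBP-style signatures (broadcast, split along an axis, partial reduction) used in distributed deep learning frameworks. The sketch below illustrates S2 and S3 for a single matrix multiplication under that reading; the deduction rules and primitive names are assumptions for illustration, not the patent's full rule set:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Attr:
    kind: str            # "B" broadcast, "S" scatter/split, "P" local reduction
    axis: int = -1       # split axis, meaningful only for "S"

def deduce_matmul(lhs: Attr, rhs: Attr) -> Attr:
    """S2: deduce the legal output attribute of out = lhs @ rhs."""
    if lhs == Attr("S", 0) and rhs.kind == "B":
        return Attr("S", 0)      # row-split x broadcast -> row-split
    if lhs.kind == "B" and rhs == Attr("S", 1):
        return Attr("S", 1)      # broadcast x column-split -> column-split
    if lhs == Attr("S", 1) and rhs == Attr("S", 0):
        return Attr("P")         # inner-dimension split -> local partial sums
    raise ValueError("illegal combination of input attributes")

def communication_needed(produced: Attr, consumed: Attr) -> Optional[str]:
    """S3: choose an intermediate primitive when attributes mismatch."""
    if produced == consumed:
        return None
    if produced.kind == "P" and consumed.kind == "B":
        return "all-reduce"      # sum partial results onto every device
    if produced.kind == "S" and consumed.kind == "B":
        return "all-gather"      # assemble the full tensor on every device
    return "all-to-all"          # e.g. re-split along a different axis
```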
-
Publication number: US11714995B2
Publication date: 2023-08-01
Application number: US17739205
Filing date: 2022-05-09
Applicant: ZHEJIANG LAB
Inventor: Hongsheng Wang , Hujun Bao , Wei Hua , Weiqiang Jia
CPC classification number: G06N3/0454 , G06F8/36 , G06F9/4881 , G06F9/545
Abstract: Disclosed are a distributed training adaptation method and apparatus for a deep learning framework and an AI accelerator card. The method includes the following steps: S1: the deep learning framework supports single-card configuration of a newly added AI accelerator card, with the following sub-steps: S11: the deep learning framework supports the new hardware; S12: the deep learning framework supports a device thread of the new hardware; S13: the deep learning framework supports memory operations of the new hardware; and S14: the deep learning framework supports operator kernel functions of the new hardware; S2: the deep learning framework supports multi-card configuration of the newly added AI accelerator card; S3: the deep learning framework supports tensor segmentation and multi-card distribution; and S4: the deep learning framework supports multi-card collective communication on the newly added AI accelerator card.
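Steps S11-S14 amount to implementing a device backend the framework can dispatch to. The abstract interface below is a minimal sketch of what such a plugin surface might look like; the method names and the registry are illustrative assumptions, not the framework's real API:

```python
from abc import ABC, abstractmethod

class DeviceBackend(ABC):
    """Per-device plugin surface for a newly added AI accelerator card."""

    @abstractmethod
    def create_stream(self):                   # S12: device thread / execution stream
        ...

    @abstractmethod
    def malloc(self, nbytes: int):             # S13: device memory operations
        ...

    @abstractmethod
    def free(self, handle):
        ...

    @abstractmethod
    def launch_kernel(self, op_name: str, stream, *tensors):  # S14: operator kernels
        ...

_BACKENDS = {}                                 # S11: registry that makes the framework
                                               # aware of the new hardware type
def register_backend(name: str, backend: DeviceBackend):
    _BACKENDS[name] = backend
```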
-
Publication number: US11501171B2
Publication date: 2022-11-15
Application number: US17555535
Filing date: 2021-12-20
Applicant: ZHEJIANG LAB
Inventor: Hongsheng Wang , Enping Wang , Zailiang Yu
Abstract: Disclosed are an automatic compression method and platform for a pre-trained language model based on multilevel knowledge distillation. The method includes the following steps: step 1, constructing multilevel knowledge distillation and distilling the knowledge structure of a large model at three different levels: the self-attention unit, the hidden layer state and the embedding layer; step 2, training a meta-learning knowledge distillation network to generate a general compression architecture for a plurality of pre-trained language models; and step 3, searching for the optimal compression structure based on an evolutionary algorithm. Firstly, knowledge distillation based on meta-learning is studied to generate the general compression architecture for the plurality of pre-trained language models; secondly, on the basis of the trained meta-learning network, the optimal compression structure is searched for via the evolutionary algorithm, so as to obtain an optimal task-independent general compression architecture for the pre-trained language model.
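A minimal sketch of the step-1 objective, assuming the three levels are aligned with mean-squared error between matched teacher and student tensors; the layer matching and equal loss weights are illustrative assumptions:

```python
import torch.nn.functional as F

def multilevel_kd_loss(teacher: dict, student: dict):
    """teacher/student hold matched tensors collected during a forward pass."""
    attn_loss = F.mse_loss(student["attn"], teacher["attn"])        # self-attention units
    hidden_loss = F.mse_loss(student["hidden"], teacher["hidden"])  # hidden layer states
    embed_loss = F.mse_loss(student["embed"], teacher["embed"])     # embedding layer
    return attn_loss + hidden_loss + embed_loss
```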
-
Publication number: US11341326B2
Publication date: 2022-05-24
Application number: US17483805
Filing date: 2021-09-24
Applicant: ZHEJIANG LAB
Inventor: Hongsheng Wang , Haijun Shan , Fei Yang
Abstract: Provided are a method and a platform for compressing a pre-trained language model based on knowledge distillation. According to the method, a universal knowledge distillation strategy of feature migration is first designed: in the process of knowledge distillation from the teacher model to the student model, the feature mapping of each layer of the student model approaches the teacher's features, focusing on the feature-expressing ability of the teacher model's intermediate layers under small-sample conditions and using these features to guide the student model; then, a knowledge distillation method based on self-attention cross is constructed; finally, a linear transfer strategy based on the Bernoulli probability distribution is designed to gradually complete the knowledge transfer of feature mapping and self-attention distribution from the teacher model to the student model.
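A minimal sketch of the linear transfer idea: at training step t, each layer's distillation term is switched on with a Bernoulli probability that grows linearly over training, so the transfer of feature mapping and self-attention distribution completes gradually. The schedule endpoints are illustrative assumptions:

```python
import torch

def transfer_mask(num_layers: int, step: int, total_steps: int) -> torch.Tensor:
    p = min(1.0, step / max(1, total_steps))    # linearly increasing probability
    return torch.bernoulli(torch.full((num_layers,), p))  # 1 = distill this layer now

def distillation_loss(per_layer_losses: torch.Tensor, step: int, total_steps: int):
    mask = transfer_mask(per_layer_losses.numel(), step, total_steps)
    return (mask * per_layer_losses).sum()      # only sampled layers transfer knowledge
```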
-
Publication number: US12039361B1
Publication date: 2024-07-16
Application number: US18494002
Filing date: 2023-10-25
Applicant: ZHEJIANG LAB
Inventor: Hongsheng Wang , Guang Chen , Fei Wu , Feng Lin
IPC: G06F9/48
CPC classification number: G06F9/48
Abstract: The present disclosure discloses a method for executing a task. The method includes: a master computing device node in a computing cluster system receives the task code of a to-be-executed task; the master computing device node divides the to-be-executed task into subtasks and, for each subtask, determines the operators required to execute it based on the task code; the master computing device node distributes the subtasks to the computing nodes in the computing cluster system, such that each computing node generates its own executable task subgraph based on the operators required by the subtask distributed to it and the data transmission relationships between those operators, and runs the executable task subgraph to execute the to-be-executed task.
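The dispatch flow reads as follows: the master node parses the task code, derives per-subtask operator sets, and hands each subtask to a computing node, which wires the operators into an executable subgraph. The sketch below is an illustrative rendering of that flow; the splitter, the subgraph representation, and all names are assumptions, not the patented system:

```python
from dataclasses import dataclass, field

@dataclass
class Subtask:
    operators: list                              # operators required by this subtask
    edges: list = field(default_factory=list)    # data transmission relationships

def split_into_subtasks(task_code: str):
    """Placeholder splitter: one subtask per non-empty line (assumption)."""
    return [Subtask(operators=[ln.strip()])
            for ln in task_code.splitlines() if ln.strip()]

class ComputingNode:
    def submit(self, subtask: Subtask):
        graph = self.build_subgraph(subtask)     # executable task subgraph
        self.run(graph)

    def build_subgraph(self, subtask: Subtask):
        # wire operators together along the declared transmission relationships
        return {"ops": subtask.operators, "edges": subtask.edges}

    def run(self, graph):
        for op in graph["ops"]:                  # execute in dependency order
            pass

def master_dispatch(task_code: str, nodes: list):
    for subtask, node in zip(split_into_subtasks(task_code), nodes):
        node.submit(subtask)                     # one subtask per computing node
```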
-
Publication number: US11941514B2
Publication date: 2024-03-26
Application number: US17706734
Filing date: 2022-03-29
Applicant: ZHEJIANG LAB
Inventor: Hongsheng Wang , Hujun Bao , Guang Chen , Lingfang Zeng , Hongcai Cheng , Yong Li , Jian Zhu , Huanbo Zheng
Abstract: The present disclosure discloses a method and apparatus for execution of a computational graph in a neural network model, including: creating task execution bodies on a native machine according to the physical computational graph compiled and generated by a deep learning framework, and designing a scheme for allocating a plurality of idle memory blocks to each task execution body, so that the entire computational graph participates in deep learning training tasks on different batches of data in a pipelined and parallel manner.
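A minimal sketch of the idle-memory-block scheme: each task execution body owns a small pool of reusable blocks, so a new batch can enter the pipeline as soon as a block frees up while downstream bodies still process earlier batches. The pool size and queue-based hand-off are illustrative assumptions:

```python
import queue

class ExecutionBody:
    """One task execution body with a pool of pre-allocated idle memory blocks."""
    def __init__(self, kernel, num_blocks=2, block_size=1024):
        self.kernel = kernel
        self.idle_blocks = queue.Queue()
        for _ in range(num_blocks):              # allocate the idle blocks up front
            self.idle_blocks.put(bytearray(block_size))
        self.outbox = queue.Queue()              # hand-off to the downstream body

    def process(self, batch):
        block = self.idle_blocks.get()           # waits until a block is free
        out = self.kernel(batch, block)          # compute the batch into the block
        self.outbox.put((out, block))            # downstream recycles the block later

    def recycle(self, block):
        self.idle_blocks.put(block)              # block becomes idle for the next batch
```

With two or more blocks per body, consecutive batches overlap: batch k+1 starts while batch k's block is still travelling through downstream bodies, which yields the pipelined, parallel execution described above.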
-