-
1.
Publication No.: US20240127027A1
Publication Date: 2024-04-18
Application No.: US17992814
Filing Date: 2022-11-22
Applicant: ZHEJIANG LAB
Inventor: Hongsheng WANG , Shuibing HE , Guang CHEN
IPC: G06N3/04
CPC classification number: G06N3/04
Abstract: Disclosed are an optimization method and apparatus for compiling a computation graph. The optimization method includes the following steps: step S1: converting the computation graph into an intermediate representation; step S2: analyzing the dependency relationships among nodes; step S3: constructing a work stack; step S4: performing initialization so that all nodes are in a non-activated state; step S5: popping the stack-top node element and updating the input node set in the current round of iteration; step S6: pushing the node elements that depend on the node processed in step S5 onto the top of the stack in sequence, until the work stack is empty; step S7: representing the intermediate representation in the fixed node state using bit vectors; and step S8: allocating registers for the effective tensor variables contained in the nodes of the intermediate representation in the fixed node state.
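The abstract describes a classic work-stack-to-fixed-point pattern: nodes are drained from a work stack, each node's state is tracked with a bit vector, and dependents are re-pushed until nothing changes, after which registers are allocated from the stable result. The sketch below is only an illustration of that pattern under assumed data structures; the `Node` class, `to_bits`, and `liveness_fixed_point` are hypothetical names, and a backward liveness analysis is used as a stand-in for the analysis claimed in the patent.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Node:
    name: str
    uses: set                                   # tensor-variable indices read by this node (assumed IR field)
    defs: set                                   # tensor-variable indices written by this node (assumed IR field)
    succs: list = field(default_factory=list)   # downstream nodes in the graph
    preds: list = field(default_factory=list)   # upstream nodes in the graph

def to_bits(indices):
    """Encode a set of variable indices as a bit vector (a Python int)."""
    bits = 0
    for i in indices:
        bits |= 1 << i
    return bits

def liveness_fixed_point(nodes):
    """Work-stack-driven backward liveness: iterate until no node's IN/OUT
    bit vector changes, i.e. until a fixed node state is reached."""
    live_in = {n: 0 for n in nodes}
    live_out = {n: 0 for n in nodes}
    stack = list(nodes)              # work stack (cf. steps S3/S4): all nodes start non-activated
    on_stack = set(nodes)
    while stack:                     # cf. steps S5/S6: pop, update, re-push dependents
        n = stack.pop()
        on_stack.discard(n)
        out_bits = 0
        for s in n.succs:
            out_bits |= live_in[s]
        in_bits = to_bits(n.uses) | (out_bits & ~to_bits(n.defs))
        if in_bits != live_in[n] or out_bits != live_out[n]:
            live_in[n], live_out[n] = in_bits, out_bits
            for p in n.preds:        # nodes whose result depends on n must be revisited
                if p not in on_stack:
                    stack.append(p)
                    on_stack.add(p)
    return live_in, live_out         # cf. step S7: stable bit vectors per node

# Usage example: node b consumes the tensor that node a defines; a register
# allocator (cf. step S8) would then work on the stable live sets computed here.
a = Node("a", uses=set(), defs={0})
b = Node("b", uses={0}, defs={1})
a.succs, b.preds = [b], [a]
live_in, live_out = liveness_fixed_point([a, b])
print(bin(live_out[a]), bin(live_in[b]))   # both show bit 0 set
```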
-
2.
Publication No.: US20230353458A1
Publication Date: 2023-11-02
Application No.: US17848048
Filing Date: 2022-06-23
Applicant: ZHEJIANG LAB
Inventor: Hongsheng WANG , Shuibing HE , Hujun BAO , Guang CHEN
CPC classification number: H04L41/145 , H04L41/16 , H04L45/44
Abstract: The present disclosure provides a neural-network-computing-oriented modeling method and apparatus for distributed data routing. The method includes the following steps: S1, designing the distributed attributes of a physical tensor: abstracting the mapping relationship between a logical tensor and the physical tensor into three distributed attributes, namely a broadcast attribute, a scatter attribute and a local reduction attribute; S2, deducing the distributed attribute of an output tensor: specifying the distributed attributes of the input tensors, and then deducing the legal distributed attribute of the output tensor from the known distributed attributes of the inputs; and S3, judging, according to the resulting distributed attributes, whether an intermediate communication primitive needs to be inserted to obtain the distributed attribute of the local physical tensor. The method lowers the difficulty of distributed design and development and promotes the application of large deep neural network models.
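The three distributed attributes in step S1 correspond closely to the broadcast / split / partial-reduction annotations common in distributed deep-learning frameworks. The sketch below only illustrates that idea under assumed names (`Broadcast`, `Split`, `PartialSum`, `deduce_matmul_output`, and `needs_communication` are all hypothetical), using a few representative matrix-multiplication rules; the actual deduction rules and primitive selection claimed in the patent may differ.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Broadcast:
    """Every device holds a full copy of the logical tensor."""

@dataclass(frozen=True)
class Split:
    """The logical tensor is scattered across devices along `axis`."""
    axis: int

@dataclass(frozen=True)
class PartialSum:
    """Each device holds a partial result; reducing them yields the logical tensor."""

def deduce_matmul_output(attr_a, attr_b):
    """Deduce the output attribute of Y = A @ B from the input attributes
    (a few representative rules; a real system enumerates all legal cases)."""
    if isinstance(attr_a, Split) and attr_a.axis == 0 and isinstance(attr_b, Broadcast):
        return Split(axis=0)     # row-split A with replicated B -> row-split Y
    if (isinstance(attr_a, Split) and attr_a.axis == 1
            and isinstance(attr_b, Split) and attr_b.axis == 0):
        return PartialSum()      # split along the contraction dim -> partial sums of Y
    if isinstance(attr_a, Broadcast) and isinstance(attr_b, Broadcast):
        return Broadcast()
    raise ValueError("no legal output attribute for these inputs")

def needs_communication(produced, required):
    """Cf. step S3 of the abstract: if the produced attribute differs from the
    attribute the consumer requires, an intermediate primitive is inserted."""
    if produced == required:
        return None
    if isinstance(produced, PartialSum) and isinstance(required, Broadcast):
        return "all-reduce"
    if isinstance(produced, Split) and isinstance(required, Broadcast):
        return "all-gather"
    return "generic redistribution"

# Example: A split by rows and B broadcast yield a row-split Y; a consumer that
# requires a full (broadcast) copy of Y then triggers an all-gather.
y_attr = deduce_matmul_output(Split(axis=0), Broadcast())
print(y_attr, needs_communication(y_attr, Broadcast()))
```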
-