OPTIMIZATION METHOD AND APPARATUS FOR COMPILING COMPUTATION GRAPH

    Publication No.: US20240127027A1

    Publication date: 2024-04-18

    Application No.: US17992814

    Filing date: 2022-11-22

    Applicant: ZHEJIANG LAB

    CPC classification number: G06N3/04

    Abstract: Disclosed are an optimization method and apparatus for compiling a computation graph. The optimization method includes the following steps: step S1: converting a computation graph into an intermediate representation; step S2: analyzing a dependency relationship; step S3: constructing a work stack; step S4: performing initialization to a nonactivated state; step S5: popping the stack top node element, and updating the input node set in the current round of iteration; step S6: pushing the node elements that depend on the node popped in step S5 onto the stack top in sequence, until the work stack is empty; step S7: implementing, using a bit vector, the intermediate representation that has reached a fixed node state; and step S8: allocating registers for the effective tensor variables contained in the nodes of the intermediate representation in the fixed node state.
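
Steps S3 to S7 read like a work-stack fixed-point iteration over bit vectors. The following is a minimal Python sketch of that general technique, not the patented method itself. It assumes a classic liveness analysis (`live_in = use | (live_out & ~defs)`) as the per-node update rule; the function name `liveness`, the four-node graph, and the bit assignments for variables `a`, `b`, `c` are all hypothetical. Register allocation (step S8) is not shown.

```python
from typing import Dict, List

def liveness(nodes: List[str],
             succs: Dict[str, List[str]],
             use: Dict[str, int],
             defs: Dict[str, int]) -> Dict[str, int]:
    """Iterate to a fixed point with a work stack; each set is an int bit vector."""
    # Invert the successor map so changed nodes can re-queue their predecessors.
    preds: Dict[str, List[str]] = {n: [] for n in nodes}
    for n in nodes:
        for s in succs[n]:
            preds[s].append(n)

    live_in = {n: 0 for n in nodes}   # initial state: empty bit vector for every node
    stack = list(nodes)               # work stack; the last element is the stack top

    while stack:
        n = stack.pop()               # pop the stack top node
        live_out = 0
        for s in succs[n]:
            live_out |= live_in[s]
        new_in = use[n] | (live_out & ~defs[n])
        if new_in != live_in[n]:
            live_in[n] = new_in
            # Push the nodes whose result depends on n back onto the stack.
            for p in preds[n]:
                if p not in stack:
                    stack.append(p)
    return live_in

# Hypothetical IR: n0: a = input; n1: b = f(a); n2: c = g(a, b); n3: return c
A, B, C = 0b001, 0b010, 0b100
nodes = ["n0", "n1", "n2", "n3"]
succs = {"n0": ["n1"], "n1": ["n2"], "n2": ["n3"], "n3": []}
use = {"n0": 0, "n1": A, "n2": A | B, "n3": C}
defs = {"n0": A, "n1": B, "n2": C, "n3": 0}

result = liveness(nodes, succs, use, defs)
```

Here `result["n2"]` has the bits for `a` and `b` set, meaning both tensors are still needed when `n2` runs. A register allocator could use such vectors to decide which tensor variables may share a register.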

    NEURAL NETWORK COMPUTING-ORIENTED MODELING METHOD AND APPARATUS FOR DISTRIBUTED DATA ROUTING

    Publication No.: US20230353458A1

    Publication date: 2023-11-02

    Application No.: US17848048

    Filing date: 2022-06-23

    Applicant: ZHEJIANG LAB

    CPC classification number: H04L41/145 H04L41/16 H04L45/44

    Abstract: The present disclosure provides a neural network computing-oriented modeling method and apparatus for distributed data routing. The method includes the following steps: S1, designing the distributed attribute of a physical tensor: abstracting a mapping relationship between a logic tensor and the physical tensor into three distributed attributes including a broadcast attribute, a scatter attribute and a local reduction attribute; S2, deducing the distributed attribute of an output tensor: specifying the distributed attribute of an input tensor, and then deducing the legal distributed attribute of the output tensor according to the known distributed attribute of the input tensor; and S3, judging, according to the distributed attributes, whether an intermediate communication primitive needs to be inserted to obtain the distributed attribute of a local physical tensor. The method lowers the difficulty of distributed design and development and promotes the application of large deep neural network models.
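
The three steps above can be illustrated with a small Python sketch. It is an assumption-laden illustration, not the disclosed apparatus. The rule table for a matrix multiplication `Y = X @ W`, the attribute encodings `B`, `P`, `S(axis)`, and the function names `deduce_matmul` and `required_primitive` are hypothetical. The primitive names (`all_reduce`, `all_gather`, `reduce_scatter`, `all_to_all`) are the usual collective-communication terms and do not appear in the abstract.

```python
from typing import Optional, Tuple

Attr = Tuple  # ("broadcast",), ("partial_sum",) or ("scatter", axis)

B: Attr = ("broadcast",)     # every device holds the full logical tensor
P: Attr = ("partial_sum",)   # local reduction: the logical tensor is the sum of the parts

def S(axis: int) -> Attr:
    """Scatter: each device holds one slice of the logical tensor along `axis`."""
    return ("scatter", axis)

# S2 (assumed rules): legal output attribute of Y = X @ W for given input attributes.
MATMUL_RULES = {
    (S(0), B): S(0),     # rows of X split          -> rows of Y split
    (B, S(1)): S(1),     # columns of W split       -> columns of Y split
    (S(1), S(0)): P,     # contraction axis split   -> each device holds a partial sum
    (B, B): B,           # fully replicated inputs  -> replicated output
}

def deduce_matmul(x_attr: Attr, w_attr: Attr) -> Optional[Attr]:
    """Return the deduced output attribute, or None if the combination is not legal."""
    return MATMUL_RULES.get((x_attr, w_attr))

def required_primitive(produced: Attr, wanted: Attr) -> Optional[str]:
    """S3: name the communication primitive needed to convert attributes, if any."""
    if produced == wanted:
        return None
    if produced == P:
        return "all_reduce" if wanted == B else "reduce_scatter"
    if produced[0] == "scatter" and wanted == B:
        return "all_gather"
    if produced == B and wanted[0] == "scatter":
        return None          # each device can slice its own full copy locally
    if produced[0] == "scatter" and wanted[0] == "scatter":
        return "all_to_all"  # re-split along a different axis
    raise ValueError(f"unsupported conversion: {produced} -> {wanted}")

y_attr = deduce_matmul(S(1), S(0))        # contraction axis is split across devices
primitive = required_primitive(y_attr, B)  # the consumer wants a full copy everywhere
```

In this example the deduced output is a partial sum, so a consumer that expects a broadcast tensor would need a reduction across devices inserted between the two operators.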
