1.
Publication Number: US20240118897A1
Publication Date: 2024-04-11
Application Number: US18071978
Application Date: 2022-11-30
Applicant: ZHEJIANG LAB
Inventor: Hongsheng WANG , Guang CHEN , Lingfang ZENG , Aimin PAN
IPC: G06F9/38
CPC classification number: G06F9/3838 , G06F9/3885
Abstract: Disclosed are an instruction execution method and apparatus for graph computation. The method includes the following steps: S1: sending the operators of each node in a computational graph used for neural network computation to an operator interpreter; S2: building, by the operator interpreter, instructions at runtime; S3: defining an instruction dependency relationship; S4: building an instruction dependency relationship graph; S5: building a topological order of the parallel instructions; S6: scheduling the parallel instructions to hardware resources; S7: building the shortest schedule for the parallel instructions, i.e., the shortest time required to execute them under limited hardware resources; and S8: releasing the completed instructions.
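As an illustration of steps S3-S8, here is a minimal Python sketch (not the patented implementation) of greedy list scheduling over an instruction dependency graph; the function name, the duration/edge inputs, and the fixed number of hardware units are all hypothetical:

```python
from collections import defaultdict, deque

def schedule_instructions(durations, deps, num_units):
    """Return a makespan estimate for scheduling dependent
    instructions onto a limited number of hardware units."""
    # S3/S4: build the instruction dependency relationship graph.
    succ = defaultdict(list)
    indeg = {i: 0 for i in durations}
    for producer, consumer in deps:
        succ[producer].append(consumer)
        indeg[consumer] += 1
    # S5: topological frontier of instructions whose inputs are ready.
    ready = deque(i for i in durations if indeg[i] == 0)
    unit_free = [0.0] * num_units  # time at which each unit frees up
    finish = {}
    while ready:
        instr = ready.popleft()
        # S6: start on the earliest free unit, after all producers finish.
        unit = unit_free.index(min(unit_free))
        start = max(unit_free[unit],
                    max((finish[p] for p, c in deps if c == instr),
                        default=0.0))
        finish[instr] = start + durations[instr]
        unit_free[unit] = finish[instr]
        # S8: releasing an instruction may make its consumers ready.
        for nxt in succ[instr]:
            indeg[nxt] -= 1
            if indeg[nxt] == 0:
                ready.append(nxt)
    # S7: the makespan approximates the shortest schedule length.
    return max(finish.values(), default=0.0)
```

For example, schedule_instructions({"a": 2, "b": 3, "c": 1}, [("a", "c"), ("b", "c")], num_units=2) runs a and b in parallel and returns 4.0, since c can only start once both of its producers have finished.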
2.
Publication Number: US20250124347A1
Publication Date: 2025-04-17
Application Number: US18804240
Application Date: 2024-08-14
Applicant: ZHEJIANG LAB
Inventor: Fei YANG , Shuang PENG , Ning SUN , Fangyu WANG , Aimin PAN
IPC: G06N20/00
Abstract: A training model allocation method, an apparatus, a computer device, and a storage medium are provided. The method includes: acquiring the hierarchy information, computation parameter information, and training data set of a model to be trained; dividing the model into sub-models according to the hierarchy information, and allocating each sub-model to a machine node in a training cluster; dividing each sub-model into sub-model slices according to the computation parameter information, and allocating each slice to a computing processor of the machine nodes in the training cluster; dividing the training data set into training data subsets according to the computation parameter information, and allocating each subset to a computing processor in the training cluster; and training the model across all the computing processors, sub-model slices, and training data subsets in the training cluster.
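To make the three-level split concrete, the following toy Python sketch (an assumption-laden illustration, not the claimed method) maps layer groups to machine nodes, parameter shards to processors, and data shards to those same processors; layers, nodes, procs_per_node, and samples are invented stand-ins for the hierarchy information, training cluster, computing processors, and training data set:

```python
def allocate_model(layers, nodes, procs_per_node, samples):
    """Toy three-level allocation: pipeline, tensor, and data split."""
    # Sub-models: one contiguous group of layers per machine node.
    per_node = -(-len(layers) // len(nodes))  # ceiling division
    sub_models = {node: layers[i * per_node:(i + 1) * per_node]
                  for i, node in enumerate(nodes)}
    # Sub-model slices: processor p owns shard p of each of its
    # node's layers, recorded here as (layer, shard, num_shards).
    slices = {(node, p): [(layer, p, procs_per_node)
                          for layer in sub_models[node]]
              for node in nodes for p in range(procs_per_node)}
    # Training data subsets: round-robin over every processor.
    world = [(node, p) for node in nodes for p in range(procs_per_node)]
    subsets = {proc: samples[i::len(world)] for i, proc in enumerate(world)}
    return sub_models, slices, subsets
```

With 8 layers, 2 nodes, and 2 processors per node, each node would hold a 4-layer sub-model, each processor one of two parameter shards of those layers, and each processor a quarter of the training samples.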
3.
Publication Number: US20240111586A1
Publication Date: 2024-04-04
Application Number: US18472648
Application Date: 2023-09-22
Applicant: ZHEJIANG LAB
Inventor: Shiqiang ZHU , Aimin PAN , Feng GAO
IPC: G06F9/50
CPC classification number: G06F9/5027
Abstract: The present disclosure belongs to the field of intelligent computing technologies and relates to a multi-policy intelligent scheduling method and apparatus oriented to heterogeneous computing power. The method includes: step 1, setting an execution policy for a task based on the heterogeneity of the computing clusters, the differences among computing tasks, and the user requirements, and constructing a Markov decision process model by combining a reinforcement learning method with the execution policy; step 2, adopting proximal policy optimization to solve for the optimal scheduling policy of the task input by the user, based on the constructed Markov decision process model; and step 3, scheduling the task to the corresponding computing cluster for execution based on the optimal task scheduling policy.
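As a concrete reading of step 1, the hypothetical Python environment below frames scheduling as a Markov decision process; the state, action, and reward definitions are illustrative assumptions rather than the patent's formulation, and step 2 would amount to fitting any standard PPO agent to its reset/step interface:

```python
import random

class SchedulingMDP:
    """Hypothetical MDP: assign each arriving task to one of several
    heterogeneous clusters. State = cluster loads plus the pending
    task's demand; action = cluster index; reward = negative
    completion time, so maximizing reward minimizes latency."""

    def __init__(self, cluster_speeds):
        self.speeds = cluster_speeds   # heterogeneity of the clusters
        self.loads = [0.0] * len(cluster_speeds)
        self.task = 0.0

    def reset(self):
        self.loads = [0.0] * len(self.speeds)
        self.task = random.uniform(1.0, 10.0)  # pending task's demand
        return self.loads + [self.task]

    def step(self, action):
        # The task queues behind existing work, then runs at the
        # chosen cluster's speed (step 3: execute on that cluster).
        done_at = self.loads[action] + self.task / self.speeds[action]
        self.loads[action] = done_at
        reward = -done_at              # earlier completion is better
        self.task = random.uniform(1.0, 10.0)
        return self.loads + [self.task], reward
```

A PPO learner trained against this environment would tend to route large tasks to fast, lightly loaded clusters, which is the qualitative behavior an optimal task scheduling policy should exhibit.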
4.
Publication Number: US20240104016A1
Publication Date: 2024-03-28
Application Number: US18071958
Application Date: 2022-11-30
Applicant: ZHEJIANG LAB
Inventor: Hongsheng WANG , Aimin PAN , Guang CHEN
IPC: G06F12/0802 , G06N3/063
CPC classification number: G06F12/0802 , G06N3/063
Abstract: Disclosed is an intermediate representation method for compiling computation graphs, including: step 1: compiling a neural network into a computation graph for neural network computation; step 2: constructing a node for each tensor variable in the computation graph; step 3: associating the node representing a tensor variable with a set of pointers to that tensor variable; step 4: analyzing the constraint relationships between the tensor variables in the computation graph; step 5: iteratively constructing a topological graph of the intermediate representation based on those constraint relationships; and step 6: analyzing, based on the intermediate representation, the tensor variables whose different aliases point to the same memory location, and allocating a register for those aliased tensor variables. The method optimizes the compilation efficiency for tensor variables that point to the same memory location in the computation graph.
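The constraint-based analysis in steps 2-6 is reminiscent of a classic Andersen-style points-to analysis; the sketch below (with invented inputs addr_of and assigns, not the disclosed algorithm) iterates subset constraints to a fixed point and then groups aliased tensor variables that could share a register:

```python
from collections import defaultdict

def alias_groups(addr_of, assigns):
    """Group tensor variables whose pointers reach the same
    memory location, so each group can share one register."""
    # Steps 2/3: one points-to set per tensor-variable node,
    # seeded with facts such as "t1 points to buf0".
    pts = defaultdict(set)
    for var, loc in addr_of.items():
        pts[var].add(loc)
    # Steps 4/5: iterate the subset constraints pts(dst) >= pts(src)
    # until no set grows (fixed point).
    changed = True
    while changed:
        changed = False
        for dst, src in assigns:
            grown = pts[src] - pts[dst]
            if grown:
                pts[dst] |= grown
                changed = True
    # Step 6: variables reaching a common location are aliases.
    groups = defaultdict(list)
    for var in sorted(pts):
        for loc in pts[var]:
            groups[loc].append(var)
    return dict(groups)
```

For instance, alias_groups({"t1": "buf0", "t2": "buf0"}, [("t3", "t1")]) reports that t1, t2, and t3 all reach buf0, so a single register suffices for the three aliases.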