-
Publication No.: US10942771B2
Publication Date: 2021-03-09
Application No.: US16275984
Filing Date: 2019-02-14
Applicant: TuSimple, Inc.
Inventor: Yifan Gong , Siyuan Liu , Dinghua Li , Jiangming Jin , Lei Su , YiXin Yang , Wei Liu , Zehua Huang
IPC: G06F9/48 , G06F16/901 , G06F9/54
Abstract: The present disclosure provides a method, an apparatus and a system for multi-module scheduling, capable of solving at least one of the problems associated with the multi-module scheduling technique in the related art, i.e., inconsistency in the data input to a computing module, and significant delay or low throughput in data transmission between computing modules. The method includes: reading, by a master process, a pre-stored configuration file storing a directed computation graph; initializing, by the master process, the states of the nodes and connecting edges in a current computing period; determining a node to be called based on the computation direction of the directed computation graph and the states of the nodes, the node to be called comprising a node having all of its input edges in a complete state; transmitting, to the computing module in the slave process corresponding to the node to be called, a Remote Procedure Call (RPC) request to execute the computing module; updating the state of the node and the state of each output edge of the node upon receiving a response to the call request; and proceeding with the next computing period after determining that the states of all the nodes in the directed computation graph have been updated.
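The state-driven scheduling loop described in the abstract can be sketched as follows. The graph layout, the state strings, and the synchronous `call_module` stand-in (used here in place of the actual RPC to a computing module in a slave process) are illustrative assumptions, not the patented implementation.

```python
def run_period(graph, call_module):
    """Run one computing period over a directed computation graph.

    graph: {node: {"in": [edge, ...], "out": [edge, ...]}}.
    Edges produced by no node in the graph start in the complete state
    (they carry external input for this period).
    """
    produced = {e for spec in graph.values() for e in spec["out"]}
    edge_state = {e: ("pending" if e in produced else "complete")
                  for spec in graph.values() for e in spec["in"] + spec["out"]}
    node_state = {n: "pending" for n in graph}

    while any(s == "pending" for s in node_state.values()):
        # A node may be called once all of its input edges are complete.
        ready = [n for n, s in node_state.items()
                 if s == "pending"
                 and all(edge_state[e] == "complete" for e in graph[n]["in"])]
        if not ready:
            break  # no callable node: a cycle or a missing external input
        for n in ready:
            call_module(n)              # stand-in for the RPC request
            node_state[n] = "complete"  # updated upon the RPC response
            for e in graph[n]["out"]:
                edge_state[e] = "complete"
    return node_state
```

Once every node's state has been updated to complete, the master process would proceed with the next computing period.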
-
Publication No.: US11055144B2
Publication Date: 2021-07-06
Application No.: US16276084
Filing Date: 2019-02-14
Applicant: TuSimple, Inc.
Inventor: Yifan Gong , Siyuan Liu , Dinghua Li , Jiangming Jin , Lei Su , Yixin Yang , Wei Liu , Zehua Huang
Abstract: The present disclosure provides a method, an apparatus and a system for multi-module scheduling, capable of solving the problem of inconsistency in the data input to a computing module in the multi-module scheduling technique in the related art. The method includes: reading, by a master process, a pre-stored configuration file storing a directed computation graph; initializing, by the master process, the states of all the nodes and connecting edges in the directed computation graph at the start of a current computing period; determining a node to be called based on the computation direction in the directed computation graph and the states of the nodes, the node to be called comprising a node having all of its input edges in a complete state; transmitting, to the computing module in the slave process corresponding to the node to be called, a Remote Procedure Call (RPC) request to execute the computing module; updating the state of the node and the state of each output edge of the node upon receiving a response to the call request; and proceeding with the next computing period upon determining that the states of all the nodes have been updated.
-
Publication No.: US12079722B2
Publication Date: 2024-09-03
Application No.: US18162871
Filing Date: 2023-02-01
Applicant: TuSimple, Inc. , Beijing Tusen Zhitu Technology Co., Ltd.
Inventor: Yuwei Hu , Jiangming Jin , Lei Su , Dinghua Li
CPC classification number: G06N3/08 , G06F12/0207 , G06F17/153 , G06F17/16 , G06N3/045 , G06N3/063 , G06N20/10 , H03M7/30
Abstract: The embodiments of this application provide a method and a device for optimizing a neural network. The method includes: binarizing and bit-packing the input data of a convolution layer along the channel direction to obtain compressed input data; binarizing and bit-packing each convolution kernel of the convolution layer along the channel direction to obtain a corresponding compressed convolution kernel; dividing the compressed input data, sequentially in convolutional computation order, into blocks of the same size as each compressed convolution kernel, wherein the data input to one convolutional computation forms a data block; and performing a convolutional computation on each block of the compressed input data and each compressed convolution kernel sequentially, obtaining convolutional result data, and obtaining multiple output data of the convolution layer from the convolutional result data.
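A minimal sketch of the binarize-and-bit-pack convolution idea, reduced to one dimension for clarity; the packing convention (+1 stored as bit 1, −1 as bit 0), the XOR/popcount identity, and all function names are illustrative assumptions, not the patented implementation. For two {+1, −1} vectors of length n packed this way, the dot product equals n − 2·popcount(a XOR b), so packed words replace n multiply-adds with one XOR and a popcount.

```python
def bit_pack(values):
    """Pack a list of reals into an int: bit i is 1 iff values[i] >= 0."""
    word = 0
    for i, v in enumerate(values):
        if v >= 0:
            word |= 1 << i
    return word

def binary_dot(a_word, b_word, n):
    """Dot product of two packed {+1, -1} vectors of length n."""
    return n - 2 * bin(a_word ^ b_word).count("1")

def binary_conv1d(data, kernel):
    """Slide the packed kernel over the data in convolutional order,
    forming one packed data block per computation step."""
    k = len(kernel)
    k_word = bit_pack(kernel)
    out = []
    for start in range(len(data) - k + 1):
        block = bit_pack(data[start:start + k])
        out.append(binary_dot(block, k_word, k))
    return out
```

In the patented setting the packing runs along the channel direction of a convolution layer, so each machine word holds many channels of one spatial position; the 1-D version above only shows the bit-level arithmetic.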
-
Publication No.: US11580377B2
Publication Date: 2023-02-14
Application No.: US16014869
Filing Date: 2018-06-21
Applicant: TuSimple, Inc. , Beijing Tusen Zhitu Technology Co., Ltd.
Inventor: Yuwei Hu , Jiangming Jin , Lei Su , Dinghua Li
Abstract: The embodiments of this application provide a method and a device for optimizing a neural network. The method includes: binarizing and bit-packing the input data of a convolution layer along the channel direction to obtain compressed input data; binarizing and bit-packing each convolution kernel of the convolution layer along the channel direction to obtain a corresponding compressed convolution kernel; dividing the compressed input data, sequentially in convolutional computation order, into blocks of the same size as each compressed convolution kernel, wherein the data input to one convolutional computation forms a data block; and performing a convolutional computation on each block of the compressed input data and each compressed convolution kernel sequentially, obtaining convolutional result data, and obtaining multiple output data of the convolution layer from the convolutional result data.
-