1.
Publication No.: US11775803B2
Publication Date: 2023-10-03
Application No.: US18012938
Filing Date: 2021-04-26
Inventors: Haiwei Liu, Gang Dong, Yaqian Zhao, Rengang Li, Dongdong Jiang, Hongbin Yang, Lingyan Liang
IPC Classification: G06F12/10, G06N3/08, G06N3/0442, G06F17/16
CPC Classification: G06N3/0442, G06F17/16
Abstract: A system for accelerating an RNN network, including: a first cache, which outputs Wx1 to WxN or Wh1 to WhN in N parallel paths in a cyclic switching manner, with a degree of parallelism of k; a second cache, which outputs xt or ht-1 in the same cyclic switching manner; a vector multiplication circuit, which uses N groups of multiplication arrays to respectively calculate Wx1xt to WxNxt, or Wh1ht-1 to WhNht-1; an addition circuit, which calculates Wx1xt+Wh1ht-1+b1 to WxNxt+WhNht-1+bN; an activation circuit, which performs an activation operation on the output of the addition circuit; a state updating circuit, which acquires ct-1, calculates ct and ht, updates ct-1, and sends ht back to the second cache; a bias data cache; a vector cache; and a cell state cache.
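The per-step arithmetic that the circuits above pipeline can be sketched in NumPy. This is a minimal software model, not the patented hardware: it assumes N = 4 gate groups (the LSTM case, consistent with the abstract's use of a cell state ct and hidden state ht), and the gate ordering i/f/g/o is a conventional choice, not taken from the patent.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, Wx, Wh, b):
    # The four pre-activations Wx[n] @ x_t + Wh[n] @ h_prev + b[n] correspond
    # to the N parallel paths of the multiplication and addition circuits.
    gates = [Wx[n] @ x_t + Wh[n] @ h_prev + b[n] for n in range(4)]
    i = sigmoid(gates[0])        # input gate      (activation circuit)
    f = sigmoid(gates[1])        # forget gate
    g = np.tanh(gates[2])        # candidate cell state
    o = sigmoid(gates[3])        # output gate
    c_t = f * c_prev + i * g     # state updating circuit: new cell state
    h_t = o * np.tanh(c_t)       # new hidden state, written back to the second cache
    return h_t, c_t
```

In the hardware, the first cache switches cyclically between the Wx and Wh weight groups so the same k-wide multiplication arrays are reused for both matmuls; the loop over `n` here stands in for the N parallel paths.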
2.
Publication No.: US11748970B2
Publication Date: 2023-09-05
Application No.: US17794110
Filing Date: 2020-11-16
Inventors: Qichun Cao, Yaqian Zhao, Gang Dong, Lingyan Liang, Wenfeng Yin
CPC Classification: G06V10/28
Abstract: A hardware environment-based data quantization method includes: parsing a model file under the current deep learning framework to obtain intermediate computational graph data and weight data that are independent of the hardware environment; performing calculation on image data in an input data set through the process indicated by the intermediate computational graph to obtain feature map data; separately performing uniform quantization on the weight data and the feature map data of each layer according to a preset linear quantization method, and calculating a weight quantization factor and a feature map quantization factor (S103); combining the weight quantization factor and the feature map quantization factor into a quantization parameter that enables the hardware to use shift operations instead of division; and finally writing the quantization parameter and the quantized weight data to a bin file according to the hardware requirement, so as to generate quantized file data (S105).
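A sketch of the factor combination the abstract describes, assuming symmetric uniform linear quantization (one plausible instance of the "preset linear quantization method"; the 16-bit fixed-point width is likewise an assumption, not taken from the patent):

```python
import numpy as np

def uniform_quant(data, n_bits=8):
    # Symmetric uniform linear quantization: map the max magnitude to the
    # largest signed integer; return integer values and the scale factor.
    scale = np.abs(data).max() / (2 ** (n_bits - 1) - 1)
    q = np.round(data / scale).astype(np.int32)
    return q, scale

def combine_factors(s_w, s_x, s_y, shift_bits=16):
    # Fold the weight and feature-map quantization factors into a single
    # fixed-point multiplier, so the hardware requantizes accumulators
    # with a multiply and a right shift instead of a division.
    m = s_w * s_x / s_y                       # real-valued requant factor
    mult = int(round(m * (1 << shift_bits)))  # fixed-point multiplier
    return mult, shift_bits

def requantize(acc, mult, shift_bits):
    # Hardware-style requantization of an integer accumulator.
    return (acc * mult) >> shift_bits
```

For example, with scales s_w = s_x = 0.5 and s_y = 1.0, the real factor 0.25 becomes the multiplier 16384 at shift 16, and `(acc * 16384) >> 16` reproduces `acc * 0.25` exactly for these values.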
3.
Publication No.: US12045729B2
Publication Date: 2024-07-23
Application No.: US18005620
Filing Date: 2021-01-25
Inventors: Wenfeng Yin, Gang Dong, Yaqian Zhao, Qichun Cao, Lingyan Liang, Haiwei Liu, Hongbin Yang
IPC Classification: G06N3/09, G06N3/0895, G06N3/0985
CPC Classification: G06N3/0985, G06N3/0895
Abstract: A neural network compression method in which forward inference is performed on target data by using a target parameter sharing network to obtain the output feature map of the last convolutional module; a channel-related feature is extracted from that output feature map; the extracted channel-related feature and a target constraint condition are input into a target meta-generative network; and the optimal network architecture under the target constraint condition is predicted by the target meta-generative network, yielding a compressed neural network model.
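The channel-related feature extraction step can be sketched as follows. The choice of per-channel mean and standard deviation is a hypothetical one for illustration; the abstract does not specify which channel statistic is used.

```python
import numpy as np

def channel_related_feature(fmap):
    # fmap: output feature map of the last convolutional module, shape (C, H, W).
    # Collapse the spatial dimensions and compute per-channel statistics,
    # giving a (C, 2) feature vector per channel.
    flat = fmap.reshape(fmap.shape[0], -1)
    return np.stack([flat.mean(axis=1), flat.std(axis=1)], axis=1)
```

This (C, 2) feature, together with the encoded constraint condition (e.g., a FLOPs or latency budget), would form the input from which the meta-generative network predicts the compressed architecture.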