Flexible accelerator for sparse tensors in convolutional neural networks
Abstract:
A system with a multiplication circuit having a plurality of multipliers is disclosed. Each of the plurality of multipliers is configured to receive a data value and a weight value to generate a product value in a convolution operation of a machine learning application. The system also includes an accumulator configured to receive the product value from each of the plurality of multipliers and a register bank configured to store an output of the convolution operation. The accumulator is further configured to receive a portion of values stored in the register bank and combine the received portion of values with the product values to generate combined values. The register bank is further configured to replace the portion of values with the combined values.
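The data flow described above can be illustrated with a minimal software sketch. It is not the disclosed circuit. It assumes that the accumulator combines values by element-wise addition and that the "portion" of the register bank is a contiguous slice. The names `mac_step`, `register_bank`, and `start` are hypothetical and chosen only for illustration.

```python
from typing import List


def mac_step(data: List[float], weights: List[float],
             register_bank: List[float], start: int) -> None:
    """Run one multiply-accumulate step, updating register_bank in place.

    Assumption: "combine" means element-wise addition, and the portion of
    the register bank is the contiguous slice beginning at index `start`.
    """
    if len(data) != len(weights):
        raise ValueError("data and weights must have the same length")
    stop = start + len(data)
    if start < 0 or stop > len(register_bank):
        raise IndexError("portion falls outside the register bank")

    # Multiplier array: each multiplier produces one product value.
    products = [d * w for d, w in zip(data, weights)]

    # Accumulator: read the selected portion of the register bank and
    # combine it with the product values.
    portion = register_bank[start:stop]
    combined = [p + r for p, r in zip(products, portion)]

    # Register bank: replace the portion with the combined values.
    register_bank[start:stop] = combined


# Usage: a four-entry register bank receiving two accumulation steps.
bank = [0.0, 0.0, 0.0, 0.0]
mac_step([1.0, 2.0], [3.0, 4.0], bank, start=1)   # bank is now [0.0, 3.0, 8.0, 0.0]
mac_step([1.0, 1.0], [1.0, 1.0], bank, start=1)   # bank is now [0.0, 4.0, 9.0, 0.0]
```

Writing the combined values back over the same portion lets partial sums build up across successive steps, so no separate output buffer is needed for intermediate results.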