DEEP NEURAL NETWORK (DNN) ACCELERATOR FACILITATING ACTIVATION COMPRESSION
摘要:
A system includes a first memory, a compiler, and a DNN accelerator. The DNN accelerator includes a DMA engine, an acceleration module, and a compute block. The compute block includes a second memory. The compiler may generate a task for transferring activations from the second memory to the first memory. The DMA engine may receive the task and read the activations from the second memory. The acceleration module may compress the activations to generate compressed activation data and write the compressed activation data into the external memory. The acceleration module may also store a size of the compressed activation data in the local memory, which may be used by the DMA engine to read the activation from the first memory to the second memory later. The compressed activation data may include non-zero activations and sparsity bitmaps. The compressed activation data may also include a header or zeropoint marker.
信息查询
0/0