NEURAL NETWORK PROCESSOR USING COMPRESSION AND DECOMPRESSION OF ACTIVATION DATA TO REDUCE MEMORY BANDWIDTH UTILIZATION

Invention Patent Application

US20180300606A1 NEURAL NETWORK PROCESSOR USING COMPRESSION AND DECOMPRESSION OF ACTIVATION DATA TO REDUCE MEMORY BANDWIDTH UTILIZATION Under examination (published)

Patent title: NEURAL NETWORK PROCESSOR USING COMPRESSION AND DECOMPRESSION OF ACTIVATION DATA TO REDUCE MEMORY BANDWIDTH UTILIZATION
Application No.: US15953356

Filing date: 2018-04-13
Publication No.: US20180300606A1

Publication date: 2018-10-18
Inventors: Joseph Leon CORKERY, Benjamin Eliot LUNDELL, Larry Marvin WALL, Chad Balling McBRIDE, Amol Ashok AMBARDEKAR, George PETRE, Kent D. CEDOLA, Boris BOBROV
Applicant: Microsoft Technology Licensing, LLC
Main classification: G06N3/04
IPC classifications: G06N3/04; G06N3/063; H03M7/30

Abstract:

A deep neural network (“DNN”) module can compress and decompress neuron-generated activation data to reduce the utilization of memory bus bandwidth. The compression unit can receive an uncompressed chunk of data generated by a neuron in the DNN module. The compression unit generates a mask portion and a data portion of a compressed output chunk. The mask portion encodes the presence and location of the zero and non-zero bytes in the uncompressed chunk of data. The data portion stores truncated non-zero bytes from the uncompressed chunk of data. A decompression unit can receive a compressed chunk of data from memory in the DNN processor or memory of an application host. The decompression unit decompresses the compressed chunk of data using the mask portion and the data portion. This can reduce memory bus utilization, allow a DNN module to complete processing operations more quickly, and reduce power consumption.
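
The mask-plus-data layout described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `compress_chunk` and `decompress_chunk`, the 8-byte example chunk, and the bit ordering of the mask are all assumptions made for illustration. The abstract says the data portion stores truncated non-zero bytes but does not say how they are truncated, so this sketch stores the non-zero bytes unmodified.

```python
from typing import Tuple


def compress_chunk(chunk: bytes) -> Tuple[int, bytes]:
    """Split a chunk into a mask portion and a data portion.

    Assumption: bit i of the mask is 1 when byte i of the chunk is
    non-zero. The truncation step mentioned in the abstract is omitted,
    so non-zero bytes are stored as they are.
    """
    mask = 0
    data = bytearray()
    for i, value in enumerate(chunk):
        if value != 0:
            mask |= 1 << i      # record the position of a non-zero byte
            data.append(value)  # keep only the non-zero bytes
    return mask, bytes(data)


def decompress_chunk(mask: int, data: bytes, length: int) -> bytes:
    """Rebuild the chunk by placing data bytes where the mask has a 1 bit."""
    out = bytearray(length)     # starts as all zero bytes
    remaining = iter(data)
    for i in range(length):
        if mask & (1 << i):
            out[i] = next(remaining)
    return bytes(out)


# Hypothetical activation chunk: mostly zeros, as is common after ReLU.
chunk = bytes([0, 7, 0, 0, 255, 0, 3, 0])
mask, data = compress_chunk(chunk)
restored = decompress_chunk(mask, data, len(chunk))
```

In this example the 8-byte chunk shrinks to a 1-byte mask plus 3 data bytes, which is the kind of saving on sparse activation data that would lower memory bus traffic.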

Published/Granted Documents

US11528033B2 Neural network processor using compression and decompression of activation data to reduce memory bandwidth utilization Publication/grant date: 2022-12-13

IPC classification:

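G	Physics
G06	Computing; calculating or counting
G06N	Computer systems based on specific computational models
G06N3/00	Computer systems based on biological models
G06N3/02	. Using neural network models
G06N3/04	.. Architecture, e.g. interconnection topology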