WRITE COMBINE BUFFER (WCB) FOR DEEP NEURAL NETWORK (DNN) ACCELERATOR

    Publication No.: US20230020929A1

    Publication date: 2023-01-19

    Application No.: US17946311

    Filing date: 2022-09-16

    IPC classification: G06F3/06 G06N3/04

    Abstract: A compute tile includes a WCB that receives a workload of writing an output tensor of a convolution into a local memory of the compute tile. The local memory may be an SRAM. The WCB receives write transactions. A write transaction includes a data block, which is a part of the output tensor, and metadata describing one or more attributes of the data block. The WCB may store write transactions in its internal buffers. The WCB may determine whether to combine two write transactions, e.g., based on an operation mode or on metadata in the write transactions. In embodiments where the WCB determines to combine the two write transactions, the WCB may combine them into a new write transaction and write the new write transaction into the local memory or an internal memory of the WCB. Combining in this way can reduce the total number of write transactions for the workload.
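
The combining step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented design. It assumes the metadata is a start address, that two transactions may be combined only when their data blocks are contiguous in memory, and that a combined transaction may not exceed an assumed maximum width. The names `WriteTransaction`, `try_combine`, `combine_workload`, and `MAX_BYTES` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_BYTES = 32  # assumed maximum size of one combined write (hypothetical)


@dataclass
class WriteTransaction:
    address: int  # assumed metadata: start byte address in local memory
    data: bytes   # data block, i.e. a part of the output tensor


def try_combine(a: WriteTransaction, b: WriteTransaction,
                max_bytes: int = MAX_BYTES) -> Optional[WriteTransaction]:
    """Return a new combined transaction, or None if the two cannot be combined."""
    first, second = (a, b) if a.address <= b.address else (b, a)
    contiguous = first.address + len(first.data) == second.address
    if contiguous and len(first.data) + len(second.data) <= max_bytes:
        return WriteTransaction(first.address, first.data + second.data)
    return None


def combine_workload(transactions: List[WriteTransaction],
                     max_bytes: int = MAX_BYTES) -> List[WriteTransaction]:
    """Greedily combine each incoming transaction with the last buffered one."""
    buffered: List[WriteTransaction] = []
    for txn in transactions:
        if buffered:
            merged = try_combine(buffered[-1], txn, max_bytes)
            if merged is not None:
                buffered[-1] = merged  # replace the buffered entry with the combined one
                continue
        buffered.append(txn)
    return buffered


workload = [
    WriteTransaction(0, bytes(8)),
    WriteTransaction(8, bytes(8)),   # contiguous with the first: combined
    WriteTransaction(32, bytes(8)),  # gap in addresses: kept separate
]
combined = combine_workload(workload)
```

In this sketch three incoming transactions become two writes to local memory. An actual WCB may use other criteria, such as the operation mode mentioned in the abstract.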

    DECOMPOSING A DECONVOLUTION INTO MULTIPLE CONVOLUTIONS

    Publication No.: US20230016455A1

    Publication date: 2023-01-19

    Application No.: US17935163

    Filing date: 2022-09-26

    IPC classification: G06N3/08 G06F17/15

    Abstract: A deconvolution can be decomposed into multiple convolutions. Results of the convolutions constitute an output of the deconvolution. Zeros may be added to an input tensor of the deconvolution to generate an upsampled input tensor. Subtensors having the same size as the kernel of the deconvolution may be identified from the upsampled input tensor. A subtensor may include one or more input activations and one or more zeros. Subtensors having the same distribution pattern of input activations may be used to generate a reduced kernel. The reduced kernel includes a subset of the kernel. The position of a weight in the reduced kernel may be the same as the position of an input activation in the subtensor. Multiple reduced kernels may be generated based on multiple subtensors having different distribution patterns of activations. Each of the convolutions may use the input tensor and a different one of the reduced kernels.
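
The decomposition can be checked numerically with a small one-dimensional Python sketch. It is an illustration under stated assumptions, not the patented method: the tensors are 1D lists, the upsampled input has `s - 1` zeros between activations and `k - 1` zeros of padding on each side, and the kernel is applied as a sliding dot product without flipping. The function names are hypothetical. With stride `s`, the pattern of activations inside a kernel-sized window depends only on the output position modulo `s`, so there are at most `s` reduced kernels.

```python
def deconv_via_upsampling(x, w, s):
    """Reference: slide the full kernel over the zero-upsampled, padded input."""
    k, n = len(w), len(x)
    up = [0] * ((n - 1) * s + 1 + 2 * (k - 1))
    for i, v in enumerate(x):
        up[k - 1 + i * s] = v  # activations sit s positions apart
    out_len = (n - 1) * s + k
    return [sum(w[j] * up[t + j] for j in range(k)) for t in range(out_len)]


def deconv_via_reduced_kernels(x, w, s):
    """Same output, computed as s small convolutions over the original input."""
    k, n = len(w), len(x)
    out_len = (n - 1) * s + k
    y = [0] * out_len
    for p in range(s):                 # one activation pattern per phase p
        j0 = (k - 1 - p) % s           # first kernel position that meets an activation
        reduced = w[j0::s]             # reduced kernel: a subset of the kernel weights
        off = (p + j0 - (k - 1)) // s  # input offset for this phase (exact division)
        for q, t in enumerate(range(p, out_len, s)):
            acc = 0
            for m, weight in enumerate(reduced):
                i = q + m + off
                if 0 <= i < n:         # out-of-range inputs act as zero padding
                    acc += weight * x[i]
            y[t] = acc
    return y


x = [1, 2, 3]
w = [1, 2, 3]
reference = deconv_via_upsampling(x, w, 2)
decomposed = deconv_via_reduced_kernels(x, w, 2)
```

Both functions produce the same output for this input. The second never multiplies by the inserted zeros, which is the saving the abstract describes.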