DEEP NEURAL NETWORK (DNN) ACCELERATORS WITH HETEROGENEOUS TILING

    公开(公告)号:EP4328802A1

    公开(公告)日:2024-02-28

    申请号:EP23186330.9

    申请日:2023-07-19

    申请人: INTEL Corporation

    摘要: An DNN accelerator includes one or more heterogenous tile sets. A heterogenous tile set includes tiles of different sizes, e.g., PE arrays including different numbers of columns or rows. The DNN accelerator may identify a tile set from the tile sets for running a DNN model based on dimensions of output tensors convolutional layers in the DNN. Within the selected tile set, a tile may be selected for a convolutional layer in the DNN, e.g., based on dimensions of the output tensor of the convolutional layer and the size of the tile. After the tile is selected, the workload for running a convolutional operation of the layer may be partitioned and assigned to individual PEs in the tile by partitioning the output tensor into output tensor segments. The workload of computing an individual output tensor segment can be assigned to an individual PE in the tile.

    KERNEL DECOMPOSITION AND ACTIVATION BROADCASTING IN DEEP NEURAL NETWORKS (DNNS)

    公开(公告)号:EP4345655A1

    公开(公告)日:2024-04-03

    申请号:EP23190177.8

    申请日:2023-08-08

    申请人: Intel Corporation

    摘要: An DNN accelerator may perform 1 × N kernel decomposition to decompose a convolutional kernel into kernel vectors, each of which includes multiple weights. Through the kernel decomposition, a weight operand may be generated from a filter. The DNN accelerator converts an input tensor into input operands. An input operand includes activations and has the same size as the weight operand. The DNN accelerator may read a first activation in the input operand from memory to an internal memory of a first PE and read a second activation in the input operand from the memory to an internal memory of a second PE. The first PE may receive the second activation from the second PE through activation broadcasting between the two PEs and perform MAC operations on the input operand and weight operand. The second PE may perform MAC operations on another input operand in the input tensor and the weight operand.

    DEEP NEURAL NETWORK (DNN) ACCELERATORS WITH WEIGHT LAYOUT REARRANGEMENT

    公开(公告)号:EP4343635A1

    公开(公告)日:2024-03-27

    申请号:EP23186375.4

    申请日:2023-07-19

    申请人: INTEL Corporation

    摘要: An DNN accelerator includes a DMA engine that can rearrange weight data layout. The DMA engine may read a weight tensor from a memory (e.g., DRAM). The weight tensor includes weights arranged in a 3D matrix. The DMA engine may partition the weight tensor into a plurality of virtual banks based on a structure of a PE array, e.g., based on the number of activated PE columns in the PE array. Then the DMA engine may partition a virtual bank into a plurality of virtual sub-banks. The DMA engine may also identify data blocks from different ones of the plurality of virtual sub-banks. A data block may include a plurality of input channels and may have a predetermined spatial size and storage size. The DMA engine form a linear data structure by interleaving the data blocks. The DMA engine can write the linear data structure into another memory (e.g., SRAM).