ACCELERATION OF 1X1 CONVOLUTIONS IN CONVOLUTIONAL NEURAL NETWORKS

    Publication No.: EP4296900A1

    Publication Date: 2023-12-27

    Application No.: EP23178341.6

    Filing Date: 2023-06-09

    Abstract: A convolutional accelerator includes a feature line buffer, a kernel buffer, a multiply-accumulate (MAC) cluster, and mode control circuitry. In a first mode of operation, the mode control circuitry stores feature data in the feature line buffer and stores kernel data in the kernel buffer. The data stored in the buffers is transferred to the MAC cluster of the convolutional accelerator for processing. In a second mode of operation, the mode control circuitry stores feature data in the kernel buffer and stores kernel data in the feature line buffer. The data stored in the buffers is transferred to the MAC cluster of the convolutional accelerator for processing. The second mode of operation may be employed to efficiently process 1xN kernels, where N is an integer greater than or equal to one.
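A minimal sketch of why the two buffer roles can be exchanged for a 1x1 kernel. A 1x1 convolution reduces to a matrix product between per-pixel channel vectors and per-output-channel weight vectors, so either operand can occupy either side of the product. The function names, data layouts, and the plain-Python matrix product are assumptions made for illustration; the code models only the arithmetic, not the buffers or the MAC cluster described in the abstract.

```python
# Illustrative only: names and layouts are hypothetical, not taken from the patent.

def matmul(a, b):
    """Plain-Python matrix product of a (m x k) and b (k x n)."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    """Return the transpose of a list-of-lists matrix."""
    return [list(row) for row in zip(*m)]


def conv1x1(features, kernels):
    """1x1 convolution with features on the left of the product.

    features: [pixels][in_channels]
    kernels:  [out_channels][in_channels]
    returns:  [pixels][out_channels]
    """
    return matmul(features, transpose(kernels))


def conv1x1_swapped(features, kernels):
    """Same result with the operand roles exchanged.

    The kernels take the left-hand slot and the features the right-hand
    slot; a final transpose restores the [pixels][out_channels] layout.
    """
    return transpose(matmul(kernels, transpose(features)))


features = [[1, 2], [3, 4], [5, 6]]   # 3 pixels, 2 input channels
kernels = [[1, 0], [0, 1], [1, 1]]    # 3 output channels, 2 input channels

normal = conv1x1(features, kernels)
swapped = conv1x1_swapped(features, kernels)
print(normal)             # [[1, 2, 3], [3, 4, 7], [5, 6, 11]]
print(normal == swapped)  # True
```

Both orderings produce the same output values, which is the property that would let an accelerator pick whichever buffer assignment uses its storage more efficiently.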

    DATA VOLUME SCULPTOR FOR DEEP LEARNING ACCELERATION

    Publication No.: EP3531347A1

    Publication Date: 2019-08-28

    Application No.: EP19159074.4

    Filing Date: 2019-02-25

    IPC Classification: G06N3/04 G06N3/063

    Abstract: Embodiments of a device include on-board memory, an applications processor, a digital signal processor cluster, a configurable accelerator framework, and at least one communication bus architecture. The communication bus architecture communicatively couples the applications processor, the digital signal processor cluster, and the configurable accelerator framework to the on-board memory. The configurable accelerator framework includes a reconfigurable stream switch and a data volume sculpting unit, which has an input and an output coupled to the reconfigurable stream switch. The data volume sculpting unit has a counter, a comparator, and a controller. The data volume sculpting unit is arranged to receive a stream of feature map data that forms a three-dimensional feature map. The three-dimensional feature map is formed as a plurality of two-dimensional data planes.
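The abstract names a counter, a comparator, and a controller but does not say how they cooperate. One plausible reading, assumed here and not confirmed by the text above, is that the counter tracks the position of each streamed value, the comparator tests that position against the bounds of a smaller volume, and the controller forwards only the values inside it. The function name, the plane-major stream order, and the half-open bounds are all hypothetical.

```python
# Illustrative only: the region-of-interest behavior is an assumption.

def sculpt(stream, width, height, depth, roi):
    """Yield the streamed values that fall inside a 3-D sub-volume.

    stream: values ordered plane by plane, then row by row (assumed order)
    roi:    ((x0, x1), (y0, y1), (z0, z1)) with half-open bounds (assumed)
    """
    (x0, x1), (y0, y1), (z0, z1) = roi
    plane_size = width * height
    count = 0  # "counter": index of the current value in the stream
    for value in stream:
        if count >= plane_size * depth:
            break  # the whole feature map has been consumed
        z, rem = divmod(count, plane_size)
        y, x = divmod(rem, width)
        # "comparator": is the current coordinate inside the volume?
        inside = x0 <= x < x1 and y0 <= y < y1 and z0 <= z < z1
        # "controller": forward the value or drop it
        if inside:
            yield value
        count += 1


# Two 3x3 planes holding the values 0..17; keep columns 1-2 of rows 0-1.
kept = list(sculpt(range(18), width=3, height=3, depth=2,
                   roi=((1, 3), (0, 2), (0, 2))))
print(kept)  # [1, 2, 4, 5, 10, 11, 13, 14]
```

Because the filter works on a running count, it never needs the whole feature map in memory, which fits a unit whose input and output are both attached to a stream switch.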

    RECONFIGURABLE, STREAMING-BASED CLUSTERS OF PROCESSING ELEMENTS, AND MULTI-MODAL USE THEREOF

    Publication No.: EP4428759A2

    Publication Date: 2024-09-11

    Application No.: EP24155570.5

    Filing Date: 2024-02-02

    Abstract: A hardware accelerator (110) includes a plurality of processing elements (172) of a neural network, each processing element having a memory (104); a stream switch (155); stream engines (150) coupled to a plurality of functional circuits (102, 160, 165, 180) via the stream switch (155), wherein the stream engines (150), in operation, generate data streaming requests to stream data to and from functional circuits of the plurality of functional circuits (102, 160, 165, 180); a first system bus interface (158) coupled to the stream engines (150); a second system bus interface (184) coupled to the processing elements (172); and mode control circuitry (176), which, in operation, sets respective modes of operation for the plurality of processing elements (172). The modes of operation include: a compute mode of operation in which the processing element (172) performs computing operations using the memory (104) associated with the processing element; and a memory mode of operation in which the memory (104) associated with the processing element (172) performs memory operations, bypassing the stream switch (155), via the second system bus interface (184).
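The two modes described above can be pictured with a small software model. This is a sketch under stated assumptions: the class and method names are hypothetical, a multiply-accumulate stands in for the unspecified "computing operations", and direct reads and writes stand in for memory operations arriving over the second system bus interface.

```python
# Illustrative only: all names and the choice of MAC as the compute
# operation are assumptions, not taken from the patent.
from enum import Enum


class Mode(Enum):
    COMPUTE = "compute"
    MEMORY = "memory"


class ProcessingElement:
    def __init__(self, size):
        self.memory = [0] * size   # memory associated with this element
        self.mode = Mode.COMPUTE

    def accumulate(self, addr, a, b):
        """Compute mode: multiply-accumulate into the local memory."""
        if self.mode is not Mode.COMPUTE:
            raise RuntimeError("element is not in compute mode")
        self.memory[addr] += a * b
        return self.memory[addr]

    def bus_write(self, addr, value):
        """Memory mode: the local memory serves a write from the bus."""
        if self.mode is not Mode.MEMORY:
            raise RuntimeError("element is not in memory mode")
        self.memory[addr] = value

    def bus_read(self, addr):
        """Memory mode: the local memory serves a read from the bus."""
        if self.mode is not Mode.MEMORY:
            raise RuntimeError("element is not in memory mode")
        return self.memory[addr]


def set_modes(elements, modes):
    """Stand-in for mode control: assign one mode per element."""
    for element, mode in zip(elements, modes):
        element.mode = mode


pes = [ProcessingElement(4), ProcessingElement(4)]
set_modes(pes, [Mode.COMPUTE, Mode.MEMORY])
print(pes[0].accumulate(0, 2, 3))  # 6
pes[1].bus_write(1, 9)
print(pes[1].bus_read(1))          # 9
```

The point of the model is that the same per-element memory is reachable two ways, so elements not needed for computation can be repurposed as plain storage.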

    RECONFIGURABLE, STREAMING-BASED CLUSTERS OF PROCESSING ELEMENTS, AND MULTI-MODAL USE THEREOF

    Publication No.: EP4428759A3

    Publication Date: 2024-10-09

    Application No.: EP24155570.5

    Filing Date: 2024-02-02

    Abstract: A hardware accelerator (110) includes a plurality of processing elements (172) of a neural network, each processing element having a memory (104); a stream switch (155); stream engines (150) coupled to a plurality of functional circuits (102, 160, 165, 180) via the stream switch (155), wherein the stream engines (150), in operation, generate data streaming requests to stream data to and from functional circuits of the plurality of functional circuits (102, 160, 165, 180); a first system bus interface (158) coupled to the stream engines (150); a second system bus interface (184) coupled to the processing elements (172); and mode control circuitry (176), which, in operation, sets respective modes of operation for the plurality of processing elements (172). The modes of operation include: a compute mode of operation in which the processing element (172) performs computing operations using the memory (104) associated with the processing element; and a memory mode of operation in which the memory (104) associated with the processing element (172) performs memory operations, bypassing the stream switch (155), via the second system bus interface (184).

    CONFIGURABLE STREAM SWITCH WITH VIRTUAL CHANNELS FOR THE SHARING OF I/O PORTS IN STREAM-BASED ARCHITECTURES

    Publication No.: EP4455895A1

    Publication Date: 2024-10-30

    Application No.: EP24162857.7

    Filing Date: 2024-03-12

    Abstract: A stream switch (130) includes a data router (132), configuration registers (138), and arbitration logic (140). The data router (132) has a plurality of input ports (134), each having a plurality of associated virtual input channels, and a plurality of output ports (136), each having a plurality of associated virtual output channels. The data router (132) transmits data streams from input ports (134) to one or more output ports of the plurality of output ports (136). The configuration registers (138) store configuration data associated with the virtual output channels of the respective output ports of the plurality of output ports (136). The stored configuration data identifies a source input port and virtual input channel ID associated with the virtual output channel of the output port. The arbitration logic (140) allocates bandwidth of the data router (132) based on request signals associated with virtual input channels of the input ports (134) and the configuration data associated with the virtual output channels.
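A rough software model of the configuration and arbitration described above. The configuration registers are modeled as a mapping from each (output port, virtual output channel) pair to its source (input port, virtual input channel) pair, as the abstract states. The abstract does not name an arbitration policy, so the round-robin scheme, the one-grant-per-port-per-cycle rule, and every identifier below are assumptions.

```python
# Illustrative only: round-robin arbitration and all names are assumptions.

class StreamSwitch:
    def __init__(self, num_outputs, channels_per_port):
        self.channels_per_port = channels_per_port
        # (out_port, out_vc) -> (in_port, in_vc), i.e. configuration registers
        self.config = {}
        # last virtual channel granted on each output port
        self.last_grant = {port: -1 for port in range(num_outputs)}

    def configure(self, out_port, out_vc, in_port, in_vc):
        """Record the source of one virtual output channel."""
        self.config[(out_port, out_vc)] = (in_port, in_vc)

    def arbitrate(self, requests):
        """Grant at most one virtual channel per output port for a cycle.

        requests: set of (in_port, in_vc) pairs asserting a request
        returns:  {out_port: (out_vc, in_port, in_vc)}
        """
        grants = {}
        for port, last in self.last_grant.items():
            # Start the search just after the previously granted channel.
            for step in range(1, self.channels_per_port + 1):
                vc = (last + step) % self.channels_per_port
                source = self.config.get((port, vc))
                if source in requests:
                    grants[port] = (vc, source[0], source[1])
                    self.last_grant[port] = vc
                    break
        return grants


switch = StreamSwitch(num_outputs=1, channels_per_port=2)
switch.configure(out_port=0, out_vc=0, in_port=0, in_vc=0)
switch.configure(out_port=0, out_vc=1, in_port=1, in_vc=0)

both = {(0, 0), (1, 0)}
print(switch.arbitrate(both))  # {0: (0, 0, 0)}
print(switch.arbitrate(both))  # {0: (1, 1, 0)}
```

With both sources requesting, the two virtual channels alternate on the shared output port, which is one simple way a single physical port could be divided among several streams.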