-
Publication No.: US20200310758A1
Publication date: 2020-10-01
Application No.: US16833353
Filing date: 2020-03-27
Inventor: Giuseppe DESOLI, Thomas BOESCH, Carmine CAPPETTA, Ugo Maria IANNUZZI
Abstract: A Multiply-Accumulate (MAC) hardware accelerator includes a plurality of multipliers. The plurality of multipliers multiply a digit-serial input having a plurality of digits by a parallel input having a plurality of bits by sequentially multiplying individual digits of the digit-serial input by the plurality of bits of the parallel input. A result is generated based on the multiplication of the digit-serial input by the parallel input. An accelerator framework may include multiple MAC hardware accelerators, and may be used to implement a convolutional neural network. The MAC hardware accelerators may multiply an input weight by an input feature by sequentially multiplying individual digits of the input weight by the input feature.
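Illustrative sketch (not taken from the patent): the digit-serial multiplication described in this abstract can be modeled in software as a loop that consumes one digit of the weight per step and multiplies it by the full-width feature. The function name `digit_serial_multiply`, the 4-bit digit width, the least-significant-digit-first order, and the unsigned weight are all assumptions made for illustration.

```python
def digit_serial_multiply(weight: int, feature: int,
                          digit_bits: int = 4, num_digits: int = 2) -> int:
    """Multiply an unsigned digit-serial `weight` by a parallel `feature`.

    Hypothetical software model: one digit of `weight` is processed per
    iteration, which stands in for one hardware step.
    """
    if weight < 0 or weight >= 1 << (digit_bits * num_digits):
        raise ValueError("weight does not fit in the given number of digits")
    mask = (1 << digit_bits) - 1
    acc = 0
    for i in range(num_digits):
        # Take the next digit, least significant first (an assumption).
        digit = (weight >> (i * digit_bits)) & mask
        # Multiply the single digit by the full-width parallel input.
        partial = digit * feature
        # Scale the partial product by the digit's position and accumulate.
        acc += partial << (i * digit_bits)
    return acc
```

For any weight that fits in `num_digits` digits, the loop should return the same value as an ordinary `weight * feature` product; the point of the sketch is only to show the per-digit partial products.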
-
Publication No.: US20190266485A1
Publication date: 2019-08-29
Application No.: US16280960
Filing date: 2019-02-20
Inventor: Surinder Pal SINGH, Giuseppe DESOLI, Thomas BOESCH
Abstract: Embodiments of a device include an integrated circuit, a reconfigurable stream switch formed in the integrated circuit, and an arithmetic unit coupled to the reconfigurable stream switch. The arithmetic unit has a plurality of inputs and at least one output, and the arithmetic unit is solely dedicated to performance of a plurality of parallel operations. Each one of the plurality of parallel operations carries out a portion of the formula: output=AX+BY+C.
-
Publication No.: US20180189642A1
Publication date: 2018-07-05
Application No.: US15423284
Filing date: 2017-02-02
Inventor: Thomas BOESCH, Giuseppe DESOLI
Abstract: Embodiments are directed towards a configurable accelerator framework device that includes a stream switch and a plurality of convolution accelerators. The stream switch has a plurality of input ports and a plurality of output ports. Each of the input ports is configurable at run time to unidirectionally pass data to any one or more of the output ports via a stream link. Each one of the plurality of convolution accelerators is configurable at run time to unidirectionally receive input data via at least two of the plurality of stream switch output ports, and each one of the plurality of convolution accelerators is further configurable at run time to unidirectionally communicate output data via an input port of the stream switch.
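Illustrative sketch (not taken from the patent): the run-time routing described in this abstract, where an input port passes data one way to any one or more output ports, can be modeled as a table that maps each output port to its source input port. The class `StreamSwitch` and its methods `connect` and `push` are hypothetical names chosen for this example.

```python
class StreamSwitch:
    """Hypothetical software model of a run-time configurable stream switch."""

    def __init__(self, num_inputs: int, num_outputs: int) -> None:
        self.num_inputs = num_inputs
        self.num_outputs = num_outputs
        self.routes = {}  # output port -> input port that feeds it

    def connect(self, in_port: int, out_ports) -> None:
        """Route one input port to one or more output ports."""
        if not 0 <= in_port < self.num_inputs:
            raise ValueError("invalid input port")
        for out in out_ports:
            if not 0 <= out < self.num_outputs:
                raise ValueError("invalid output port")
            # Each output listens to a single input; reconnecting overrides it.
            self.routes[out] = in_port

    def push(self, in_port: int, data):
        """Return the data as seen on every output port fed by `in_port`."""
        return {out: data for out, src in self.routes.items() if src == in_port}
```

In this model, connecting input port 0 to output ports 1 and 2 and then pushing a value on port 0 delivers that value to both outputs, while an unconnected input port delivers nothing.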
-
Publication No.: US20230153621A1
Publication date: 2023-05-18
Application No.: US18156704
Filing date: 2023-01-19
Inventor: Surinder Pal SINGH, Giuseppe DESOLI, Thomas BOESCH
CPC classification number: G06N3/08, G06N20/00, G06F17/11, G06N3/063, G06F9/3001, G06F9/30032, G06F9/30036, G06N3/045
Abstract: An integrated circuit includes a reconfigurable stream switch and an arithmetic circuit. The stream switch, in operation, streams data. The arithmetic circuit has a plurality of inputs coupled to the reconfigurable stream switch. In operation, the arithmetic circuit generates an output according to AX+BY+C, where A, B and C are vector or scalar constants, and X and Y are data streams streamed to the arithmetic circuit through the reconfigurable stream switch.
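Illustrative sketch (not taken from the patent): the AX+BY+C operation described in this abstract can be modeled as a generator that consumes two data streams element by element. The function name `affine_stream` is hypothetical, and only scalar constants are handled here; the vector-constant case mentioned in the abstract is left out for brevity.

```python
from typing import Iterable, Iterator


def affine_stream(xs: Iterable[float], ys: Iterable[float],
                  a: float = 1.0, b: float = 1.0, c: float = 0.0) -> Iterator[float]:
    """Yield a*x + b*y + c for each pair of elements from two streams.

    Hypothetical model with scalar constants a, b, c. The output stream
    ends when the shorter input stream ends.
    """
    for x, y in zip(xs, ys):
        yield a * x + b * y + c
```

Because the function is a generator, each output element is produced as soon as one element from each input stream is available, which loosely mirrors streamed operation.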
-
Publication No.: US20230084985A1
Publication date: 2023-03-16
Application No.: US18056937
Filing date: 2022-11-18
Inventor: Thomas BOESCH, Giuseppe DESOLI, Surinder Pal SINGH, Carmine CAPPETTA
Abstract: Techniques and systems are provided for implementing a convolutional neural network. One or more convolution accelerators are provided that each include a feature line buffer memory, a kernel buffer memory, and a plurality of multiply-accumulate (MAC) circuits arranged to multiply and accumulate data. In a first operational mode the convolutional accelerator stores feature data in the feature line buffer memory and stores kernel data in the kernel data buffer memory. In a second mode of operation, the convolutional accelerator stores kernel decompression tables in the feature line buffer memory.
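Illustrative sketch (not taken from the patent): the abstract does not say how the kernel decompression tables are used. One plausible reading, assumed here, is a lookup table in which compressed kernel data is a sequence of small indices and the table maps each index to a kernel value. The function name `decompress_kernel` and the table layout are hypothetical.

```python
def decompress_kernel(indices, table):
    """Expand compressed kernel data by table lookup.

    Assumption: `indices` are positions into `table`, and `table` holds the
    distinct kernel values. The actual scheme in the patent may differ.
    """
    return [table[i] for i in indices]
```

Under this assumption, a kernel with few distinct values can be stored as short indices plus one small table, which is why buffer memory that is idle in this mode could be reused to hold the table.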
-
Publication No.: US20230062910A1
Publication date: 2023-03-02
Application No.: US17461626
Filing date: 2021-08-30
Inventor: Giuseppe DESOLI, Surinder Pal SINGH, Thomas BOESCH
Abstract: A convolutional neural network includes convolution circuitry. The convolution circuitry performs convolution operations on input tensor values. The convolutional neural network includes requantization circuitry that requantizes convolution values output from the convolution circuitry.
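Illustrative sketch (not taken from the patent): the abstract does not specify the requantization arithmetic. A common approach, assumed here, rescales a wide accumulator value by a floating-point scale, adds a zero point, and clamps the result to a narrower integer range. The function name `requantize` and the signed 8-bit default range are hypothetical.

```python
def requantize(acc: int, scale: float, zero_point: int = 0,
               qmin: int = -128, qmax: int = 127) -> int:
    """Map a wide accumulator value to a narrower quantized integer.

    Assumption: affine requantization q = round(acc * scale) + zero_point,
    saturated to [qmin, qmax]. Python's round() rounds ties to even;
    hardware may use a different rounding mode.
    """
    q = round(acc * scale) + zero_point
    return max(qmin, min(qmax, q))
```

Values that would fall outside the target range saturate at `qmin` or `qmax` rather than wrapping around.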
-
Publication No.: US20220101086A1
Publication date: 2022-03-31
Application No.: US17039653
Filing date: 2020-09-30
Inventor: Carmine CAPPETTA, Thomas BOESCH, Giuseppe DESOLI
Abstract: A convolutional accelerator framework (CAF) has a plurality of processing circuits including one or more convolution accelerators, a reconfigurable hardware buffer configurable to store data of a variable number of input data channels, and a stream switch coupled to the plurality of processing circuits. The reconfigurable hardware buffer has a memory and control circuitry. A number of the variable number of input data channels is associated with an execution epoch. The stream switch streams data of the variable number of input data channels between processing circuits of the plurality of processing circuits and the reconfigurable hardware buffer during processing of the execution epoch. The control circuitry of the reconfigurable hardware buffer configures the memory to store data of the variable number of input data channels, the configuring including allocating a portion of the memory to each of the variable number of input data channels.
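Illustrative sketch (not taken from the patent): the per-epoch configuration described in this abstract, allocating a portion of the buffer memory to each input data channel, can be modeled as a partition of one address range. The function name `allocate_channels` and the near-equal split policy are assumptions; the abstract does not state how the portions are sized.

```python
def allocate_channels(memory_size: int, num_channels: int):
    """Split a buffer of `memory_size` words among `num_channels` channels.

    Hypothetical policy: near-equal contiguous regions, with any remainder
    spread over the first channels. Returns a list of (offset, size) pairs.
    """
    if num_channels <= 0 or num_channels > memory_size:
        raise ValueError("invalid number of channels for this memory size")
    base, extra = divmod(memory_size, num_channels)
    regions = []
    offset = 0
    for ch in range(num_channels):
        size = base + (1 if ch < extra else 0)
        regions.append((offset, size))
        offset += size
    return regions
```

Calling the function again with a different channel count models reconfiguring the same memory for the next execution epoch.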
-
Publication No.: US20210397933A1
Publication date: 2021-12-23
Application No.: US16909673
Filing date: 2020-06-23
Inventor: Thomas BOESCH, Giuseppe DESOLI, Surinder Pal SINGH, Carmine CAPPETTA
Abstract: Techniques and systems are provided for implementing a convolutional neural network. One or more convolution accelerators are provided that each include a feature line buffer memory, a kernel buffer memory, and a plurality of multiply-accumulate (MAC) circuits arranged to multiply and accumulate data. In a first operational mode the convolutional accelerator stores feature data in the feature line buffer memory and stores kernel data in the kernel data buffer memory. In a second mode of operation, the convolutional accelerator stores kernel decompression tables in the feature line buffer memory.
-
Publication No.: US20210256346A1
Publication date: 2021-08-19
Application No.: US16794062
Filing date: 2020-02-18
Inventor: Giuseppe DESOLI, Carmine CAPPETTA, Thomas BOESCH, Surinder Pal SINGH, Saumya SUNEJA
Abstract: Embodiments of an electronic device include an integrated circuit, a reconfigurable stream switch formed in the integrated circuit along with a plurality of convolution accelerators, and a decompression unit coupled to the reconfigurable stream switch. The decompression unit decompresses encoded kernel data in real time during operation of a convolutional neural network.
-
Publication No.: US20210192833A1
Publication date: 2021-06-24
Application No.: US17194055
Filing date: 2021-03-05
Inventor: Surinder Pal SINGH, Thomas BOESCH, Giuseppe DESOLI
IPC: G06T15/08, G06T7/62, G06T7/11, G06F16/901, G06F9/38, G06K9/00, G06K9/62, G06N3/08, G06N3/04, G06N3/063
Abstract: A device includes on-board memory, an applications processor, a digital signal processor (DSP) cluster, a configurable accelerator framework (CAF), and at least one communication bus architecture. The communication bus communicatively couples the applications processor, the DSP cluster, and the CAF to the on-board memory. The CAF includes a reconfigurable stream switch and data volume sculpting circuitry, which has an input and an output coupled to the reconfigurable stream switch. The data volume sculpting circuitry receives a series of frames, each frame formed as a two-dimensional (2D) data structure, and determines a first dimension and a second dimension of each frame of the series of frames. Based on the first and second dimensions, the data volume sculpting circuitry determines for each frame a position and a size of a region-of-interest to be extracted from the respective frame, and extracts from each frame the data that is within the region-of-interest.
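Illustrative sketch (not taken from the patent): the extraction step described in this abstract can be modeled as a crop whose position and size are derived from the two frame dimensions. The function name `extract_roi` and the centered, fixed-fraction policy are assumptions; the abstract does not say how the region-of-interest is chosen.

```python
def extract_roi(frame, fraction: float = 0.5):
    """Extract a centered region-of-interest from a 2D frame (list of rows).

    Hypothetical policy: the ROI covers `fraction` of each dimension and is
    centered in the frame. The frame must be non-empty and rectangular.
    """
    height = len(frame)        # first dimension of the frame
    width = len(frame[0])      # second dimension of the frame
    roi_h = max(1, int(height * fraction))
    roi_w = max(1, int(width * fraction))
    top = (height - roi_h) // 2
    left = (width - roi_w) // 2
    # Keep only the rows and columns that fall inside the ROI.
    return [row[left:left + roi_w] for row in frame[top:top + roi_h]]
```

Applying the function to each frame of a series in turn models the per-frame determination of position and size followed by extraction.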