-
Publication number: US11593609B2
Publication date: 2023-02-28
Application number: US16794062
Application date: 2020-02-18
Inventor: Giuseppe Desoli, Carmine Cappetta, Thomas Boesch, Surinder Pal Singh, Saumya Suneja
Abstract: Embodiments of an electronic device include an integrated circuit, a reconfigurable stream switch formed in the integrated circuit along with a plurality of convolution accelerators, and a decompression unit coupled to the reconfigurable stream switch. The decompression unit decompresses encoded kernel data in real time during operation of a convolutional neural network.
-
Publication number: US11836608B2
Publication date: 2023-12-05
Application number: US18056937
Application date: 2022-11-18
Inventor: Thomas Boesch, Giuseppe Desoli, Surinder Pal Singh, Carmine Cappetta
CPC classification number: G06N3/063, G06F9/5027, H03M7/3082, H03M7/6005
Abstract: Techniques and systems are provided for implementing a convolutional neural network. One or more convolution accelerators are provided that each include a feature line buffer memory, a kernel buffer memory, and a plurality of multiply-accumulate (MAC) circuits arranged to multiply and accumulate data. In a first operational mode the convolutional accelerator stores feature data in the feature line buffer memory and stores kernel data in the kernel data buffer memory. In a second mode of operation, the convolutional accelerator stores kernel decompression tables in the feature line buffer memory.
-
Publication number: US11531873B2
Publication date: 2022-12-20
Application number: US16909673
Application date: 2020-06-23
Inventor: Thomas Boesch, Giuseppe Desoli, Surinder Pal Singh, Carmine Cappetta
Abstract: Techniques and systems are provided for implementing a convolutional neural network. One or more convolution accelerators are provided that each include a feature line buffer memory, a kernel buffer memory, and a plurality of multiply-accumulate (MAC) circuits arranged to multiply and accumulate data. In a first operational mode the convolutional accelerator stores feature data in the feature line buffer memory and stores kernel data in the kernel data buffer memory. In a second mode of operation, the convolutional accelerator stores kernel decompression tables in the feature line buffer memory.
-
Publication number: US12106201B2
Publication date: 2024-10-01
Application number: US17039653
Application date: 2020-09-30
Applicant: STMICROELECTRONICS S.r.l.
Inventor: Carmine Cappetta, Thomas Boesch, Giuseppe Desoli
CPC classification number: G06N3/04, G06F9/3806, G06F13/1657, G06F13/1673, G06F13/4022, G06N3/063, G06T7/11, G06T2207/20084
Abstract: A convolutional accelerator framework (CAF) has a plurality of processing circuits including one or more convolution accelerators, a reconfigurable hardware buffer configurable to store data of a variable number of input data channels, and a stream switch coupled to the plurality of processing circuits. The reconfigurable hardware buffer has a memory and control circuitry. A number of the variable number of input data channels is associated with an execution epoch. The stream switch streams data of the variable number of input data channels between processing circuits of the plurality of processing circuits and the reconfigurable hardware buffer during processing of the execution epoch. The control circuitry of the reconfigurable hardware buffer configures the memory to store data of the variable number of input data channels, the configuring including allocating a portion of the memory to each of the variable number of input data channels.
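The per-channel allocation described in this abstract can be pictured with a small sketch. The abstract does not say how the memory is divided, so the near-equal split below, the function name `allocate_channel_regions`, and the word-based addressing are all assumptions made for illustration; they are not taken from the patent.

```python
def allocate_channel_regions(memory_words, num_channels):
    """Split a buffer of `memory_words` words into one contiguous region per channel.

    Hypothetical policy: near-equal shares, with any remainder spread over the
    first channels. Returns a list of (start_address, size) tuples.
    """
    if num_channels < 1 or num_channels > memory_words:
        raise ValueError("need between 1 and memory_words channels")
    base, extra = divmod(memory_words, num_channels)
    regions = []
    start = 0
    for channel in range(num_channels):
        size = base + (1 if channel < extra else 0)
        regions.append((start, size))
        start += size
    return regions


# Reconfigure the same memory for two epochs with different channel counts.
epoch_a = allocate_channel_regions(1024, 4)
epoch_b = allocate_channel_regions(1024, 3)
```

Calling the function again with a different channel count stands in for reconfiguring the buffer at the start of a new execution epoch.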
-
Publication number: US11442700B2
Publication date: 2022-09-13
Application number: US16833340
Application date: 2020-03-27
Inventor: Michele Rossi, Giuseppe Desoli, Thomas Boesch, Carmine Cappetta
Abstract: A system includes an addressable memory array, one or more processing cores, and an accelerator framework coupled to the addressable memory. The accelerator framework includes a Multiply ACcumulate (MAC) hardware accelerator cluster. The MAC hardware accelerator cluster has a binary-to-residual converter, which, in operation, converts binary inputs to a residual number system. Converting a binary input to the residual number system includes a reduction modulo 2^m and a reduction modulo 2^m − 1, where m is a positive integer. A plurality of MAC hardware accelerators perform modulo 2^m multiply-and-accumulate operations and modulo 2^m − 1 multiply-and-accumulate operations using the converted binary input. A residual-to-binary converter generates a binary output based on the output of the MAC hardware accelerators.
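The arithmetic in this abstract can be illustrated in software. The sketch below is a minimal model, not the patented circuit: the function names are hypothetical, inputs are assumed to be unsigned, and the residual-to-binary step uses a standard Chinese-remainder reconstruction, which the abstract does not specify. The result is exact only while the true sum of products stays below 2^m · (2^m − 1).

```python
def to_rns(x, m):
    """Binary-to-residual conversion: (x mod 2^m, x mod (2^m - 1))."""
    mod_a = 1 << m
    mod_b = mod_a - 1
    return x & (mod_a - 1), x % mod_b


def rns_mac(weights, features, m):
    """Multiply-accumulate independently in the two residue channels."""
    mod_a = 1 << m
    mod_b = mod_a - 1
    acc_a = acc_b = 0
    for w, f in zip(weights, features):
        wa, wb = to_rns(w, m)
        fa, fb = to_rns(f, m)
        acc_a = (acc_a + wa * fa) % mod_a
        acc_b = (acc_b + wb * fb) % mod_b
    return acc_a, acc_b


def from_rns(ra, rb, m):
    """Residual-to-binary conversion (assumed CRT form).

    Writes x = rb + (2^m - 1) * k; since (2^m - 1) is congruent to -1
    modulo 2^m, k must equal (rb - ra) mod 2^m.
    """
    mod_a = 1 << m
    mod_b = mod_a - 1
    k = (rb - ra) % mod_a
    return rb + mod_b * k


# 3*2 + 5*4 + 7*6 = 68, which fits the range 16 * 15 = 240 for m = 4.
residues = rns_mac([3, 5, 7], [2, 4, 6], m=4)
dot_product = from_rns(*residues, m=4)
```

The two moduli are consecutive integers and therefore coprime, which is what makes the reconstruction unique within the stated range.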
-
Publication number: US11880759B2
Publication date: 2024-01-23
Application number: US18172979
Application date: 2023-02-22
Inventor: Giuseppe Desoli, Carmine Cappetta, Thomas Boesch, Surinder Pal Singh, Saumya Suneja
CPC classification number: G06N3/045, G06F16/2282, G06F18/217, G06N3/04, G06N3/063, G06N3/08
Abstract: Embodiments of an electronic device include an integrated circuit, a reconfigurable stream switch formed in the integrated circuit along with a plurality of convolution accelerators, and a decompression unit coupled to the reconfigurable stream switch. The decompression unit decompresses encoded kernel data in real time during operation of a convolutional neural network.
-
Publication number: US11740870B2
Publication date: 2023-08-29
Application number: US16833353
Application date: 2020-03-27
Inventor: Giuseppe Desoli, Thomas Boesch, Carmine Cappetta, Ugo Maria Iannuzzi
CPC classification number: G06F7/5443, G06N3/04
Abstract: A Multiply Accumulate (MAC) hardware accelerator includes a plurality of multipliers. The plurality of multipliers multiply a digit-serial input having a plurality of digits by a parallel input having a plurality of bits by sequentially multiplying individual digits of the digit-serial input by the plurality of bits of the parallel input. A result is generated based on the multiplication of the digit-serial input by the parallel input. An accelerator framework may include multiple MAC hardware accelerators, and may be used to implement a convolutional neural network. The MAC hardware accelerators may multiply an input weight by an input feature by sequentially multiplying individual digits of the input weight by the input feature.
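Digit-serial multiplication as described in this abstract can be modelled in a few lines. This is a sketch under assumptions the abstract does not state: unsigned operands, digits consumed least significant first, and a hypothetical function name and parameters (`digit_bits`, `num_digits`). Each loop iteration stands in for one step in which a single digit of the serial input meets the full parallel input.

```python
def digit_serial_multiply(serial_value, parallel_value, digit_bits=2, num_digits=4):
    """Multiply by feeding `serial_value` one digit at a time.

    Each digit is `digit_bits` wide; digits are taken least significant first
    (an assumption). The partial product of each digit with the full parallel
    input is shifted to the digit's weight and accumulated.
    """
    mask = (1 << digit_bits) - 1
    result = 0
    for i in range(num_digits):
        shift = i * digit_bits
        digit = (serial_value >> shift) & mask   # extract the i-th digit
        partial = digit * parallel_value         # narrow digit x wide input
        result += partial << shift               # align and accumulate
    return result


# An 8-bit weight (four 2-bit digits) multiplied by a feature value.
weight, feature = 0xB7, 45
product = digit_serial_multiply(weight, feature)
```

In this model, a wider digit means fewer sequential steps but a larger per-step multiplier, which is the trade-off a digit-serial design exposes.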