-
21.
Publication number: US20240281646A1
Publication date: 2024-08-22
Application number: US18192629
Application date: 2023-03-29
Applicant: STMicroelectronics International N.V.
Inventor: Michele ROSSI , Giuseppe DESOLI , Thomas BOESCH
IPC: G06N3/063 , G06F17/15 , G06F17/16 , G06N3/0464
CPC classification number: G06N3/063 , G06F17/153 , G06F17/16 , G06N3/0464
Abstract: A hardware accelerator includes a plurality of functional circuits, a stream switch, and a plurality of stream engines. The stream engines are coupled to the functional circuits via the stream switch, and in operation, generate data streaming requests to stream data to and from the functional circuits. The functional circuits include at least one convolutional cluster, which includes a plurality of processing elements coupled together via a reconfigurable crossbar switch. The reconfigurable crossbar switch is coupled to the stream switch, and in operation, streams data to, from, and between processing elements of the processing cluster.
-
22.
Publication number: US20240220777A1
Publication date: 2024-07-04
Application number: US18176315
Application date: 2023-02-28
Inventor: Francesca GIRARDI , Giuseppe DESOLI , Ruggero SUSELLA , Thomas BOESCH , Paolo Sergio ZAMBOTTI
IPC: G06N3/0464
CPC classification number: G06N3/0464
Abstract: A hardware accelerator includes functional circuits and streaming engines. An interface is coupled to the plurality of streaming engines. The interface, in operation, performs stream cipher operations on data words associated with data streaming requests. The performing of a stream cipher operation on a data word includes generating a mask based on an encryption ID associated with a streaming engine of the plurality of streaming engines and an address associated with the data word, and XORing the generated mask with the data word. The hardware accelerator may include configuration registers to store configuration information indicating a respective security state associated with functional circuits and streaming engine of the hardware accelerator, which may be used to control performance of operations by the hardware accelerator.
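Note: the stream cipher step described in this abstract (derive a mask from an encryption ID and a word address, then XOR it with the data word) can be sketched as follows. This is only an illustration under stated assumptions: the 32-bit word width, the function names `make_mask` and `cipher_word`, and the use of BLAKE2b as the mask generator are hypothetical stand-ins; the abstract does not say how the mask is generated.

```python
import hashlib

WORD_BITS = 32  # assumed data word width; not stated in the abstract


def make_mask(encryption_id: int, address: int) -> int:
    """Derive a per-word mask from an encryption ID and a word address.

    BLAKE2b is a stand-in; the actual mask generator is not specified.
    """
    material = encryption_id.to_bytes(4, "little") + address.to_bytes(8, "little")
    digest = hashlib.blake2b(material, digest_size=WORD_BITS // 8).digest()
    return int.from_bytes(digest, "little")


def cipher_word(word: int, encryption_id: int, address: int) -> int:
    """XOR the data word with the mask; the same call encrypts and decrypts."""
    return (word ^ make_mask(encryption_id, address)) & ((1 << WORD_BITS) - 1)
```

Because XOR is its own inverse, applying `cipher_word` twice with the same encryption ID and address returns the original word.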
-
Publication number: US20240012871A1
Publication date: 2024-01-11
Application number: US17859769
Application date: 2022-07-07
Inventor: Antonio DE VITA , Thomas BOESCH , Giuseppe DESOLI
CPC classification number: G06F17/15 , G06F7/5443
Abstract: A convolutional accelerator includes a feature line buffer, a kernel buffer, a multiply-accumulate cluster, and iteration control circuitry. The convolutional accelerator, in operation, convolves a kernel with a streaming feature data tensor. The convolving includes decomposing the kernel into a plurality of sub-kernels and iteratively convolving the sub-kernels with respective sub-tensors of the streamed feature data tensor. The iteration control circuitry, in operation, defines respective windows of the streamed feature data tensors, the windows corresponding to the sub-tensors.
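Note: the decomposition described in this abstract rests on the fact that a sliding dot product with a large kernel equals the sum of sliding dot products of its sub-kernels, each applied to a correspondingly shifted window of the feature data. The sketch below shows that identity in plain Python. The function names, the rectangular tiling of the kernel, and the use of a valid-mode sliding dot product (the usual CNN "convolution") are assumptions made for illustration, not details taken from the patent.

```python
def correlate2d(feature, kernel):
    """Valid-mode 2D sliding dot product of a kernel over a feature map."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(feature) - kh + 1, len(feature[0]) - kw + 1
    return [
        [
            sum(feature[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(ow)
        ]
        for i in range(oh)
    ]


def correlate2d_decomposed(feature, kernel, sub_h, sub_w):
    """Same result, computed by iterating over sub-kernels of at most sub_h x sub_w."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(feature) - kh + 1, len(feature[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for r0 in range(0, kh, sub_h):
        for c0 in range(0, kw, sub_w):
            # Sub-kernel starting at kernel offset (r0, c0).
            sub_k = [row[c0:c0 + sub_w] for row in kernel[r0:r0 + sub_h]]
            sh, sw = len(sub_k), len(sub_k[0])
            # Window of the feature map that this sub-kernel must see.
            window = [row[c0:c0 + ow + sw - 1]
                      for row in feature[r0:r0 + oh + sh - 1]]
            partial = correlate2d(window, sub_k)
            # Accumulate the partial result into the output.
            for i in range(oh):
                for j in range(ow):
                    out[i][j] += partial[i][j]
    return out
```

For example, a 3×3 kernel split into sub-kernels of at most 2×2 yields four iterations whose accumulated output matches the direct computation.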
-
Publication number: US20230418559A1
Publication date: 2023-12-28
Application number: US17847817
Application date: 2022-06-23
Inventor: Michele ROSSI , Thomas BOESCH , Giuseppe DESOLI
Abstract: A convolutional accelerator includes a feature line buffer, a kernel buffer, a multiply-accumulate cluster, and mode control circuitry. In a first mode of operation, the mode control circuitry stores feature data in a feature line buffer and stores kernel data in a kernel buffer. The data stored in the buffers is transferred to the MAC cluster of the convolutional accelerator for processing. In a second mode of operation the mode control circuitry stores feature data in the kernel buffer and stores kernel data in the feature line buffer. The data stored in the buffers is transferred to the MAC cluster of the convolutional accelerator for processing. The second mode of operation may be employed to efficiently process 1×N kernels, where N is an integer greater than or equal to 1.
-
Publication number: US20210081773A1
Publication date: 2021-03-18
Application number: US17023144
Application date: 2020-09-16
Inventor: Nitin CHAWLA , Giuseppe DESOLI , Manuj AYODHYAWASI , Thomas BOESCH , Surinder Pal SINGH
IPC: G06N3/063 , G06F1/08 , G06F1/324 , G06F9/50 , G06N3/08 , G06F1/3228 , G06F1/3296
Abstract: Systems and devices are provided to increase computational and/or power efficiency for one or more neural networks via a computationally driven closed-loop dynamic clock control. A clock frequency control word is generated based on information indicative of a current frame execution rate of a processing task of the neural network and a reference clock signal. A clock generator generates the clock signal of the neural network based on the clock frequency control word. A reference frequency may be used to generate the clock frequency control word, and the reference frequency may be based on information indicative of a sparsity of data of a training frame.
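Note: one simple way to picture the closed loop in this abstract is a controller that nudges the clock frequency control word according to the gap between a target and a measured frame execution rate. The sketch below is a hypothetical proportional update only; `next_control_word`, `target_fps`, `gain`, and the 1 to 255 word range are all assumptions, and the abstract does not describe the actual control law.

```python
def next_control_word(control_word, measured_fps, target_fps,
                      gain=4, min_word=1, max_word=255):
    """Return an updated clock frequency control word (hypothetical scheme).

    A frame rate below the target raises the word (faster clock); a frame
    rate above the target lowers it. The result is clamped to the word range.
    """
    error = target_fps - measured_fps
    updated = control_word + round(gain * error)
    return max(min_word, min(max_word, updated))
```

Calling this once per measured frame closes the loop: the clock generator consumes the new word, the frame rate changes, and the next measurement feeds the following update.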
-
Publication number: US20200310758A1
Publication date: 2020-10-01
Application number: US16833353
Application date: 2020-03-27
Inventor: Giuseppe DESOLI , Thomas BOESCH , Carmine CAPPETTA , Ugo Maria IANNUZZI
Abstract: A Multiply-Accumulate (MAC) hardware accelerator includes a plurality of multipliers. The plurality of multipliers multiply a digit-serial input having a plurality of digits by a parallel input having a plurality of bits by sequentially multiplying individual digits of the digit-serial input by the plurality of bits of the parallel input. A result is generated based on the multiplication of the digit-serial input by the parallel input. An accelerator framework may include multiple MAC hardware accelerators, and may be used to implement a convolutional neural network. The MAC hardware accelerators may multiply an input weight by an input feature by sequentially multiplying individual digits of the input weight by the input feature.
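Note: digit-serial multiplication, as described in this abstract, consumes one operand a digit at a time and multiplies each digit by the full parallel operand, accumulating shifted partial products. The sketch below illustrates the arithmetic only. It assumes unsigned operands, least-significant-digit-first order, and a 2-bit digit by default; the function name and parameters are hypothetical, and none of these choices come from the patent.

```python
def digit_serial_multiply(serial_value, parallel_value, digit_bits=2, num_digits=4):
    """Multiply by feeding serial_value one digit per step (unsigned, LSD first)."""
    digit_mask = (1 << digit_bits) - 1
    acc = 0
    for k in range(num_digits):
        # Extract the k-th digit of the serial operand.
        digit = (serial_value >> (k * digit_bits)) & digit_mask
        # Multiply the digit by the whole parallel operand, then weight it
        # by the digit's position before accumulating.
        acc += (digit * parallel_value) << (k * digit_bits)
    return acc
```

With four 2-bit digits the serial operand covers 8 bits, so `digit_serial_multiply(182, 13)` takes four steps and matches the ordinary product 182 × 13.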
-
Publication number: US20190266485A1
Publication date: 2019-08-29
Application number: US16280960
Application date: 2019-02-20
Inventor: Surinder Pal SINGH , Giuseppe DESOLI , Thomas BOESCH
Abstract: Embodiments of a device include an integrated circuit, a reconfigurable stream switch formed in the integrated circuit, and an arithmetic unit coupled to the reconfigurable stream switch. The arithmetic unit has a plurality of inputs and at least one output, and the arithmetic unit is solely dedicated to performance of a plurality of parallel operations. Each one of the plurality of parallel operations carries out a portion of the formula: output=AX+BY+C.
-
Publication number: US20180189642A1
Publication date: 2018-07-05
Application number: US15423284
Application date: 2017-02-02
Inventor: Thomas BOESCH , Giuseppe DESOLI
Abstract: Embodiments are directed towards a configurable accelerator framework device that includes a stream switch and a plurality of convolution accelerators. The stream switch has a plurality of input ports and a plurality of output ports. Each of the input ports is configurable at run time to unidirectionally pass data to any one or more of the output ports via a stream link. Each one of the plurality of convolution accelerators is configurable at run time to unidirectionally receive input data via at least two of the plurality of stream switch output ports, and each one of the plurality of convolution accelerators is further configurable at run time to unidirectionally communicate output data via an input port of the stream switch.
-
29.
Publication number: US20240281397A1
Publication date: 2024-08-22
Application number: US18192631
Application date: 2023-03-29
Applicant: STMicroelectronics International N.V.
Inventor: Michele ROSSI , Giuseppe DESOLI , Thomas BOESCH
CPC classification number: G06F13/4022 , G06F13/1668
Abstract: A hardware accelerator includes processing elements of a neural network, each processing element having a memory; a stream switch; stream engines coupled to functional circuits via the stream switch, wherein the stream engines, in operation, generate data streaming requests to stream data to and from functional circuits of the plurality of functional circuits; a first system bus interface coupled to the stream engines; a second system bus interface coupled to the processing elements; and mode control circuitry, which, in operation, sets respective modes of operation for the plurality of processing elements. The modes of operation include: a compute mode of operation in which the processing element performs computing operations using the memory associated with the processing element; and a memory mode of operation in which the memory associated with the processing element performs memory operations, bypassing the stream switch, via the second system bus interface.
-
Publication number: US20230153621A1
Publication date: 2023-05-18
Application number: US18156704
Application date: 2023-01-19
Inventor: Surinder Pal SINGH , Giuseppe DESOLI , Thomas BOESCH
CPC classification number: G06N3/08 , G06N20/00 , G06F17/11 , G06N3/063 , G06F9/3001 , G06F9/30032 , G06F9/30036 , G06N3/045
Abstract: An integrated circuit includes a reconfigurable stream switch and an arithmetic circuit. The stream switch, in operation, streams data. The arithmetic circuit has a plurality of inputs coupled to the reconfigurable stream switch. In operation, the arithmetic circuit generates an output according to AX+BY+C, where A, B and C are vector or scalar constants, and X and Y are data streams streamed to the arithmetic circuit through the reconfigurable stream switch.
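Note: the AX+BY+C operation in this abstract can be pictured as an element-wise affine combination of two data streams. The sketch below models the streams as Python iterables and is an illustration only; the name `affine_stream` is hypothetical, and treating a vector constant as repeating cyclically along the stream is an assumption, not something the abstract states.

```python
from itertools import cycle


def affine_stream(xs, ys, a, b, c):
    """Yield a*x + b*y + c for each pair of items from streams xs and ys.

    a, b and c may be scalars or vectors (lists/tuples); vectors are
    assumed to repeat cyclically along the stream.
    """
    def as_cycle(value):
        return cycle(value) if isinstance(value, (list, tuple)) else cycle([value])

    for x, y, ai, bi, ci in zip(xs, ys, as_cycle(a), as_cycle(b), as_cycle(c)):
        yield ai * x + bi * y + ci
```

For instance, `list(affine_stream([1, 2, 3], [4, 5, 6], 2, 3, 1))` gives `[15, 20, 25]`.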
-