-
Publication number: US20210072894A1
Publication date: 2021-03-11
Application number: US17012501
Application date: 2020-09-04
Inventor: Nitin CHAWLA, Giuseppe DESOLI, Anuj GROVER, Thomas BOESCH, Surinder Pal SINGH, Manuj AYODHYAWASI
Abstract: A memory array arranged as a plurality of memory cells. The memory cells are configured to operate at a determined voltage. A memory management circuitry coupled to the plurality of memory cells tags a first set of the plurality of memory cells as low-voltage cells and tags a second set of the plurality of memory cells as high-voltage cells. A power source provides a low voltage to the first set of memory cells and provides a high voltage to the second set of memory cells based on the tags.
-
Publication number: US20200310761A1
Publication date: 2020-10-01
Application number: US16833340
Application date: 2020-03-27
Inventor: Michele ROSSI, Giuseppe DESOLI, Thomas BOESCH, Carmine CAPPETTA
Abstract: A system includes an addressable memory array, one or more processing cores, and an accelerator framework coupled to the addressable memory. The accelerator framework includes a Multiply ACcumulate (MAC) hardware accelerator cluster. The MAC hardware accelerator cluster has a binary-to-residual converter, which, in operation, converts binary inputs to a residual number system. Converting a binary input to the residual number system includes a reduction modulo 2^m and a reduction modulo 2^m − 1, where m is a positive integer. A plurality of MAC hardware accelerators perform modulo 2^m multiply-and-accumulate operations and modulo 2^m − 1 multiply-and-accumulate operations using the converted binary input. A residual-to-binary converter generates a binary output based on the output of the MAC hardware accelerators.
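The arithmetic described in this abstract can be illustrated with a short software sketch. It is not the patented circuit: the function names (`to_rns`, `rns_mac`, `from_rns`), the use of unsigned inputs, and the Chinese-remainder reconstruction are assumptions chosen for illustration. The sketch reduces each operand modulo 2^m and modulo 2^m − 1, accumulates products independently in each residue channel, and recombines the two residues into a binary result.

```python
def to_rns(x, m):
    """Binary-to-residue step: reduce x modulo 2^m and modulo 2^m - 1."""
    mask = (1 << m) - 1
    r_pow2 = x & mask   # x mod 2^m is simply the low m bits
    r_mers = x % mask   # x mod (2^m - 1)
    return r_pow2, r_mers


def rns_mac(pairs, m):
    """Multiply-accumulate (x, w) pairs separately in each residue channel."""
    mod_a = 1 << m
    mod_b = mod_a - 1
    acc_a = acc_b = 0
    for x, w in pairs:
        xa, xb = to_rns(x, m)
        wa, wb = to_rns(w, m)
        acc_a = (acc_a + xa * wa) % mod_a
        acc_b = (acc_b + xb * wb) % mod_b
    return acc_a, acc_b


def from_rns(r_a, r_b, m):
    """Residue-to-binary step via the Chinese remainder theorem.

    Because 2^m is congruent to 1 modulo 2^m - 1, the modular inverse
    needed by the reconstruction is 1, so no inverse has to be computed.
    """
    mod_a = 1 << m
    mod_b = mod_a - 1
    k = (r_b - r_a) % mod_b
    return r_a + mod_a * k


# Illustrative values only; the true result must be below 2^m * (2^m - 1).
pairs = [(3, 5), (7, 11), (200, 100)]
result = from_rns(*rns_mac(pairs, 8), 8)
```

The two moduli are coprime for any m, so the result is unique as long as the true sum of products is smaller than 2^m · (2^m − 1); larger sums wrap around.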
-
Publication number: US20190266479A1
Publication date: 2019-08-29
Application number: US16280991
Application date: 2019-02-20
Inventor: Surinder Pal SINGH, Thomas BOESCH, Giuseppe DESOLI
Abstract: Embodiments of a device include an integrated circuit, a reconfigurable stream switch formed in the integrated circuit along with a plurality of convolution accelerators and an arithmetic unit coupled to the reconfigurable stream switch. The arithmetic unit has at least one input and at least one output. The at least one input is arranged to receive streaming data passed through the reconfigurable stream switch, and the at least one output is arranged to stream resultant data through the reconfigurable stream switch. The arithmetic unit also has a plurality of data paths. At least one of the plurality of data paths is solely dedicated to performance of operations that accelerate an activation function represented in the form of a piece-wise second order polynomial approximation.
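As a rough illustration of a piece-wise second-order polynomial approximation of an activation function, the sketch below splits an input range into equal segments, fits one parabola per segment, and evaluates it in Horner form. The fitting method (interpolation through three points), the choice of tanh, the range, the segment count, and all function names are assumptions for illustration; the abstract does not specify how coefficients are obtained.

```python
import math


def fit_quadratic(f, lo, hi):
    """Coefficients (a, b, c) of the parabola through f at lo, midpoint, hi."""
    mid = 0.5 * (lo + hi)
    y0, y1, y2 = f(lo), f(mid), f(hi)
    h = mid - lo
    a = (y0 - 2.0 * y1 + y2) / (2.0 * h * h)
    b = (y2 - y0) / (2.0 * h) - 2.0 * a * mid
    c = y1 - a * mid * mid - b * mid
    return a, b, c


def build_table(f, lo, hi, segments):
    """One (a, b, c) triple per equal-width segment of [lo, hi]."""
    width = (hi - lo) / segments
    return [fit_quadratic(f, lo + i * width, lo + (i + 1) * width)
            for i in range(segments)]


def evaluate(table, lo, hi, x):
    """Look up the segment for x (clamped to [lo, hi]) and apply a*x^2 + b*x + c."""
    x = min(max(x, lo), hi)
    width = (hi - lo) / len(table)
    index = min(int((x - lo) / width), len(table) - 1)
    a, b, c = table[index]
    return (a * x + b) * x + c  # Horner form: two multiplies, two adds


# Hypothetical configuration: tanh on [-4, 4] with 16 segments.
table = build_table(math.tanh, -4.0, 4.0, 16)
approx = evaluate(table, -4.0, 4.0, 0.3)
```

The Horner form is why a dedicated data path is plausible here: each output needs only a coefficient lookup, two multiplications, and two additions.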
-
Publication number: US20180189641A1
Publication date: 2018-07-05
Application number: US15423279
Application date: 2017-02-02
Inventor: Thomas BOESCH, Giuseppe DESOLI
Abstract: Embodiments are directed towards a hardware accelerator engine that supports efficient mapping of convolutional stages of deep neural network algorithms. The hardware accelerator engine includes a plurality of convolution accelerators, and each one of the plurality of convolution accelerators includes a kernel buffer, a feature line buffer, and a plurality of multiply-accumulate (MAC) units. The MAC units are arranged to multiply and accumulate data received from both the kernel buffer and the feature line buffer. The hardware accelerator engine also includes at least one input bus coupled to an output bus port of a stream switch, at least one output bus coupled to an input bus port of the stream switch, or at least one input bus and at least one output bus hard wired to respective output bus and input bus ports of the stream switch.
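The interplay of kernel buffer, feature line buffer, and MAC units can be sketched in software as a streaming 2D convolution. This is a simplified model under several assumptions not stated in the abstract: a single square kernel, stride 1, no padding, and one accumulator; the name `conv2d_streaming` is hypothetical. Feature rows arrive one at a time, the line buffer keeps only the most recent k rows, and each output value is a multiply-accumulate over the kernel and the buffered window.

```python
from collections import deque


def conv2d_streaming(rows, kernel):
    """Convolve a row-by-row feature stream with a k x k kernel.

    rows   : iterable of equal-length lists (feature data, streamed in)
    kernel : k x k list of lists (contents of the kernel buffer)
    As is usual for CNNs, the kernel is applied without flipping.
    """
    k = len(kernel)
    line_buffer = deque(maxlen=k)  # holds only the last k feature rows
    out = []
    for row in rows:
        line_buffer.append(list(row))
        if len(line_buffer) < k:
            continue  # not enough rows buffered yet
        out_row = []
        for col in range(len(row) - k + 1):
            acc = 0
            for i in range(k):
                for j in range(k):
                    # one multiply-accumulate per kernel tap
                    acc += kernel[i][j] * line_buffer[i][col + j]
            out_row.append(acc)
        out.append(out_row)
    return out


feature_rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
outputs = conv2d_streaming(feature_rows, [[1, 0], [0, 1]])
```

Keeping only k rows is the point of a line buffer: each feature row is fetched once and reused for every output row whose window overlaps it.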
-
Publication number: US20180189229A1
Publication date: 2018-07-05
Application number: US15423272
Application date: 2017-02-02
Inventor: Giuseppe DESOLI, Thomas BOESCH, Nitin CHAWLA, Surinder Pal SINGH, Elio GUIDETTI, Fabio Giuseppe DE AMBROGGI, Tommaso MAJO, Paolo Sergio ZAMBOTTI
Abstract: Embodiments are directed towards a system on chip (SoC) that implements a deep convolutional network heterogeneous architecture. The SoC includes a system bus, a plurality of addressable memory arrays coupled to the system bus, at least one applications processor core coupled to the system bus, and a configurable accelerator framework coupled to the system bus. The configurable accelerator framework is an image and deep convolutional neural network (DCNN) co-processing system. The SoC also includes a plurality of digital signal processors (DSPs) coupled to the system bus, wherein the plurality of DSPs coordinate functionality with the configurable accelerator framework to execute the DCNN.
-
36.
Publication number: US20240354269A1
Publication date: 2024-10-24
Application number: US18304938
Application date: 2023-04-21
Applicant: STMicroelectronics International N.V.
Inventor: Antonio DE VITA, Thomas BOESCH, Giuseppe DESOLI
IPC: G06F13/374, G06F9/50
CPC classification number: G06F13/374, G06F9/5077, G06F2209/5011
Abstract: A stream switch includes a data router, configuration registers, and arbitration logic. The data router has a plurality of input ports, each having a plurality of associated virtual input channels, and a plurality of output ports, each having a plurality of associated virtual output channels. The data router transmits data streams from input ports to one or more output ports of the plurality of output ports. The configuration registers store configuration data associated with the virtual output channels of the respective output ports of the plurality of output ports. The stored configuration data identifies a source input port and virtual input channel ID associated with the virtual output channel of the output port. The arbitration logic allocates bandwidth of the data router based on request signals associated with virtual input channels of the input ports and the configuration data associated with the virtual output channels.
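A toy model can make the relationship between the configuration registers and the arbitration logic concrete. Everything below is an assumption for illustration: the class name `StreamSwitch`, the dictionary layout of the configuration data, and the round-robin policy (the abstract says only that bandwidth is allocated based on request signals and configuration data). The sketch also ignores conflicts where two output ports want different virtual channels of the same input port in the same cycle.

```python
class StreamSwitch:
    """Toy model: per-output-port round-robin over virtual output channels."""

    def __init__(self, routes):
        # Configuration registers, modeled as a dict:
        # (out_port, out_vc) -> (source in_port, source in_vc)
        self.routes = dict(routes)
        self.last_grant = {}  # out_port -> virtual channel granted last

    def arbitrate(self, requests):
        """Grant one virtual output channel per output port for this cycle.

        requests: set of (in_port, in_vc) pairs that have data ready.
        Returns {out_port: (out_vc, in_port, in_vc)}.
        """
        candidates = {}
        for (out_port, out_vc), source in self.routes.items():
            if source in requests:
                candidates.setdefault(out_port, []).append(out_vc)

        grants = {}
        for out_port, vcs in candidates.items():
            vcs.sort()
            last = self.last_grant.get(out_port, -1)
            # Round robin: next channel after the last grant, else wrap around.
            chosen = next((vc for vc in vcs if vc > last), vcs[0])
            self.last_grant[out_port] = chosen
            grants[out_port] = (chosen,) + self.routes[(out_port, chosen)]
        return grants


# Hypothetical setup: two virtual channels of output port 0 fed by inputs 1 and 2.
switch = StreamSwitch({(0, 0): (1, 0), (0, 1): (2, 0)})
first = switch.arbitrate({(1, 0), (2, 0)})
second = switch.arbitrate({(1, 0), (2, 0)})
```

With both sources requesting, successive calls alternate between the two virtual channels, so each stream receives half of the output port's bandwidth in this model.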
-
Publication number: US20240330660A1
Publication date: 2024-10-03
Application number: US18426128
Application date: 2024-01-29
Applicant: STMicroelectronics International N.V.
Inventor: Carmine CAPPETTA, Surinder Pal SINGH, Giuseppe DESOLI, Thomas BOESCH, Michele ROSSI
IPC: G06N3/0464, G06N3/063
CPC classification number: G06N3/0464, G06N3/063
Abstract: A neural network includes an internal storage unit. The internal storage unit stores feature data received from a memory external to the neural network. The internal storage unit reads the feature data to a hardware accelerator of the neural network. The internal storage unit adapts a storage pattern of the feature data and a read pattern of the feature data to enhance the efficiency of the hardware accelerator.
-
38.
Publication number: US20240330677A1
Publication date: 2024-10-03
Application number: US18192589
Application date: 2023-03-29
Applicant: STMicroelectronics International N.V.
Inventor: Carmine CAPPETTA, Paolo Sergio ZAMBOTTI, Thomas BOESCH, Giuseppe DESOLI
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: A neural network is able to reconfigure hardware accelerators on-the-fly without stopping downstream hardware accelerators. The neural network inserts a reconfiguration tag into the stream of feature data. If the reconfiguration tag matches an identification of a hardware accelerator, a reconfiguration process is initiated. Upstream hardware accelerators are paused while downstream hardware accelerators continue to operate. An epoch controller reconfigures the hardware accelerator via a bus. Normal operation of the neural network then resumes.
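The tag-matching idea can be sketched as a chain of Python generators. This is a loose software analogy, not the hardware mechanism: the names `ReconfigTag` and `Accelerator`, the single `gain` parameter standing in for an accelerator configuration, and the choice to consume the tag at the matching accelerator are all assumptions. The epoch controller's bus write is reduced to a plain attribute assignment, and pausing of upstream units is not modeled.

```python
from dataclasses import dataclass


@dataclass
class ReconfigTag:
    target_id: int  # identification of the accelerator to reconfigure
    new_gain: int   # hypothetical new configuration value


class Accelerator:
    def __init__(self, acc_id, gain):
        self.acc_id = acc_id
        self.gain = gain

    def run(self, stream):
        """Process a stream of feature values and in-band reconfiguration tags."""
        for item in stream:
            if isinstance(item, ReconfigTag):
                if item.target_id == self.acc_id:
                    # Stand-in for the epoch controller rewriting registers.
                    self.gain = item.new_gain
                else:
                    yield item  # not for this unit: forward the tag downstream
                continue
            yield item * self.gain


# Two chained accelerators; the tag targets the second one.
stream = [1, 2, ReconfigTag(target_id=2, new_gain=10), 3]
first = Accelerator(acc_id=1, gain=2)
second = Accelerator(acc_id=2, gain=1)
results = list(second.run(first.run(stream)))
```

Because the tag travels in the same stream as the data, values ahead of it are processed with the old configuration and values behind it with the new one, without any explicit synchronization between the units.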
-
Publication number: US20240330399A1
Publication date: 2024-10-03
Application number: US18194108
Application date: 2023-03-31
Applicant: STMicroelectronics International N.V.
Inventor: Carmine CAPPETTA, Surinder Pal SINGH, Giuseppe DESOLI, Thomas BOESCH
IPC: G06F17/15
CPC classification number: G06F17/15
Abstract: A neural network includes an internal storage unit. The internal storage unit stores feature data received from a memory external to the neural network. The internal storage unit reads the feature data to a hardware accelerator of the neural network. The internal storage unit adapts a storage pattern of the feature data and a read pattern of the feature data to enhance the efficiency of the hardware accelerator.
-
40.
Publication number: US20240281646A1
Publication date: 2024-08-22
Application number: US18192629
Application date: 2023-03-29
Applicant: STMicroelectronics International N.V.
Inventor: Michele ROSSI, Giuseppe DESOLI, Thomas BOESCH
IPC: G06N3/063, G06F17/15, G06F17/16, G06N3/0464
CPC classification number: G06N3/063, G06F17/153, G06F17/16, G06N3/0464
Abstract: A hardware accelerator includes a plurality of functional circuits, a stream switch, and a plurality of stream engines. The stream engines are coupled to the functional circuits via the stream switch, and in operation, generate data streaming requests to stream data to and from the functional circuits. The functional circuits include at least one convolutional cluster, which includes a plurality of processing elements coupled together via a reconfigurable crossbar switch. The reconfigurable crossbar switch is coupled to the stream switch, and in operation, streams data to, from, and between processing elements of the processing cluster.
-