-
公开(公告)号:US20190042160A1
公开(公告)日:2019-02-07
申请号:US16147024
申请日:2018-09-28
Applicant: Intel Corporation
Inventor: Raghavan KUMAR , Phil KNAG , Gregory K. CHEN , Huseyin Ekin SUMBUL , Sasikanth MANIPATRUNI , Amrita MATHURIYA , Abhishek SHARMA , Ram KRISHNAMURTHY , Ian A. YOUNG
IPC: G06F3/06 , G04F10/00 , G11C11/419 , G11C13/00 , G11C11/418
Abstract: A memory circuit has compute-in-memory (CIM) circuitry that performs computations based on time-to-digital conversion (TDC). The memory circuit includes an array of memory cells addressable with column address and row address. The memory circuit includes CIM sense circuitry to sense a voltage for multiple memory cells triggered together. The CIM sense circuitry including a TDC circuit to convert a time for discharge of the multiple memory cells to a digital value. A processing circuit determines a value of the multiple memory cells based on the digital value.
-
公开(公告)号:US20230401434A1
公开(公告)日:2023-12-14
申请号:US18237887
申请日:2023-08-24
Applicant: Intel Corporation
Inventor: Ram KRISHNAMURTHY , Gregory K. CHEN , Raghavan KUMAR , Phil KNAG , Huseyin Ekin SUMBUL
CPC classification number: G06N3/063 , G06F7/523 , G06F9/3893 , G06F9/30098 , G06F7/5095 , G06F7/5443
Abstract: An apparatus is described. The apparatus includes a long short term memory (LSTM) circuit having a multiply accumulate circuit (MAC). The MAC circuit has circuitry to rely on a stored product term rather than explicitly perform a multiplication operation to determine the product term if an accumulation of differences between consecutive, preceding input values has not reached a threshold.
-
公开(公告)号:US20220165735A1
公开(公告)日:2022-05-26
申请号:US17670248
申请日:2022-02-11
Applicant: Intel Corporation
Inventor: Abhishek SHARMA , Noriyuki SATO , Sarah ATANASOV , Huseyin Ekin SUMBUL , Gregory K. CHEN , Phil KNAG , Ram KRISHNAMURTHY , Hui Jae YOO , Van H. LE
IPC: H01L27/108 , H01L27/12 , G11C11/4096
Abstract: Examples herein relate to a memory device comprising an eDRAM memory cell, the eDRAM memory cell can include a write circuit formed at least partially over a storage cell and a read circuit formed at least partially under the storage cell; a compute near memory device bonded to the memory device; a processor; and an interface from the memory device to the processor. In some examples, circuitry is included to provide an output of the memory device to emulate output read rate of an SRAM memory device comprises one or more of: a controller, a multiplexer, or a register. Bonding of a surface of the memory device can be made to a compute near memory device or other circuitry. In some examples, a layer with read circuitry can be bonded to a layer with storage cells. Any layers can be bonded together using techniques described herein.
-
14.
公开(公告)号:US20190138893A1
公开(公告)日:2019-05-09
申请号:US16147176
申请日:2018-09-28
Applicant: Intel Corporation
Inventor: Abhishek SHARMA , Jack T. KAVALIEROS , Ian A. YOUNG , Ram KRISHNAMURTHY , Sasikanth MANIPATRUNI , Uygar AVCI , Gregory K. CHEN , Amrita MATHURIYA , Raghavan KUMAR , Phil KNAG , Huseyin Ekin SUMBUL , Nazila HARATIPOUR , Van H. LE
IPC: G06N3/063 , H01L27/108 , H01L27/11 , H01L27/11502 , G06N3/04 , G06F17/16
Abstract: An apparatus is described. The apparatus includes a compute-in-memory (CIM) circuit for implementing a neural network disposed on a semiconductor chip. The CIM circuit includes a mathematical computation circuit coupled to a memory array. The memory array includes an embedded dynamic random access memory (eDRAM) memory array. Another apparatus is described. The apparatus includes a compute-in-memory (CIM) circuit for implementing a neural network disposed on a semiconductor chip. The CIM circuit includes a mathematical computation circuit coupled to a memory array. The mathematical computation circuit includes a switched capacitor circuit. The switched capacitor circuit includes a back-end-of-line (BEOL) capacitor coupled to a thin film transistor within the metal/dielectric layers of the semiconductor chip. Another apparatus is described. The apparatus includes a compute-in-memory (CIM) circuit for implementing a neural network disposed on a semiconductor chip. The CIM circuit includes a mathematical computation circuit coupled to a memory array. The mathematical computation circuit includes an accumulation circuit. The accumulation circuit includes a ferroelectric BEOL capacitor to store a value to be accumulated with other values stored by other ferroelectric BEOL capacitors.
-
公开(公告)号:US20190102359A1
公开(公告)日:2019-04-04
申请号:US16147036
申请日:2018-09-28
Applicant: Intel Corporation
Inventor: Phil KNAG , Gregory K. CHEN , Raghavan KUMAR , Huseyin Ekin SUMBUL , Abhishek Sharma , Sasikanth Manipatruni , Amrita Mathuriya , Ram A. Krishnamurthy , Ian A. Young
IPC: G06F17/16 , G11C11/419 , G11C11/418 , G11C7/10 , G06F9/30 , G11C11/56
Abstract: A binary CIM circuit enables all memory cells in a memory array to be effectively accessible simultaneously for computation using fixed pulse widths on the wordlines and equal capacitance on the bitlines. The fixed pulse widths and equal capacitance ensure that a minimum voltage drop in the bitline represents one least significant bit (LSB) so that the bitline voltage swing remains safely within the maximum allowable range. The binary CIM circuit maximizes the effective memory bandwidth of a memory array for a given maximum voltage range of bitline voltage.
-
公开(公告)号:US20190042199A1
公开(公告)日:2019-02-07
申请号:US16147004
申请日:2018-09-28
Applicant: Intel Corporation
Inventor: Huseyin Ekin SUMBUL , Phil KNAG , Gregory K. CHEN , Raghavan KUMAR , Abhishek SHARMA , Sasikanth MANIPATRUNI , Amrita MATHURIYA , Ram KRISHNAMURTHY , Ian A. YOUNG
Abstract: Compute-in memory circuits and techniques are described. In one example, a memory device includes an array of memory cells, the array including multiple sub-arrays. Each of the sub-arrays receives a different voltage. The memory device also includes capacitors coupled with conductive access lines of each of the multiple sub-arrays and circuitry coupled with the capacitors, to share charge between the capacitors in response to a signal. In one example, computing device, such as a machine learning accelerator, includes a first memory array and a second memory array. The computing device also includes an analog processor circuit coupled with the first and second memory arrays to receive first analog input voltages from the first memory array and second analog input voltages from the second memory array and perform one or more operations on the first and second analog input voltages, and output an analog output voltage.
-
公开(公告)号:US20230334006A1
公开(公告)日:2023-10-19
申请号:US18212079
申请日:2023-06-20
Applicant: Intel Corporation
Inventor: Huseyin Ekin SUMBUL , Gregory K. CHEN , Phil KNAG , Raghavan KUMAR , Ram KRISHNAMURTHY
CPC classification number: G06F15/8046 , G06F17/153 , G06N3/063
Abstract: A compute near memory (CNM) convolution accelerator enables a convolutional neural network (CNN) to use dedicated acceleration to achieve efficient in-place convolution operations with less impact on memory and energy consumption. A 2D convolution operation is reformulated as 1D row-wise convolution. The 1D row-wise convolution enables the CNM convolution accelerator to process input activations row-by-row, while using the weights one-by-one. Lightweight access circuits provide the ability to stream both weights and input rows as vectors to MAC units, which in turn enables modules of the CNM convolution accelerator to implement convolution for both [1×1] and chosen [n×n] sized filters.
-
公开(公告)号:US20230297819A1
公开(公告)日:2023-09-21
申请号:US18201291
申请日:2023-05-24
Applicant: Intel Corporation
Inventor: Ram KRISHNAMURTHY , Gregory K. CHEN , Raghavan KUMAR , Phil KNAG , Huseyin Ekin SUMBUL , Deepak Vinayak KADETOTAD
CPC classification number: G06N3/063 , G06F17/16 , G06N3/04 , G06F7/5443
Abstract: An apparatus is described. The apparatus includes a circuit to process a binary neural network. The circuit includes an array of processing cores, wherein, processing cores of the array of processing cores are to process different respective areas of a weight matrix of the binary neural network. The processing cores each include add circuitry to add only those weights of an i layer of the binary neural network that are to be effectively multiplied by a non zero nodal output of an i−1 layer of the binary neural network.
-
公开(公告)号:US20200034148A1
公开(公告)日:2020-01-30
申请号:US16586975
申请日:2019-09-28
Applicant: Intel Corporation
Inventor: Huseyin Ekin SUMBUL , Gregory K. CHEN , Phil KNAG , Raghavan KUMAR , Ram KRISHNAMURTHY
Abstract: A compute near memory (CNM) convolution accelerator enables a convolutional neural network (CNN) to use dedicated acceleration to achieve efficient in-place convolution operations with less impact on memory and energy consumption. A 2D convolution operation is reformulated as 1D row-wise convolution. The 1D row-wise convolution enables the CNM convolution accelerator to process input activations row-by-row, while using the weights one-by-one. Lightweight access circuits provide the ability to stream both weights and input row as vectors to MAC units, which in turn enables modules of the CNM convolution accelerator to implement convolution for both [1×1] and chosen [n×n] sized filters.
-
公开(公告)号:US20190065151A1
公开(公告)日:2019-02-28
申请号:US16145569
申请日:2018-09-28
Applicant: Intel Corporation
Inventor: Gregory K. CHEN , Raghavan KUMAR , Huseyin Ekin SUMBUL , Phil KNAG , Ram KRISHNAMURTHY , Sasikanth MANIPATRUNI , Amrita MATHURIYA , Abhishek SHARMA , Ian A. YOUNG
Abstract: A memory device that includes a plurality subarrays of memory cells to store static weights and a plurality of digital full-adder circuits between subarrays of memory cells is provided. The digital full-adder circuit in the memory device eliminates the need to move data from a memory device to a processor to perform machine learning calculations. Rows of full-adder circuits are distributed between sub-arrays of memory cells to increase the effective memory bandwidth and reduce the time to perform matrix-vector multiplications in the memory device by performing bit-serial dot-product primitives in the form of accumulating m 1-bit x n-bit multiplications.
-
-
-
-
-
-
-
-
-