-
Publication No.: US10831446B2
Publication Date: 2020-11-10
Application No.: US16145569
Filing Date: 2018-09-28
Applicant: Intel Corporation
Inventor: Gregory K. Chen, Raghavan Kumar, Huseyin Ekin Sumbul, Phil Knag, Ram Krishnamurthy, Sasikanth Manipatruni, Amrita Mathuriya, Abhishek Sharma, Ian A. Young
Abstract: A memory device that includes a plurality of subarrays of memory cells to store static weights and a plurality of digital full-adder circuits between the subarrays of memory cells is provided. The digital full-adder circuit in the memory device eliminates the need to move data from a memory device to a processor to perform machine learning calculations. Rows of full-adder circuits are distributed between subarrays of memory cells to increase the effective memory bandwidth and reduce the time to perform matrix-vector multiplications in the memory device by performing bit-serial dot-product primitives in the form of accumulating m 1-bit × n-bit multiplications.
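For illustration only, a minimal Python sketch of the bit-serial dot-product primitive the abstract describes, assuming unsigned multi-bit activations processed one bit plane at a time against n-bit weights; the function name and bit widths are hypothetical, not taken from the patent.

```python
def bit_serial_dot_product(activations, weights, act_bits=8):
    """Dot product built from m 1-bit x n-bit multiply-accumulates.

    Each pass takes one bit plane of the activations; a 1-bit x n-bit
    product is just a conditional add of the weight, and the partial
    sum is shifted by the bit position before accumulation.
    """
    assert len(activations) == len(weights)
    total = 0
    for bit in range(act_bits):                 # one pass per activation bit plane
        partial = 0
        for a, w in zip(activations, weights):
            if (a >> bit) & 1:                  # 1-bit x n-bit multiply = masked add
                partial += w
        total += partial << bit                 # weight the partial sum by bit position
    return total


# Matches the conventional dot product:
acts, wts = [3, 5, 7], [2, -1, 4]
assert bit_serial_dot_product(acts, wts) == sum(a * w for a, w in zip(acts, wts))
```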
-
Publication No.: US20190043560A1
Publication Date: 2019-02-07
Application No.: US16146473
Filing Date: 2018-09-28
Applicant: Intel Corporation
Inventor: Huseyin Ekin Sumbul, Gregory K. Chen, Raghavan Kumar, Phil Knag, Abhishek Sharma, Sasikanth Manipatruni, Amrita Mathuriya, Ram Krishnamurthy, Ian A. Young
IPC: G11C11/418, G06F7/544, G06F9/30, G11C13/00, G11C11/419
Abstract: A memory circuit has compute-in-memory circuitry that enables a multiply-accumulate (MAC) operation based on shared charge. Row access circuitry drives multiple rows of a memory array to multiply a first data word with a second data word stored in the memory array. The row access circuitry drives the multiple rows based on the bit pattern of the first data word. Column access circuitry drives a column of the memory array when the rows are driven. Accessed rows discharge the column line in an accumulative fashion. Sensing circuitry can sense voltage on the column line. A processor in the memory circuit computes a MAC value based on the voltage sensed on the column.
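A behavioral sketch of the charge-shared MAC idea, assuming each driven row whose cell stores a 1 discharges the precharged column line by a fixed step; the per-cell step voltage and names are illustrative assumptions, not values from the patent.

```python
def charge_sharing_mac(input_word_bits, stored_column_bits, v_precharge=1.0, v_step=0.05):
    """Model of a charge-domain MAC on one column line.

    input_word_bits:    bits of the first data word; each bit decides whether
                        its row (word line) is driven during the access.
    stored_column_bits: bits of the second data word stored down the column.
    Each accessed row storing a 1 discharges the precharged column by one
    step, so the total voltage drop encodes the dot product of the bit vectors.
    """
    active_discharges = sum(a & w for a, w in zip(input_word_bits, stored_column_bits))
    column_voltage = v_precharge - v_step * active_discharges
    # Sensing circuitry digitizes the column voltage back into a MAC value.
    mac_value = round((v_precharge - column_voltage) / v_step)
    return column_voltage, mac_value


v, mac = charge_sharing_mac([1, 0, 1, 1], [1, 1, 0, 1])
assert mac == 2   # two rows are both driven and store a 1
```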
-
Publication No.: US10148426B2
Publication Date: 2018-12-04
Application No.: US14881121
Filing Date: 2015-10-12
Applicant: Intel Corporation
Inventor: Michael E. Kounavis, Shay Gueron, Ram Krishnamurthy, Sanu K. Mathew
Abstract: Implementations of Advanced Encryption Standard (AES) encryption and decryption processes are disclosed. In one embodiment of S-box processing, a block of 16 byte values is converted, each byte value being converted from a polynomial representation in GF(256) to a polynomial representation in GF((2^2)^4). Multiplicative inverse polynomial representations in GF((2^2)^4) are computed for each of the corresponding polynomial representations in GF((2^2)^4). Finally, corresponding multiplicative inverse polynomial representations in GF((2^2)^4) are converted and an affine transformation is applied to generate corresponding polynomial representations in GF(256). In an alternative embodiment of S-box processing, powers of the polynomial representations are computed and multiplied together in GF(256) to generate multiplicative inverse polynomial representations in GF(256). In an embodiment of inverse-columns-mixing, the 16 byte values are converted from a polynomial representation in GF(256) to a polynomial representation in GF((2^4)^2). A four-by-four matrix is applied to the transformed polynomial representation in GF((2^4)^2) to implement the inverse-columns-mixing.
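The alternative S-box embodiment (multiplicative inverse obtained by multiplying powers together in GF(256)) can be sketched directly; the composite-field GF((2^2)^4) mapping of the primary embodiment is not reproduced here. A self-contained Python sketch using the standard AES field polynomial and affine transformation:

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B
        b >>= 1
    return result


def gf_inverse(a):
    """Multiplicative inverse as a^254, built from squares and multiplies in GF(256)."""
    if a == 0:
        return 0                      # AES maps 0 to 0 before the affine step
    result, power, exponent = 1, a, 254
    while exponent:
        if exponent & 1:
            result = gf_mul(result, power)
        power = gf_mul(power, power)
        exponent >>= 1
    return result


def sbox(byte):
    """S-box = multiplicative inverse followed by the AES affine transformation."""
    inv = gf_inverse(byte)
    out = 0
    for i in range(8):
        bit = ((inv >> i) ^ (inv >> ((i + 4) % 8)) ^ (inv >> ((i + 5) % 8)) ^
               (inv >> ((i + 6) % 8)) ^ (inv >> ((i + 7) % 8)) ^ (0x63 >> i)) & 1
        out |= bit << i
    return out


assert sbox(0x00) == 0x63 and sbox(0x53) == 0xED   # FIPS-197 reference values
```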
-
Publication No.: US10050778B2
Publication Date: 2018-08-14
Application No.: US14569428
Filing Date: 2014-12-12
Applicant: Intel Corporation
Inventor: Michael E. Kounavis, Shay Gueron, Ram Krishnamurthy, Sanu K. Mathew
Abstract: Implementations of Advanced Encryption Standard (AES) encryption and decryption processes are disclosed. In one embodiment of S-box processing, a block of 16 byte values is converted, each byte value being converted from a polynomial representation in GF(256) to a polynomial representation in GF((2^2)^4). Multiplicative inverse polynomial representations in GF((2^2)^4) are computed for each of the corresponding polynomial representations in GF((2^2)^4). Finally, corresponding multiplicative inverse polynomial representations in GF((2^2)^4) are converted and an affine transformation is applied to generate corresponding polynomial representations in GF(256). In an alternative embodiment of S-box processing, powers of the polynomial representations are computed and multiplied together in GF(256) to generate multiplicative inverse polynomial representations in GF(256). In an embodiment of inverse-columns-mixing, the 16 byte values are converted from a polynomial representation in GF(256) to a polynomial representation in GF((2^4)^2). A four-by-four matrix is applied to the transformed polynomial representation in GF((2^4)^2) to implement the inverse-columns-mixing.
-
Publication No.: US20160204938A1
Publication Date: 2016-07-14
Application No.: US14881121
Filing Date: 2015-10-12
Applicant: Intel Corporation
Inventor: Michael E. Kounavis, Shay Gueron, Ram Krishnamurthy, Sanu K. Mathew
CPC classification number: H04L9/0631, G06F7/00, G06F9/30007, G06F9/30112, G06F9/30145, G06F9/30149, G06F9/30196, G06F9/3887, G06F21/602, H04L2209/34
Abstract: Implementations of Advanced Encryption Standard (AES) encryption and decryption processes are disclosed. In one embodiment of S-box processing, a block of 16 byte values is converted, each byte value being converted from a polynomial representation in GF(256) to a polynomial representation in GF((2^2)^4). Multiplicative inverse polynomial representations in GF((2^2)^4) are computed for each of the corresponding polynomial representations in GF((2^2)^4). Finally, corresponding multiplicative inverse polynomial representations in GF((2^2)^4) are converted and an affine transformation is applied to generate corresponding polynomial representations in GF(256). In an alternative embodiment of S-box processing, powers of the polynomial representations are computed and multiplied together in GF(256) to generate multiplicative inverse polynomial representations in GF(256). In an embodiment of inverse-columns-mixing, the 16 byte values are converted from a polynomial representation in GF(256) to a polynomial representation in GF((2^4)^2). A four-by-four matrix is applied to the transformed polynomial representation in GF((2^4)^2) to implement the inverse-columns-mixing.
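For the inverse-columns-mixing embodiment, a sketch that applies the standard 4x4 InvMixColumns matrix to one column over GF(256); the patent performs this step after mapping into GF((2^4)^2), which this sketch omits. The field multiply is repeated so the block stays self-contained.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B
        b >>= 1
    return result


MIX = [[0x02, 0x03, 0x01, 0x01],          # forward MixColumns matrix
       [0x01, 0x02, 0x03, 0x01],
       [0x01, 0x01, 0x02, 0x03],
       [0x03, 0x01, 0x01, 0x02]]

INV_MIX = [[0x0E, 0x0B, 0x0D, 0x09],      # inverse-columns-mixing matrix
           [0x09, 0x0E, 0x0B, 0x0D],
           [0x0D, 0x09, 0x0E, 0x0B],
           [0x0B, 0x0D, 0x09, 0x0E]]


def apply_matrix(matrix, column):
    """Multiply a 4x4 byte matrix by a 4-byte column over GF(2^8)."""
    return [gf_mul(row[0], column[0]) ^ gf_mul(row[1], column[1]) ^
            gf_mul(row[2], column[2]) ^ gf_mul(row[3], column[3])
            for row in matrix]


# Round trip: inverse mixing undoes forward mixing on a column of the state.
column = [0xDB, 0x13, 0x53, 0x45]
assert apply_matrix(INV_MIX, apply_matrix(MIX, column)) == column
```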
-
Publication No.: US12223615B2
Publication Date: 2025-02-11
Application No.: US16917791
Filing Date: 2020-06-30
Applicant: Intel Corporation
Inventor: Vivek De, Ram Krishnamurthy, Amit Agarwal, Steven Hsu, Monodeep Kar
IPC: G06T3/4007, G06T7/70, G06T15/06, G06T17/20
Abstract: A method comprising: dividing a 3D space into a voxel grid comprising a plurality of voxels; associating a plurality of distance values with the plurality of voxels, each distance value based on a distance to a boundary of an object; selecting an approximate interpolation mode for stepping a ray through a first one or more voxels of the 3D space responsive to the first one or more voxels having distance values greater than a first threshold; detecting the ray reaching a second one or more voxels having distance values less than the first threshold; and responsively selecting a precise interpolation mode for stepping the ray through the second one or more voxels.
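A hypothetical Python sketch of the two-mode ray stepping, assuming a dense signed-distance voxel grid, a nearest-voxel lookup as the approximate mode, and trilinear interpolation as the precise mode; the names, threshold value, and mode choices are illustrative assumptions rather than the patented method.

```python
import numpy as np


def trilinear(grid, p):
    """Precise mode: trilinearly interpolate the distance field at point p."""
    i = np.minimum(np.floor(p).astype(int), np.array(grid.shape) - 2)
    f = p - i
    d = grid[i[0]:i[0] + 2, i[1]:i[1] + 2, i[2]:i[2] + 2]
    c00 = d[0, 0, 0] * (1 - f[0]) + d[1, 0, 0] * f[0]
    c01 = d[0, 0, 1] * (1 - f[0]) + d[1, 0, 1] * f[0]
    c10 = d[0, 1, 0] * (1 - f[0]) + d[1, 1, 0] * f[0]
    c11 = d[0, 1, 1] * (1 - f[0]) + d[1, 1, 1] * f[0]
    c0 = c00 * (1 - f[1]) + c10 * f[1]
    c1 = c01 * (1 - f[1]) + c11 * f[1]
    return c0 * (1 - f[2]) + c1 * f[2]


def march_ray(grid, origin, direction, threshold=2.0, surface_eps=1e-3, max_steps=256):
    """Step a ray through a signed-distance voxel grid, switching interpolation modes."""
    p = np.asarray(origin, dtype=float)
    direction = np.asarray(direction, dtype=float)
    direction /= np.linalg.norm(direction)
    for _ in range(max_steps):
        idx = tuple(np.clip(np.round(p).astype(int), 0, np.array(grid.shape) - 1))
        coarse = grid[idx]
        if coarse > threshold:
            dist = coarse               # approximate mode: cheap nearest-voxel lookup
        else:
            dist = trilinear(grid, p)   # precise mode: interpolate near the surface
        if dist < surface_eps:
            return p                    # hit
        p = p + direction * max(dist, surface_eps)
        if np.any(p < 0) or np.any(p >= np.array(grid.shape) - 1):
            return None                 # left the volume without hitting the object
    return None


# Tiny demo: distance field of a sphere of radius 6 centered in a 32^3 grid.
axis = np.arange(32)
x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
sdf = np.sqrt((x - 16) ** 2 + (y - 16) ** 2 + (z - 16) ** 2) - 6.0
hit = march_ray(sdf, origin=[1.0, 16.0, 16.0], direction=[1.0, 0.0, 0.0])
```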
-
Publication No.: US11751404B2
Publication Date: 2023-09-05
Application No.: US16141025
Filing Date: 2018-09-25
Applicant: Intel Corporation
Inventor: Abhishek Sharma, Gregory Chen, Phil Knag, Ram Krishnamurthy, Raghavan Kumar, Sasikanth Manipatruni, Amrita Mathuriya, Huseyin Sumbul, Ian A. Young
CPC classification number: H10B63/30, H01L29/66795, H01L29/785, H10N70/021, H10N70/826, H10N70/882, H10N70/8833
Abstract: Embodiments herein describe techniques for a semiconductor device including an RRAM memory cell. The RRAM memory cell includes a FinFET transistor and an RRAM storage cell. The FinFET transistor includes a fin structure on a substrate, where the fin structure includes a channel region, a source region, and a drain region. An epitaxial layer is formed around the source region or the drain region. An RRAM storage stack is wrapped around a surface of the epitaxial layer. The RRAM storage stack includes a resistive switching material layer in contact with and wrapped around the surface of the epitaxial layer, and a contact electrode in contact with and wrapped around a surface of the resistive switching material layer. The epitaxial layer, the resistive switching material layer, and the contact electrode form an RRAM storage cell. Other embodiments may be described and/or claimed.
-
Publication No.: US11699681B2
Publication Date: 2023-07-11
Application No.: US16727779
Filing Date: 2019-12-26
Applicant: Intel Corporation
Inventor: Abhishek Sharma, Hui Jae Yoo, Van H. Le, Huseyin Ekin Sumbul, Phil Knag, Gregory K. Chen, Ram Krishnamurthy
IPC: H01L25/065, G11C11/407
CPC classification number: H01L25/0657, G11C11/407, H01L2224/32145, H01L2224/32225
Abstract: An apparatus is formed. The apparatus includes a stack of semiconductor chips. The stack of semiconductor chips includes a logic chip and a memory stack, wherein the logic chip includes at least one of a GPU and a CPU. The apparatus also includes a semiconductor chip substrate. The stack of semiconductor chips is mounted on the semiconductor chip substrate. At least one other logic chip is mounted on the semiconductor chip substrate. The semiconductor chip substrate includes wiring to interconnect the stack of semiconductor chips to the at least one other logic chip.
-
Publication No.: US11625584B2
Publication Date: 2023-04-11
Application No.: US16443548
Filing Date: 2019-06-17
Applicant: Intel Corporation
Inventor: Raghavan Kumar, Gregory K. Chen, Huseyin Ekin Sumbul, Phil Knag, Ram Krishnamurthy
Abstract: Examples described herein relate to a neural network whose matrix weights are selected from a set of weights stored in a memory that is on-chip with a processing engine for generating multiply and carry operations. The number of weights in the set stored in the memory can be less than the number of weights in the matrix, thereby reducing the amount of memory used to store the weights of the matrix. The weights in the memory can be generated during training using gradients from backpropagation. Weights in the memory can be selected using a tabulation hash calculation on entries in a table.
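A sketch of weight selection by tabulation hashing, assuming a (row, column) position is packed into a key, hashed through random lookup tables, and mapped onto a small shared-weight set; the key packing, table sizes, and names are assumptions for illustration, not details from the patent.

```python
import random


def make_tabulation_tables(num_chunks=4, chunk_bits=8, hash_bits=16, seed=0):
    """Random lookup tables for tabulation hashing (assumed sizes)."""
    rng = random.Random(seed)
    return [[rng.getrandbits(hash_bits) for _ in range(1 << chunk_bits)]
            for _ in range(num_chunks)]


def tabulation_hash(key, tables, chunk_bits=8):
    """Split the key into chunks, look each up, and XOR the results together."""
    h = 0
    for i, table in enumerate(tables):
        chunk = (key >> (i * chunk_bits)) & ((1 << chunk_bits) - 1)
        h ^= table[chunk]
    return h


def select_weight(row, col, shared_weights, tables):
    """Map a (row, col) matrix position onto one of the few stored shared weights."""
    key = (row << 16) | col                     # pack the position into a 32-bit key
    return shared_weights[tabulation_hash(key, tables) % len(shared_weights)]


tables = make_tabulation_tables()
shared = [0.5, -0.25, 0.75, -1.0]               # far fewer weights than matrix entries
w = select_weight(3, 17, shared, tables)
```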
-
Publication No.: US11347477B2
Publication Date: 2022-05-31
Application No.: US16586648
Filing Date: 2019-09-27
Applicant: Intel Corporation
Inventor: Huseyin Ekin Sumbul, Gregory K. Chen, Phil Knag, Raghavan Kumar, Ram Krishnamurthy
Abstract: A memory circuit includes a number (X) of multiply-accumulate (MAC) circuits that are dynamically configurable. The MAC circuits can either compute an output based on computations of X elements of the input vector with the weight vector, or compute the output based on computations of a single element of the input vector with the weight vector, with each element having a one-bit or multi-bit length. A first memory can hold the input vector having a width of X elements, and a second memory can store the weight vector. The MAC circuits include a MAC array on-chip with the first memory.
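A behavioral sketch of the two configurable modes described above, with hypothetical mode names; it models only the arithmetic, not the circuit-level reconfiguration.

```python
def configurable_mac(input_vector, weight_vector, mode="vector"):
    """Model of one dynamically configurable MAC output.

    mode="vector": accumulate all X input elements against the weight vector
                   (one dot product across the whole input word).
    mode="scalar": use a single input element against the weight vector
                   (one activation broadcast across the weights).
    Element widths (one-bit or multi-bit) only change the operand ranges here.
    """
    if mode == "vector":
        assert len(input_vector) == len(weight_vector)
        return sum(a * w for a, w in zip(input_vector, weight_vector))
    if mode == "scalar":
        return [input_vector[0] * w for w in weight_vector]
    raise ValueError("mode must be 'vector' or 'scalar'")


print(configurable_mac([1, 0, 1, 1], [3, -2, 5, 7], mode="vector"))   # 15
print(configurable_mac([2], [3, -2, 5, 7], mode="scalar"))            # [6, -4, 10, 14]
```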