MULTI-PRECISION DIGITAL COMPUTE-IN-MEMORY DEEP NEURAL NETWORK ENGINE FOR FLEXIBLE AND ENERGY EFFICIENT INFERENCING

    Publication No.: US20210397974A1

    Publication Date: 2021-12-23

    Application No.: US16941178

    Application Date: 2020-07-28

    Abstract: A non-volatile memory structure capable of storing weights for layers of a deep neural network (DNN) and of performing an inferencing operation within the structure is presented. An in-array multiplication can be performed between multi-bit valued inputs, or activations, for a layer of the DNN and multi-bit valued weights of the layer. Each bit of a weight value is stored in a binary valued memory cell of the memory array, and each bit of the input is applied as a binary input to a word line of the array for the multiplication of the input with the weight. To perform a multiply and accumulate operation, the results of the multiplications are accumulated by adders connected to sense amplifiers along the bit lines of the array. The adders can be configured to multiple levels of precision, so that the same structure can accommodate weights and activations of 8-bit, 4-bit, and 2-bit precision.
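
    A minimal sketch of the scheme this abstract describes, in Python/NumPy (the function name, loop structure, and test values are illustrative assumptions, not the patented circuit): each weight bit occupies a binary cell, each activation bit is applied serially to the word lines, a 1-bit multiply is the AND of the two bits, and shift-and-add accumulation combines the partial products, with the bit width standing in for the configurable 8/4/2-bit precision.

    import numpy as np

    def cim_mac(activations, weights, precision=8):
        # Each weight bit sits in a binary-valued cell; each activation bit
        # drives a word line. A 1-bit multiply is an AND of the two bits.
        acc = 0
        for a_bit in range(precision):        # activation bits, applied serially
            for w_bit in range(precision):    # weight bits, one per binary cell
                col_sum = np.sum(((activations >> a_bit) & 1) *
                                 ((weights >> w_bit) & 1))
                acc += int(col_sum) << (a_bit + w_bit)  # shift-and-add adders
        return acc

    acts = np.array([3, 1, 2])
    wts  = np.array([2, 5, 7])
    assert cim_mac(acts, wts, precision=4) == int(np.dot(acts, wts))  # 25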

    Crosspoint memory architecture for high bandwidth operation with small page buffer

    Publication No.: US11099784B2

    Publication Date: 2021-08-24

    Application No.: US16717468

    Application Date: 2019-12-17

    Abstract: Apparatuses and techniques are described for reading crosspoint arrays of memory cells with high bandwidth and a relatively small page buffer. Multiple crosspoint arrays (XPAs) are read in parallel, with one memory cell per XPA being read, in a bank of XPAs. To reduce the read time, a row can be selected for the XPAs, after which memory cells in different columns are read, one column at a time, while the same row is selected. This avoids the need to transmit commands and a row address for re-selecting the row in each successive read operation. The XPAs may be ungrouped, or one XPA may be accessible at a time in a group. In one option, the XPAs are arranged in sets, either individually or in groups, and one set is accessible at a time.
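
    A rough Python sketch of why holding the row selection saves time (the cycle costs below are invented for illustration; only the sequencing idea comes from the abstract): re-selecting the row for every read repeats the command and row-address transfer, while reading the columns of an already-selected row pays that cost once per row.

    ROW_SELECT_COST = 10   # assumed cycles: command plus row-address transfer
    COLUMN_READ_COST = 2   # assumed cycles per single-cell column read

    def naive_reads(rows, cols):
        # Re-select the row for every single-cell read.
        return rows * cols * (ROW_SELECT_COST + COLUMN_READ_COST)

    def row_reuse_reads(rows, cols):
        # Select each row once, then read its columns one at a time.
        return rows * (ROW_SELECT_COST + cols * COLUMN_READ_COST)

    print(naive_reads(4, 64), "vs", row_reuse_reads(4, 64))  # 3072 vs 552 cycles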

    IN-STORAGE LOGIC FOR HARDWARE ACCELERATORS

    Publication No.: US20210233592A1

    Publication Date: 2021-07-29

    Application No.: US16775639

    Application Date: 2020-01-29

    Abstract: Systems and methods for performing in-storage logic operations using one or more memory cell transistors and a programmable sense amplifier are described. The logic operations may comprise basic Boolean logic operations (e.g., OR and AND operations) or secondary Boolean logic operations (e.g., XOR and IMP operations). The one or more memory cell transistors may be used for storing user data during a first time period and then used for performing a logic operation during a second time period subsequent to the first time period. During the logic operation, a first memory cell transistor of the one or more memory cell transistors may be programmed with a threshold voltage that corresponds to a first input operand value, and then a gate voltage bias corresponding to a second input operand value may be applied to the first memory cell transistor.
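
    A toy Python model of the operand encoding the abstract outlines (the specific voltage values are assumptions chosen so the logic works out; the patent does not give them): one operand is programmed as the cell's threshold voltage, the other is applied as the gate bias, and the cell conducts only when the bias exceeds the threshold, which a programmable sense amplifier can resolve into AND or OR.

    def conducts(v_th, v_gate):
        # A transistor-like cell passes current when gate bias exceeds threshold.
        return v_gate > v_th

    def in_storage_and(a, b):
        v_th   = 1.0 if a else 3.0   # operand A programmed as the threshold voltage
        v_gate = 2.0 if b else 0.0   # operand B applied as the gate bias
        return conducts(v_th, v_gate)   # current flows only when a AND b

    def in_storage_or(a, b):
        v_th   = 1.0 if a else 3.0
        v_gate = 4.0 if b else 2.0   # higher biases: current flows when a OR b
        return conducts(v_th, v_gate)

    for a in (0, 1):
        for b in (0, 1):
            assert in_storage_and(a, b) == bool(a and b)
            assert in_storage_or(a, b) == bool(a or b)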

    METHODS TO TOLERATE PROGRAMMING AND RETENTION ERRORS OF CROSSBAR MEMORY ARRAYS

    Publication No.: US20210117500A1

    Publication Date: 2021-04-22

    Application No.: US16912717

    Application Date: 2020-06-26

    Abstract: Systems and methods for reducing the impact of defects within a crossbar memory array when performing multiplication operations in which multiple control lines are concurrently selected are described. A group of memory cells within the crossbar memory array may be driven by a local word line whose bias is set by a local word line gating unit, which may be configured to prevent the local word line from being biased to the selected word line voltage during an operation; the local word line may instead be set to a disabling voltage such that the memory cell currents through the group of memory cells are eliminated. If a defect has caused a short within one of the memory cells of the group, then the local word line gating unit may be programmed to hold the local word line at the disabling voltage during multiplication operations.
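
    In Python/NumPy terms, the gating unit acts like a row mask applied before accumulation; this sketch models that effect only (the conductance values and function are illustrative, not the circuit):

    import numpy as np

    def crossbar_mac(inputs, conductances, defective_rows):
        g = conductances.copy()
        g[list(defective_rows), :] = 0.0   # gated local word line: cell currents eliminated
        return inputs @ g                  # currents accumulated along the bit lines

    G = np.array([[0.2, 0.5],
                  [0.9, 0.1],              # suppose this group contains a shorted cell
                  [0.4, 0.3]])
    x = np.array([1.0, 1.0, 1.0])
    print(crossbar_mac(x, G, defective_rows={1}))  # row 1 contributes nothing: [0.6 0.8]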

    MULTI-RESISTANCE MRAM

    Publication No.: US20210083173A1

    Publication Date: 2021-03-18

    Application No.: US17109291

    Application Date: 2020-12-02

    Abstract: Apparatuses, systems, and methods are disclosed for magnetoresistive random access memory. A magnetic tunnel junction (MTJ) for storing data may include a reference layer. A free layer of an MTJ may be separated from a reference layer by a barrier layer. A free layer may be configured such that one or more resistance states for an MTJ correspond to one or more positions of a magnetic domain wall within the free layer. A domain stabilization layer may be coupled to a portion of a free layer, and may be configured to prevent migration of a domain wall into the portion of the free layer.
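
    A small Python illustration of how domain wall position can map to resistance states. It uses a common parallel-resistance approximation that is an assumption here, not stated in the abstract: the portion of the free layer parallel to the reference layer and the antiparallel portion conduct in parallel, so moving the wall shifts the MTJ resistance between the two extremes.

    R_P, R_AP = 2000.0, 4000.0   # illustrative parallel/antiparallel resistances

    def mtj_resistance(wall_position):
        # wall_position in [0, 1]: fraction of the free layer in the parallel state.
        g = wall_position / R_P + (1.0 - wall_position) / R_AP
        return 1.0 / g

    for pos in (0.0, 0.25, 0.5, 0.75, 1.0):   # discrete wall positions -> resistance states
        print(f"wall at {pos:.2f}: {mtj_resistance(pos):.0f} ohms")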

    Vector-matrix multiplication using non-volatile memory cells

    Publication No.: US10528643B1

    Publication Date: 2020-01-07

    Application No.: US16052420

    Application Date: 2018-08-01

    Abstract: Technology is described herein for performing multiplication using non-volatile memory cells. A multiplicand may be stored in a node that includes multiple non-volatile memory cells. Each memory cell in a node may be programmed to one of two physical states, with each non-volatile memory cell storing a different bit of the multiplicand. Multiplication may be performed by applying a multiply voltage to the node of memory cells and processing memory cell currents from the memory cells in the node. The memory cell current from each memory cell in the node is multiplied by a different power of two. The multiplied signals are summed to generate a “result signal,” which represents the product of the multiplier and the multiplicand stored in the node. If desired, “binary memory cells” may be used to perform multiplication. Vector/vector and vector/matrix multiplication may also be performed.
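
    A hedged Python sketch of the per-node arithmetic (function name and values are illustrative; cell currents are modeled as integers rather than analog quantities): each binary cell in the node holds one bit of the multiplicand, the multiply voltage produces a per-cell current proportional to the multiplier times the stored bit, and the currents are weighted by powers of two and summed.

    def node_multiply(multiplier, multiplicand, bits=4):
        # Program one bit of the multiplicand into each binary cell of the node.
        cell_bits = [(multiplicand >> i) & 1 for i in range(bits)]
        # Applying the multiply voltage yields multiplier * stored_bit per cell.
        currents = [multiplier * b for b in cell_bits]
        # Weight each cell current by a different power of two, then sum.
        return sum(i_cell << k for k, i_cell in enumerate(currents))

    assert node_multiply(5, 9) == 45   # the "result signal"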

    Accelerating sparse matrix multiplication in storage class memory-based convolutional neural network inference

    Publication No.: US11568200B2

    Publication Date: 2023-01-31

    Application No.: US16653346

    Application Date: 2019-10-15

    Abstract: Techniques are presented for accelerating in-memory matrix multiplication operations for convolutional neural network (CNN) inference in which the weights of a filter are stored in the memory of a storage class memory device, such as a ReRAM or phase change memory based device. To improve performance for inference operations when filters exhibit sparsity, a zero column index and a zero row index are introduced to account for columns and rows having all zero weight values. These indices can be saved in a register on the memory device, and when performing a column/row oriented matrix multiplication, if the zero row/column index indicates that the column/row contains all zero weights, the access of the corresponding bit/word line is skipped, as the result will be zero regardless of the input.
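
    A minimal Python/NumPy sketch of the skipping idea (the function is illustrative; only the zero-row indexing concept comes from the abstract): a precomputed zero-row index lets a row-oriented multiplication skip word lines whose weights are all zero, since their contribution is zero for any input.

    import numpy as np

    def sparse_row_mac(x, W):
        zero_row = ~W.any(axis=1)        # the "zero row index": all-zero rows
        y = np.zeros(W.shape[1])
        for r in range(W.shape[0]):
            if zero_row[r]:
                continue                 # skip the word-line access entirely
            y += x[r] * W[r]
        return y

    W = np.array([[0., 0., 0.],          # pruned filter row: never accessed
                  [1., 0., 2.],
                  [0., 3., 0.]])
    x = np.array([4., 5., 6.])
    print(sparse_row_mac(x, W))          # [ 5. 18. 10.]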

    COMPUTE-IN-MEMORY DEEP NEURAL NETWORK INFERENCE ENGINE USING LOW-RANK APPROXIMATION TECHNIQUE

    Publication No.: US20210406672A1

    Publication Date: 2021-12-30

    Application No.: US16912846

    Application Date: 2020-06-26

    Abstract: Non-volatile memory structures for performing compute-in-memory inferencing for neural networks are presented. To improve performance, both in terms of speed and energy consumption, weight matrices are replaced with their singular value decomposition (SVD) and low-rank approximations (LRAs) are used. The decomposition matrices can be stored in a single array, with the resultant LRA matrices requiring fewer weight values to be stored. The reduced sizes of the LRA matrices allow inferencing to be performed more quickly and with less power. In a high performance and energy efficiency mode, a reduced rank for the SVD matrices stored on a memory die is determined and used to increase performance and reduce the power needed for an inferencing operation.
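
    A short Python/NumPy sketch of the LRA substitution (matrix sizes and the rank are illustrative; the abstract does not specify them): the weight matrix W is replaced by its rank-r SVD factors, which hold fewer values and turn one large multiply into two smaller ones.

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.standard_normal((64, 16)) @ rng.standard_normal((16, 64))  # nearly low-rank weights
    U, s, Vt = np.linalg.svd(W, full_matrices=False)

    r = 16                                 # reduced rank kept on the memory die
    A = U[:, :r] * s[:r]                   # 64 x r factor, singular values folded in
    B = Vt[:r, :]                          # r x 64 factor
    x = rng.standard_normal(64)

    y_full = W @ x                         # multiply against the full weight matrix
    y_lra  = A @ (B @ x)                   # two smaller multiplies instead
    print("stored values:", W.size, "->", A.size + B.size)   # 4096 -> 2048
    print("relative error:", np.linalg.norm(y_full - y_lra) / np.linalg.norm(y_full))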
