MULTI-PRECISION DIGITAL COMPUTE-IN-MEMORY DEEP NEURAL NETWORK ENGINE FOR FLEXIBLE AND ENERGY EFFICIENT INFERENCING

    Publication Number: US20210397974A1

    Publication Date: 2021-12-23

    Application Number: US16941178

    Filing Date: 2020-07-28

    Abstract: A non-volatile memory structure capable of storing weights for layers of a deep neural network (DNN) and performing an inferencing operation within the structure is presented. An in-array multiplication can be performed between multi-bit valued inputs, or activations, for a layer of the DNN and multi-bit valued weights of the layer. Each bit of a weight value is stored in a binary valued memory cell of the memory array, and each bit of the input is applied as a binary input to a word line of the array for the multiplication of the input with the weight. To perform a multiply and accumulate operation, the results of the multiplications are accumulated by adders connected to sense amplifiers along the bit lines of the array. The adders can be configured for multiple levels of precision, so that the same structure can accommodate weights and activations of 8-bit, 4-bit, and 2-bit precision.
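    The abstract above reduces a multi-bit multiply-accumulate to binary operations. As a minimal sketch (the names and the exact decomposition are illustrative, not taken from the patent), the dot product of n-bit unsigned inputs and weights can be rebuilt from binary bit-plane products summed along bit lines, with adders applying the bit-position shifts; the precision modes (8-, 4-, or 2-bit) are simply the number of bit planes:

        import numpy as np

        def bit_planes(x, n_bits):
            # Decompose unsigned integers into n_bits binary planes, LSB first.
            return [(x >> k) & 1 for k in range(n_bits)]

        def in_array_mac(inputs, weights, n_bits):
            # Dot product built only from binary products (the in-array cell
            # reads) plus shift-and-add (the configurable adder tree).
            acc = 0
            for i, w_plane in enumerate(bit_planes(weights, n_bits)):
                for j, x_plane in enumerate(bit_planes(inputs, n_bits)):
                    partial = int(np.sum(x_plane & w_plane))  # per-bit-line sum
                    acc += partial << (i + j)                 # bit-position weight
            return acc

        x = np.array([3, 5, 7], dtype=np.uint8)
        w = np.array([2, 4, 6], dtype=np.uint8)
        assert in_array_mac(x, w, 8) == int(np.dot(x.astype(int), w.astype(int)))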

    Dynamic resource management in circuit bound array architecture

    Publication Number: US11081474B1

    Publication Date: 2021-08-03

    Application Number: US16862472

    Filing Date: 2020-04-29

    Abstract: Systems and methods for dynamically assigning memory array die to CMOS die of a plurality of stacked die during memory operations are described. The plurality of stacked die may be vertically stacked and connected together via one or more vertical through-silicon via (TSV) connections. The memory array die may only comprise memory cell structures (e.g., vertical NAND strings) without column decoders, row decoders, charge pumps, sense amplifiers, control circuitry, page registers, or state machines. The CMOS die may contain support circuitry necessary for performing the memory operations, such as read and write memory operations. The one or more vertical TSV connections may allow each memory array die of the plurality of stacked die to communicate with or be electrically connected to one or more CMOS die of the plurality of stacked die.
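    As a toy behavioral model (entirely an assumption about how such assignment could look, not the patent's mechanism), the dynamic pairing amounts to routing each memory operation on an array die to whichever CMOS die in the stack currently has idle support circuitry:

        from dataclasses import dataclass

        @dataclass
        class CmosDie:
            die_id: int
            busy: bool = False  # decoders, sense amps, state machine in use

        def assign_cmos(array_die_id: int, cmos_dies: list) -> int:
            # Route an operation on the given array die, reachable over the
            # shared TSVs, to the first idle CMOS die in the stack.
            for cmos in cmos_dies:
                if not cmos.busy:
                    cmos.busy = True
                    return cmos.die_id
            raise RuntimeError("all CMOS die busy; operation must queue")

        stack = [CmosDie(0), CmosDie(1)]
        print(assign_cmos(array_die_id=3, cmos_dies=stack))  # -> 0
        print(assign_cmos(array_die_id=5, cmos_dies=stack))  # -> 1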

    Realization of neural networks with ternary inputs and binary weights in NAND memory arrays

    Publication Number: US11170290B2

    Publication Date: 2021-11-09

    Application Number: US16368441

    Filing Date: 2019-03-28

    Abstract: Use of a NAND array architecture to realize a binary neural network (BNN) allows for matrix multiplication and accumulation to be performed within the memory array. A unit synapse for storing a weight of a BNN is stored in a pair of series connected memory cells. A binary input is applied as a pattern of voltage values on a pair of word lines connected to the unit synapse to perform the multiplication of the input with the weight by determining whether or not the unit synapse conducts. The results of such multiplications are determined by a sense amplifier, with the results accumulated by a counter. The arrangement can be extended to ternary inputs to realize a ternary-binary network (TBN) by adding a circuit to detect 0 input values and adjust the accumulated count accordingly.
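    Behaviorally, the unit synapse acts as an XNOR: the series pair conducts exactly when the input pattern matches the stored weight. The sketch below (the zero-input encoding is an assumption; the abstract only states that zero inputs are detected and the count adjusted) shows how a counter over sense-amp results recovers the signed ternary-binary dot product:

        def synapse_conducts(x, w):
            # Sense-amp view: conduction iff input matches weight (XNOR).
            # Assumed encoding: a zero input drives both word lines to the
            # pass voltage, so the string conducts regardless of the weight.
            return x == 0 or x == w

        def tbn_dot(xs, ws):
            # xs: ternary inputs in {-1, 0, +1}; ws: binary weights in {-1, +1}.
            count = sum(synapse_conducts(x, w) for x, w in zip(xs, ws))
            zeros = sum(x == 0 for x in xs)   # zero-input detection circuit
            count -= zeros                    # back out the forced conductions
            n = len(xs) - zeros
            return 2 * count - n              # matches minus mismatches

        xs = [+1, -1, 0, +1]
        ws = [+1, +1, -1, +1]
        assert tbn_dot(xs, ws) == sum(x * w for x, w in zip(xs, ws))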

    VERTICAL MAPPING AND COMPUTING FOR DEEP NEURAL NETWORKS IN NON-VOLATILE MEMORY

    Publication Number: US20210342676A1

    Publication Date: 2021-11-04

    Application Number: US16899734

    Filing Date: 2020-06-12

    Abstract: A non-volatile memory structure capable of storing layers of a deep neural network (DNN) and performing an inferencing operation within the structure is presented. A stack of bonded die pairs is connected by through-silicon vias. Each bonded die pair includes a memory die, having one or more memory arrays onto which layers of the neural network are mapped, and a peripheral circuitry die, including the control circuits for performing the convolution or multiplication for the bonded die pair. The multiplications can either be done in-array on the memory die or in-logic on the peripheral circuitry die. The arrays can be formed into columns along the vias, allowing an inferencing operation to be performed by propagating an input up and down the columns, with the output of one layer serving as the input of the subsequent layer.
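    The data flow the abstract describes is a pipeline over the die stack: each bonded die pair holds one layer's weights and hands its output along the TSV column as the next layer's input. A minimal functional sketch, with the layer shapes and ReLU activation chosen purely for illustration:

        import numpy as np

        def propagate_column(x, layer_weights):
            # One matrix per bonded die pair; the multiply may happen
            # in-array on the memory die or in-logic on the CMOS die.
            for W in layer_weights:
                x = np.maximum(W @ x, 0.0)  # this level's output feeds the next
            return x

        rng = np.random.default_rng(0)
        layers = [rng.standard_normal((4, 8)), rng.standard_normal((2, 4))]
        print(propagate_column(rng.standard_normal(8), layers))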

    REALIZATION OF NEURAL NETWORKS WITH TERNARY INPUTS AND BINARY WEIGHTS IN NAND MEMORY ARRAYS

    Publication Number: US20200311523A1

    Publication Date: 2020-10-01

    Application Number: US16368441

    Filing Date: 2019-03-28

    Abstract: Use of a NAND array architecture to realize a binary neural network (BNN) allows for matrix multiplication and accumulation to be performed within the memory array. A unit synapse for storing a weight of a BNN is stored in a pair of series connected memory cells. A binary input is applied as a pattern of voltage values on a pair of word lines connected to the unit synapse to perform the multiplication of the input with the weight by determining whether or not the unit synapse conducts. The results of such multiplications are determined by a sense amplifier, with the results accumulated by a counter. The arrangement can be extended to ternary inputs to realize a ternary-binary network (TBN) by adding a circuit to detect 0 input values and adjust the accumulated count accordingly.

    COMPUTATION OF DISCRETE FOURIER TRANSFORMATION (DFT) USING NON-VOLATILE MEMORY ARRAYS

    Publication Number: US20230126357A1

    Publication Date: 2023-04-27

    Application Number: US17507185

    Filing Date: 2021-10-21

    Abstract: A non-volatile memory device is configured for in-memory computation of discrete Fourier transformations and their inverses. The real and imaginary components of the twiddle factors are stored as conductance values of memory cells in non-volatile memory arrays having a cross-point structure. The real and imaginary components of inputs are encoded as word line voltages applied to the arrays. Positive and negative valued components of the twiddle factors are stored separately, and positive and negative components of the inputs are separately applied to the arrays. Real and imaginary parts of the outputs for the discrete Fourier transformation are determined from combinations of the output currents from the arrays.
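    Because conductances and array currents are non-negative, each signed matrix and input vector must be split into positive and negative parts and the four partial results recombined, and the complex DFT splits further into real and imaginary array operations. A worked numeric sketch of that decomposition (the function names are illustrative):

        import numpy as np

        def split(m):
            # Non-negative positive/negative parts: m == pos - neg.
            return np.maximum(m, 0), np.maximum(-m, 0)

        def crossbar_matvec(m, v):
            # M @ v using only non-negative arrays and inputs (four reads).
            mp, mn = split(m)
            vp, vn = split(v)
            return (mp @ vp - mp @ vn) - (mn @ vp - mn @ vn)

        def dft_in_memory(x):
            n = len(x)
            k = np.arange(n)
            w = np.exp(-2j * np.pi * np.outer(k, k) / n)  # twiddle factors
            re_y = crossbar_matvec(w.real, x.real) - crossbar_matvec(w.imag, x.imag)
            im_y = crossbar_matvec(w.real, x.imag) + crossbar_matvec(w.imag, x.real)
            return re_y + 1j * im_y

        x = np.array([1.0, 2.0, 3.0, 4.0], dtype=complex)
        assert np.allclose(dft_in_memory(x), np.fft.fft(x))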

    RECURRENT NEURAL NETWORK INFERENCE ENGINE WITH GATED RECURRENT UNIT CELL AND NON-VOLATILE MEMORY ARRAYS

    Publication Number: US20210397931A1

    Publication Date: 2021-12-23

    Application Number: US16908861

    Filing Date: 2020-06-23

    Abstract: A non-volatile memory device includes arrays of non-volatile memory cells that are configured to store the weights for a recurrent neural network (RNN) inference engine with a gated recurrent unit (GRU) cell. A set of three non-volatile memory arrays, such as ones formed of storage class memory, stores a corresponding three sets of weights and is used to perform compute-in-memory inferencing. The hidden state of a previous iteration and an external input are applied to the weights of the first and the second of the arrays, with the output of the first array used to generate an input to the third array, which also receives the external input. The hidden state of the current iteration is generated from the outputs of the second and third arrays.
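    Reading the abstract against the standard GRU equations suggests the mapping below: array 1 produces the reset gate that gates the previous hidden state into array 3, array 2 produces the update gate, and the new hidden state combines the outputs of arrays 2 and 3. This is a behavioral sketch of that reading, not the patent's circuit:

        import numpy as np

        def sigmoid(v):
            return 1.0 / (1.0 + np.exp(-v))

        def gru_step(x, h_prev, W_r, W_z, W_h):
            hx = np.concatenate([h_prev, x])
            r = sigmoid(W_r @ hx)                    # array 1: reset gate
            z = sigmoid(W_z @ hx)                    # array 2: update gate
            h_cand = np.tanh(W_h @ np.concatenate([r * h_prev, x]))  # array 3
            return (1.0 - z) * h_prev + z * h_cand   # current hidden state

        rng = np.random.default_rng(1)
        d_h, d_x = 3, 2
        W_r, W_z, W_h = (rng.standard_normal((d_h, d_h + d_x)) for _ in range(3))
        print(gru_step(rng.standard_normal(d_x), np.zeros(d_h), W_r, W_z, W_h))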

    KERNEL TRANSFORMATION TECHNIQUES TO REDUCE POWER CONSUMPTION OF BINARY INPUT, BINARY WEIGHT IN-MEMORY CONVOLUTIONAL NEURAL NETWORK INFERENCE ENGINE

    Publication Number: US20210192325A1

    Publication Date: 2021-06-24

    Application Number: US16722580

    Filing Date: 2019-12-20

    Abstract: Techniques are presented for performing in-memory matrix multiplication operations for binary input, binary weight valued convolutional neural network (CNN) inferencing. The weights of a filter are stored in pairs of memory cells of a storage class memory device, such as a ReRAM or phase change memory based device. To reduce current consumption, the binary valued filters are transformed into ternary valued filters by taking sums and differences of binary valued filter pairs. The zero valued weights of the transformed filters are stored as pairs of high resistance state memory cells, reducing current consumption during convolution. The results of the in-memory multiplications are pair-wise combined to compensate for the filter transformations. To compensate for the zero valued weights, a zero weight register stores the number of zero weights along each bit line and is used to initialize the counter values that accumulate the multiplication results.
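    The transformation and its inverse are a simple sum/difference identity: for binary filters f1, f2 with entries in {-1, +1}, the ternary filters s = (f1 + f2) / 2 and d = (f1 - f2) / 2 take values in {-1, 0, +1}, and x·f1 = x·s + x·d while x·f2 = x·s - x·d. A worked check (flattened filters; variable names are illustrative):

        import numpy as np

        def transform_pair(f1, f2):
            # Binary (+/-1) filter pair -> ternary filters in {-1, 0, +1};
            # zero entries map to pairs of high-resistance cells.
            return (f1 + f2) // 2, (f1 - f2) // 2

        def combine_outputs(y_sum, y_diff):
            # Pairwise combine transformed results back to the originals.
            return y_sum + y_diff, y_sum - y_diff

        rng = np.random.default_rng(2)
        f1 = rng.choice([-1, 1], size=9)  # two binary 3x3 filters, flattened
        f2 = rng.choice([-1, 1], size=9)
        x = rng.choice([-1, 1], size=9)   # binary input patch

        s, d = transform_pair(f1, f2)
        y1, y2 = combine_outputs(x @ s, x @ d)
        assert y1 == x @ f1 and y2 == x @ f2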
