Abstract:
SRAM cells are connected in columns by bit lines and connected in rows by first and second word lines coupled to first and second data storage sides of the SRAM cells. First the first word lines are actuated in parallel and then next the second word lines are actuated in parallel in first and second phases, respectively, of an in-memory compute operation. Bit line voltages in the first and second phases are processed to generate an in-memory compute operation decision. A low supply node reference voltage for the SRAM cells is selectively modulated between a ground voltage and a negative voltage. The first data storage side receives the negative voltage and the second data storage side receives the ground voltage during the second phase. Conversely, the second data storage side receives the negative voltage and the first data storage side receives the ground voltage during the first phase.
Abstract:
An in-memory computation circuit includes a memory array including sub-arrays of with SRAM cells connected in rows by word lines and in columns by local bit lines. A row controller circuit selectively actuates one word line per sub-array for an in-memory compute operation. A global bit line is capacitively coupled to many local bit lines in either a column direction or row direction. An analog global output voltage on each global bit line is an average of local bit line voltages on the capacitively coupled local bit lines. The analog global output voltage is sampled and converted by an analog-to-digital converter (ADC) circuit to generate a digital decision signal output for the in-memory compute operation.
Abstract:
An in-memory compute (IMC) device includes a compute array having a first plurality of cells. The compute array is arranged as a plurality of rows of cells intersecting a plurality of columns of cells. Each cell of the first plurality of cells is identifiable by its corresponding row and column. The IMC device also includes a plurality of computation engines and a plurality of bias engines. Each computation engine is respectively formed in a different one of a second plurality of cells, wherein the second plurality of cells is formed from cells of the first plurality. Each computation engine is formed at a respective row and column intersection. Each bias engine of the plurality of bias engines is arranged to computationally combine an output from at least one of the plurality of computation engines with a respective bias value.
Abstract:
Adaptive scaling digital techniques attempt to place the system close to the timing failure so as to maximize energy efficiency. Rapid recovery from potential failures is usually by slowing the system clock and/or providing razor solutions (instruction replay.) These techniques compromise the throughput. This application presents a technique to provide local in-situ fault resilience based on dynamic slack borrowing. This technique is non-intrusive (needs no architecture modification) and has minimal impact on throughput.
Abstract:
A memory array includes sub-arrays with memory cells arranged in a row-column matrix where each row includes a word line and each sub-array column includes a local bit line. A control circuit supports a first operating mode where only one word line in the memory array is actuated during memory access and a second operating mode where one word line per sub-array is simultaneously actuated during an in-memory computation performed as a function of weight data stored in the memory and applied feature data. Computation circuitry coupling each memory cell to the local bit line for each column of the sub-array logically combines a bit of feature data for the in-memory computation with a bit of weight data to generate a logical output on the local bit line which is charge shared with the global bit line.
Abstract:
The memory array of a circuit includes sub-arrays with memory cells arranged in a row-column matrix where each row includes a word line and each sub-array column includes a local bit line. A control circuit supports two modes of circuit operation: a first mode where only one word line in the memory array is actuated during a memory read and a second mode where one word line per sub-array are simultaneously actuated during the memory read. An input/output circuit for each column includes inputs to the local bit lines of the sub-arrays, a column data output coupled to the bit line inputs, and a sub-array data output coupled to each bit line input. In memory computation operations are performed in the second mode as a function of feature data and weight data stored in the memory.
Abstract:
An in-memory computation circuit includes a memory array with SRAM cells connected in rows by word lines and in columns by bit lines. A row controller circuit simultaneously actuates word lines in parallel for an in-memory compute operation. A column processing circuit includes a clamping circuit that clamps a voltage on the bit line to a level exceeding an SRAM cell bit flip voltage during execution of the in-memory compute operation. The column processing circuit may further include a current mirroring circuit that mirrors the read current developed on each bit line in response to the simultaneous actuation to generate a decision output for the in-memory compute operation. The mirrored read current is integrated by an integration capacitor to generate an output voltage that is converted to a digital signal by an analog-to-digital converter circuit.
Abstract:
An in-memory computation circuit includes a memory array including sub-arrays of with SRAM cells connected in rows by word lines and in columns by bit lines. A row controller circuit selectively actuates word lines across the sub-arrays for an in-memory compute operation. A computation tile circuit for each sub-array includes a column compute circuit for each bit line. Each column compute circuit includes a switched timing circuit that is actuated in response to weight data on the bit line for a duration of time set by an in-memory compute operation enable signal. A current digital-to-analog converter powered by the switched timing circuit operates to generate a drain current having a magnitude controlled by bits of feature data for the in-memory compute operation. The drain current is integrated to generate an output voltage.
Abstract:
A memory management circuit stores information indicative of reliability-types of regions of a memory array. The memory management circuitry responds to a request to allocate memory in the memory array to a process by determining a request type associated with the request to allocate memory. Memory of the memory array is allocated to the process based on the request type associated with the request to allocate memory and the stored information indicative of reliability-types of regions of the memory array. The memory array may be a shared memory array. The memory array may be organized into rows and columns, and the regions of the memory array may be the rows of the memory array.
Abstract:
The memory array of a memory includes sub-arrays with memory cells arranged in a row-column matrix where each row includes a word line and each sub-array column includes a local bit line. A row decoder circuit supports two modes of memory circuit operation: a first mode where only one word line in the memory array is actuated during a memory read and a second mode where one word line per sub-array are simultaneously actuated during the memory read. An input/output circuit for each column includes inputs to the local bit lines of the sub-arrays, a column data output coupled to the bit line inputs, and a sub-array data output coupled to each bit line input. Both BIST and ATPG testing of the input/output circuit are supported. For BIST testing, multiple data paths between the bit line inputs and the column data output are selectively controlled to provide complete circuit testing.