Abstract:
Adaptive scaling digital techniques attempt to place the system close to the point of timing failure so as to maximize energy efficiency. Rapid recovery from potential failures is usually achieved by slowing the system clock and/or by Razor-style solutions (instruction replay). These techniques compromise throughput. This application presents a technique that provides local in-situ fault resilience based on dynamic slack borrowing. The technique is non-intrusive (it needs no architecture modification) and has minimal impact on throughput.
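A minimal behavioral sketch of the slack-borrowing idea is given below (an illustration, not the patented circuit): a stage that overruns its timing budget completes correctly by borrowing the unused slack of the following stage, so neither the clock is slowed nor an instruction replayed. The clock period, stage delays, and function names are assumptions.

# Behavioral sketch of dynamic slack borrowing (illustrative assumptions only).
CLOCK_PERIOD_PS = 1000  # assumed clock period

def completes_with_slack_borrowing(stage_delays_ps):
    """Return True if every pipeline stage finishes, possibly borrowing
    the unused slack of the immediately following stage."""
    carry_over = 0  # time borrowed from the current stage by its predecessor
    for i, delay in enumerate(stage_delays_ps):
        required = delay + carry_over
        if required <= CLOCK_PERIOD_PS:
            carry_over = 0                           # stage fits its cycle
        else:
            if i == len(stage_delays_ps) - 1:
                return False                         # no downstream slack left
            carry_over = required - CLOCK_PERIOD_PS  # borrow from the next stage
    return True

# A slow stage (1100 ps) is absorbed by the fast stage (700 ps) after it,
# preserving throughput without slowing the clock or replaying instructions.
print(completes_with_slack_borrowing([900, 1100, 700]))   # True
print(completes_with_slack_borrowing([900, 1100, 1000]))  # False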
Abstract:
A memory array includes a plurality of bit-cells arranged as a set of rows of bit-cells intersecting a plurality of columns. The memory array also includes a plurality of in-memory-compute (IMC) cells arranged as a set of rows of IMC cells intersecting the plurality of columns of the memory array. Each of the IMC cells of the memory array includes a first bit-cell having a latch, a write-bit line and a complementary write-bit line, and a second bit-cell having a latch, a write-bit line and a complementary write-bit line, wherein the write-bit line of the first bit-cell is coupled to the complementary write-bit line of the second bit-cell and the complementary write-bit line of the first bit-cell is coupled to the write-bit line of the second bit-cell.
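As a rough illustration of the cross-coupled wiring described above (class and signal names are hypothetical, and this is a functional sketch rather than the cell circuit), a single write through the shared write-bit lines leaves the two bit-cells of an IMC cell holding complementary values:

# Functional sketch: because WBL of the first bit-cell ties to WBLB of the
# second (and vice versa), one write stores a bit and its complement.
class ImcCellPair:
    def __init__(self):
        self.first_latch = None
        self.second_latch = None

    def write(self, wbl, wblb):
        self.first_latch = wbl     # first bit-cell driven by (WBL, WBLB)
        self.second_latch = wblb   # second bit-cell sees the lines swapped

pair = ImcCellPair()
pair.write(wbl=1, wblb=0)
print(pair.first_latch, pair.second_latch)  # 1 0 -> complementary storage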
Abstract:
SRAM cells are connected in columns by bit lines and connected in rows by first and second word lines coupled to first and second data storage sides of the SRAM cells. The first word lines are actuated in parallel and then the second word lines are actuated in parallel, in first and second phases, respectively, of an in-memory compute operation. Bit line voltages in the first and second phases are processed to generate an in-memory compute operation decision. A low supply node reference voltage for the SRAM cells is selectively modulated between a ground voltage and a negative voltage. During the first phase, the second data storage side receives the negative voltage and the first data storage side receives the ground voltage; conversely, during the second phase, the first data storage side receives the negative voltage and the second data storage side receives the ground voltage.
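The following sketch restates the phase/low-supply pairing in executable form (the voltage values and the thresholded decision are assumptions; only the pairing of phases, word lines, and low supply voltages follows the text above):

GND, VNEG = 0.0, -0.1  # assumed ground and negative low-supply voltages

def low_supply_voltages(phase):
    """Return (first_side_vss, second_side_vss) for the given phase."""
    if phase == 1:    # first word lines actuated in parallel
        return GND, VNEG
    if phase == 2:    # second word lines actuated in parallel
        return VNEG, GND
    raise ValueError("phase must be 1 or 2")

def compute_decision(v_bitline_phase1, v_bitline_phase2, threshold=0.5):
    # Placeholder processing of the two phase bit line voltages.
    return int((v_bitline_phase1 + v_bitline_phase2) / 2 > threshold)

print(low_supply_voltages(1))      # (0.0, -0.1)
print(low_supply_voltages(2))      # (-0.1, 0.0)
print(compute_decision(0.7, 0.6))  # 1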
Abstract:
An in-memory computation circuit includes a memory array including sub-arrays with SRAM cells connected in rows by word lines and in columns by local bit lines. A row controller circuit selectively actuates one word line per sub-array for an in-memory compute operation. A global bit line is capacitively coupled to multiple local bit lines in either the column direction or the row direction. An analog global output voltage on each global bit line is an average of the local bit line voltages on the capacitively coupled local bit lines. The analog global output voltage is sampled and converted by an analog-to-digital converter (ADC) circuit to generate a digital decision signal output for the in-memory compute operation.
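A simple numeric model of the averaging and conversion step follows (assuming equal coupling capacitors and an ideal uniform ADC; the reference voltage and resolution are illustrative):

def global_bitline_voltage(local_bl_voltages):
    # Equal coupling capacitors assumed, so the coupled result is the mean.
    return sum(local_bl_voltages) / len(local_bl_voltages)

def adc(voltage, vref=1.0, bits=4):
    code = round(voltage / vref * (2**bits - 1))
    return max(0, min(2**bits - 1, code))

local_voltages = [0.9, 0.3, 0.6, 0.6]        # one per coupled local bit line
v_global = global_bitline_voltage(local_voltages)
print(v_global, adc(v_global))               # 0.6 9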
Abstract:
Systems and devices are provided to enable granular control over a retention or active state of each of a plurality of memory circuits, such as a plurality of memory cell arrays, within a memory. Each respective memory array of the plurality of memory arrays is coupled to a respective ballast driver and a respective active memory signal switch for that memory array. One or more voltage regulators are coupled to a ballast driver gate node and to a bias node of at least one of the respective memory arrays. In operation, the respective active memory signal switch for a memory array causes that memory array to transition between its active state and its retention state.
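A sketch of the per-array state control follows (class and method names are hypothetical; the point is only that each array's active memory signal switch moves that array, and no other, between its active and retention states):

class MemoryArrayPowerDomain:
    """Illustrative per-array power domain with its own active/retention state."""
    def __init__(self, name):
        self.name = name
        self.state = "retention"   # ballast driver supplies the retention bias

    def set_active_signal(self, asserted):
        # The array's active memory signal switch selects between the active
        # supply and the ballast-driver retention bias for this array only.
        self.state = "active" if asserted else "retention"

arrays = [MemoryArrayPowerDomain(f"array{i}") for i in range(4)]
arrays[2].set_active_signal(True)            # wake a single array
print([(a.name, a.state) for a in arrays])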
Abstract:
An in-memory compute (IMC) device includes a compute array having a first plurality of cells. The compute array is arranged as a plurality of rows of cells intersecting a plurality of columns of cells. Each cell of the first plurality of cells is identifiable by its corresponding row and column. The IMC device also includes a plurality of computation engines and a plurality of bias engines. Each computation engine is respectively formed in a different one of a second plurality of cells, wherein the second plurality of cells is formed from cells of the first plurality. Each computation engine is formed at a respective row and column intersection. Each bias engine of the plurality of bias engines is arranged to computationally combine an output from at least one of the plurality of computation engines with a respective bias value.
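A minimal numeric sketch of the compute-then-bias arrangement (the dot-product form of the computation engine and the additive bias are assumptions used only for illustration):

def computation_engine(features, weights):
    # Assumed multiply-accumulate behavior of one computation engine.
    return sum(f * w for f, w in zip(features, weights))

def bias_engine(engine_output, bias):
    # The bias engine computationally combines the engine output with a bias.
    return engine_output + bias

features = [1, 0, 1, 1]
weights  = [2, 3, -1, 4]
print(bias_engine(computation_engine(features, weights), bias=5))  # 10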
Abstract:
A memory array includes sub-arrays with memory cells arranged in a row-column matrix where each row includes a word line and each sub-array column includes a local bit line. A control circuit supports a first operating mode where only one word line in the memory array is actuated during memory access and a second operating mode where one word line per sub-array is simultaneously actuated during an in-memory computation performed as a function of weight data stored in the memory and applied feature data. Computation circuitry coupling each memory cell to the local bit line for each column of the sub-array logically combines a bit of feature data for the in-memory computation with a bit of weight data to generate a logical output on the local bit line, which is charge shared with a global bit line.
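The per-column behavior can be sketched as below (the AND combination of feature and weight bits and the ideal equal-capacitance charge sharing are assumptions):

def local_bitline(feature_bit, weight_bit):
    return feature_bit & weight_bit               # assumed logical combination

def global_bitline(local_values):
    return sum(local_values) / len(local_values)  # ideal charge sharing

weight_bits = [1, 0, 1, 1]      # weight bits on one column, one per sub-array
feature_bit = 1                 # applied feature bit
local_values = [local_bitline(feature_bit, w) for w in weight_bits]
print(global_bitline(local_values))               # 0.75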
Abstract:
The memory array of a circuit includes sub-arrays with memory cells arranged in a row-column matrix where each row includes a word line and each sub-array column includes a local bit line. A control circuit supports two modes of circuit operation: a first mode where only one word line in the memory array is actuated during a memory read and a second mode where one word line per sub-array is simultaneously actuated during the memory read. An input/output circuit for each column includes inputs coupled to the local bit lines of the sub-arrays, a column data output coupled to the bit line inputs, and a sub-array data output coupled to each bit line input. In-memory computation operations are performed in the second mode as a function of feature data and weight data stored in the memory.
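The two modes can be contrasted with the following sketch (the data layout, the AND combination with feature bits, and function names are assumptions):

def read_normal(array, sub_array, row, col):
    # First mode: a single word line in the whole memory array is actuated.
    return array[sub_array][row][col]

def read_compute(array, row_per_subarray, col, feature_bits):
    # Second mode: one word line per sub-array is actuated simultaneously and
    # the per-sub-array outputs are combined with the applied feature bits.
    return sum(array[s][r][col] & f
               for s, (r, f) in enumerate(zip(row_per_subarray, feature_bits)))

array = [   # 2 sub-arrays x 2 rows x 2 columns of stored weight bits
    [[1, 0], [0, 1]],
    [[1, 1], [1, 0]],
]
print(read_normal(array, sub_array=0, row=1, col=0))                             # 0
print(read_compute(array, row_per_subarray=[0, 1], col=0, feature_bits=[1, 1]))  # 2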
Abstract:
An in-memory computation circuit includes a memory array with SRAM cells connected in rows by word lines and in columns by bit lines. A row controller circuit simultaneously actuates multiple word lines for an in-memory compute operation. A column processing circuit includes a clamping circuit that clamps the voltage on the bit line to a level exceeding an SRAM cell bit flip voltage during execution of the in-memory compute operation. The column processing circuit may further include a current mirroring circuit that mirrors the read current developed on each bit line in response to the simultaneous actuation to generate a decision output for the in-memory compute operation. The mirrored read current is integrated by an integration capacitor to generate an output voltage that is converted to a digital signal by an analog-to-digital converter circuit.
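A behavioral sketch of the column processing path is given below (the clamp is modeled only as a floor on the bit line voltage, and all component values are illustrative assumptions):

def clamped_bitline_voltage(v_bitline, v_clamp=0.6):
    return max(v_bitline, v_clamp)               # held above the bit-flip voltage

def integrate_mirrored_current(i_read_amps, t_seconds, c_farads=1e-12):
    return i_read_amps * t_seconds / c_farads    # V = I*t/C on the capacitor

def adc(voltage, vref=1.0, bits=4):
    return max(0, min(2**bits - 1, round(voltage / vref * (2**bits - 1))))

v_out = integrate_mirrored_current(i_read_amps=20e-6, t_seconds=40e-9)
print(clamped_bitline_voltage(0.45), v_out, adc(v_out))   # 0.6 0.8 12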
Abstract:
An in-memory computation circuit includes a memory array including sub-arrays with SRAM cells connected in rows by word lines and in columns by bit lines. A row controller circuit selectively actuates word lines across the sub-arrays for an in-memory compute operation. A computation tile circuit for each sub-array includes a column compute circuit for each bit line. Each column compute circuit includes a switched timing circuit that is actuated in response to weight data on the bit line for a duration of time set by an in-memory compute operation enable signal. A current digital-to-analog converter powered by the switched timing circuit operates to generate a drain current having a magnitude controlled by bits of feature data for the in-memory compute operation. The drain current is integrated to generate an output voltage.
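One column compute circuit can be sketched behaviorally as follows (the LSB current, enable pulse width, capacitor value, and binary feature encoding are illustrative assumptions):

I_LSB_AMPS   = 1e-6     # assumed current-DAC least-significant-bit current
ENABLE_TIME  = 50e-9    # assumed compute-enable pulse width
C_INT_FARADS = 1e-12    # assumed integration capacitor

def column_output_voltage(weight_bit, feature_bits):
    t_on = ENABLE_TIME if weight_bit else 0.0                       # switched timing circuit
    i_drain = I_LSB_AMPS * int("".join(map(str, feature_bits)), 2)  # current DAC
    return i_drain * t_on / C_INT_FARADS                            # integration to voltage

print(column_output_voltage(weight_bit=1, feature_bits=[1, 0, 1]))  # 0.25
print(column_output_voltage(weight_bit=0, feature_bits=[1, 0, 1]))  # 0.0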