Abstract:
A circuit includes a memory array with SRAM cells connected in rows by word lines and in columns by bit lines. A row controller circuit simultaneously actuates, through a word line driver circuit for each row, word lines in parallel for an in-memory compute operation. A column processing circuit processes analog voltages developed on the bit lines in response to the simultaneous actuation to generate a decision output for the in-memory compute operation. A bit line precharge circuit generates a precharge voltage for application to each pair of bit lines. The precharge voltage has a first voltage level (not greater than a positive supply voltage for the SRAM cells) when the memory array is operating in a data read/write mode. The precharge voltage has a second voltage level (greater than the first voltage level) in advance of the simultaneous actuation of the word lines for the in-memory compute operation.
Abstract:
A memory array arranged in multiple columns and rows. Computation circuits that each calculate a computation value from cell values in a corresponding column. A column multiplexer cycles through multiple data lines that each corresponds to a computation circuit. Cluster cycle management circuitry determines a number of multiplexer cycles based on a number of columns storing data of a compute cluster. A sensing circuit obtains the computation values from the computation circuits via the column multiplexer as the column multiplexer cycles through the data lines. The sensing circuit combines the obtained computation values over the determined number of multiplexer cycles. A first clock may initiate the multiplexer to cycle through its data lines for the determined number of multiplexer cycles, and a second clock may initiate each individual cycle. The multiplexer or additional circuitry may be utilized to modify the order in which data is written to the columns.
Abstract:
A circuit includes a memory array with SRAM cells connected in rows by word lines and in columns by bit lines. A row controller circuit simultaneously actuates, through a word line driver circuit for each row, word lines in parallel for an in-memory compute operation. A column processing circuit processes analog voltages developed on the bit lines in response to the simultaneous actuation to generate a decision output for the in-memory compute operation. A bit line clamping circuit includes a sensing circuit that compares the analog voltages on a given pair of bit lines to a threshold voltage. A voltage clamp circuit is actuated in response to the comparison to preclude the analog voltages on the given pair of bit lines from decreasing below a clamping voltage level.
Abstract:
The memory array of a memory includes sub-arrays with memory cells arranged in a row-column matrix where each row includes a word line and each sub-array column includes a local bit line. A row decoder circuit supports two modes of memory circuit operation: a first mode where only one word line in the memory array is actuated during a memory read and a second mode where one word line per sub-array are simultaneously actuated during the memory read. An input/output circuit for each column includes inputs to the local bit lines of the sub-arrays, a column data output coupled to the bit line inputs, and a sub-array data output coupled to each bit line input. Both BIST and ATPG testing of the input/output circuit are supported. For BIST testing, multiple data paths between the bit line inputs and the column data output are selectively controlled to provide complete circuit testing.
Abstract:
A memory array arranged as a plurality of memory cells. The memory cells are configured to operate at a determined voltage. A memory management circuitry coupled to the plurality of memory cells tags a first set of the plurality of memory cells as low-voltage cells and tags a second set of the plurality of memory cells as high-voltage cells. A power source provides a low voltage to the first set of memory cells and provides a high voltage to the second set of memory cells based on the tags.
Abstract:
A system includes a random access memory organized into individually addressable words. Streaming access control circuitry is coupled to word lines of the random access memory. The streaming access control circuitry responds to a request to access a plurality of individually addressable words of a determined region of the random access memory by generating control signals to drive the word lines to streamingly access the plurality of individually addressable words of the determined region. The request indicates an offset associated with the determined region and a pattern associated with the streaming access.
Abstract:
Systems and devices are provided to increase computational and/or power efficiency for one or more neural networks via a computationally driven closed-loop dynamic clock control. A clock frequency control word is generated based on information indicative of a current frame execution rate of a processing task of the neural network and a reference clock signal. A clock generator generates the clock signal of neural network based on the clock frequency control word. A reference frequency may be used to generate the clock frequency control word, and the reference frequency may be based on information indicative of a sparsity of data of a training frame.
Abstract:
A memory array arranged in multiple columns and rows. Computation circuits that each calculate a computation value from cell values in a corresponding column. A column multiplexer cycles through multiple data lines that each corresponds to a computation circuit. Cluster cycle management circuitry determines a number of multiplexer cycles based on a number of columns storing data of a compute cluster. A sensing circuit obtains the computation values from the computation circuits via the column multiplexer as the column multiplexer cycles through the data lines. The sensing circuit combines the obtained computation values over the determined number of multiplexer cycles. A first clock may initiate the multiplexer to cycle through its data lines for the determined number of multiplexer cycles, and a second clock may initiate each individual cycle. The multiplexer or additional circuitry may be utilized to modify the order in which data is written to the columns.
Abstract:
A memory management circuit stores information indicative of reliability-types of regions of a memory array. The memory management circuitry responds to a request to allocate memory in the memory array to a process by determining a request type associated with the request to allocate memory. Memory of the memory array is allocated to the process based on the request type associated with the request to allocate memory and the stored information indicative of reliability-types of regions of the memory array. The memory array may be a shared memory array. The memory array may be organized into rows and columns, and the regions of the memory array may be the rows of the memory array.
Abstract:
Adaptive scaling digital techniques attempt to place the system close to the timing failure so as to maximize energy efficiency. Rapid recovery from potential failures is usually by slowing the system clock and/or providing razor solutions (instruction replay.) These techniques compromise the throughput. This application presents a technique to provide local in-situ fault resilience based on dynamic slack borrowing. This technique is non-intrusive (needs no architecture modification) and has minimal impact on throughput.