Abstract:
An ultra-deep compute Static Random Access Memory (SRAM) with high compute throughput and multi-directional data transfer capability is provided. Compute units are placed in both horizontal and vertical directions to achieve a symmetric layout while enabling communication between the compute units. An SRAM array supports simultaneous read and write to the left and right section of the same SRAM subarray by duplicating pre-decoding logic inside the SRAM array. This allows applications with non-overlapping read and write address spaces to have twice the bandwidth as compared to a baseline SRAM array.
Abstract:
Embodiments include apparatuses, methods, and systems for a flip-flop circuit with low-leakage transistors. The flip-flop circuit may be coupled to a logic circuit of an integrated circuit to store data for the logic circuit when the logic circuit is in a sleep state. The flip-flop circuit may pass a data signal for the logic circuit along a signal path. A capacitor may be coupled between the signal path and ground to store a value of the data signal when the logic circuit is in the sleep state. A low-leakage transistor, such as an IGZO transistor, may be coupled between the capacitor and the signal path and may selectively turn on when the logic circuit transitions from the active state to the sleep state to store the value of the data signal in the capacitor. Other embodiments may be described and claimed.
Abstract:
An apparatus is provided which comprises: a first power gate transistor coupled to an ungated power supply node and a gated power supply node, the first power gate transistor having a gate terminal controllable by a first logic; and a second power gate coupled to the ungated power supply node and the gated power supply node, the second power gate transistor having a gate terminal controllable by a second logic, wherein the first power gate transistor is larger than the second power gate transistor, and wherein the second logic is operable to: weakly turn on the second power gate, fully turn on the second power gate, turn off the second power gate, and connecting the second power gate as diode.
Abstract:
Apparatuses, methods and storage media associated with single-ended sensing array design are disclosed herein. In embodiments, a memory device may include bitcell arrays, clipper circuitry, read merge circuitry, and a set dominant latch (SDL). The clipper circuitry may be coupled to a read port node of a first bitcell array of the bitcell arrays and a local bitline (LBL) node, the clipper circuitry to provide a voltage drop between the read port node and the LBL node. The read merge circuitry coupled to the clipper circuitry at the LBL node, the read merge circuitry to drive a value of a global bitline (GBL) node based on a value of the LBL node. The SDL coupled to the GBL node to sense the value of the GBL node. Other embodiments may be described and/or claimed.
Abstract:
A processor or integrated circuit includes a memory to store weight values for a plurality neuromorphic states and a circuitry coupled to the memory. The circuitry is to detect an incoming data signal for a pre-synaptic neuromorphic state and initiate a time window for the pre-synaptic neuromorphic state in response to detecting the incoming data signal. The circuitry is further to, responsive to detecting an end of the time window: retrieve, from the memory, a weight value for a post-synaptic neuromorphic state for which an outgoing data signal is generated during the time window, the post-synaptic neuromorphic state being a fan-out connection of the pre-synaptic neuromorphic state; perform a causal update to the weight value, according to a learning function, to generate an updated weight value; and store the updated weight value back to the memory.
Abstract:
Systems, apparatuses and methods may provide a hybrid compression scheme to store synaptic weights in neuromorphic cores. The hybrid compression scheme utilizes a run-length encoding (RLE) compression approach, a dictionary-based encode compression scheme, and a compressionless encoding scheme to store the weights for valid synaptic connections in a synaptic weight memory.
Abstract:
Voltage regulation of processor sub-domains supplied by a same voltage domain power supply rail. Voltage to certain logic units within the voltage domain may be reduced relative to other logic units of the voltage domain, reducing idle time at high power. In an embodiment, a first voltage-regulated sub-domain includes at least one execution unit (EU) while a second voltage-regulated sub-domain includes at least one texture sampler to provide flexibility in setting the graphics core power-performance point beyond modulating active EU count through power domain (gating) control. In embodiments, a sub-domain voltage is regulated by an on-chip DLDO for fast voltage switching. Clock frequency and sub-domain voltage may be switched faster than the voltage of the voltage domain supply rail, permitting a more finely grained power management that can be responsive to EU workload demand.
Abstract:
Described is an apparatus which comprises: a first power supply node to provide a first power supply; a second power supply node to provide a second power supply; a driver to operate on the first power supply, the driver to generate an output; and a receiver to operate on the second power supply, the receiver to receive the output from the driver and to generate a level-shifted output such that the receiver is operable to steer current from the second power supply to the first power supply.
Abstract:
An apparatus includes a first write bit line (WBL), a first P-channel metal oxide semiconductor (PMOS) transistor including a source coupled to the WBL, a first inverter including an input coupled to a drain of the first PMOS transistor, and a second PMOS transistor including a source coupled to an output of the first inverter. The first PMOS transistor and the second PMOS transistor are disposed in at least one PMOS layer configured between a first metal layer and a second metal layer. The register file circuit further includes a first via connecting a gate of the first PMOS transistor and a gate of the second PMOS transistor in the at least one PMOS layer to the first metal layer.