Abstract:
A binary CIM circuit enables all memory cells in a memory array to be effectively accessible simultaneously for computation using fixed pulse widths on the wordlines and equal capacitance on the bitlines. The fixed pulse widths and equal capacitance ensure that a minimum voltage drop in the bitline represents one least significant bit (LSB) so that the bitline voltage swing remains safely within the maximum allowable range. The binary CIM circuit maximizes the effective memory bandwidth of a memory array for a given maximum voltage range of bitline voltage.
Abstract:
A neuromorphic computing system is provided which comprises: a synapse core; and a pre-synaptic neuron, a first post-synaptic neuron, and a second post-synaptic neuron coupled to the synaptic core, wherein the synapse core is to: receive a request from the pre-synaptic neuron, generate, in response to the request, a first address of the first post-synaptic neuron and a second address of the second post-synaptic neuron, wherein the first address and the second address are not stored in the synapse core prior to receiving the request.
Abstract:
In one embodiment, a processor is to store a membrane potential of a neural unit of a neural network; and calculate, at a particular time-step of the neural network, a change to the membrane potential of the neural unit occurring over multiple time-steps that have elapsed since the last time-step at which the membrane potential was updated, wherein each of the multiple time-steps that have elapsed since the last time-step is associated with at least one input to the neural unit that affects the membrane potential of the neural unit.
Abstract:
Apparatus and method for configuring large numbers of fan-in and fan-out connections in a neuromorphic computer. For example, one embodiment of an apparatus comprises: a plurality of neurons, each neuron uniquely identifiable with a neuron identifier (ID); at least one memory to store neuron addresses with wildcard values to establish fan-in and/or fan-out connections between the neurons; and a router to translate at least one neuron address containing wildcard values into two or more neuron IDs to establish the fan-in and/or fan-out connections between the neurons.
Abstract:
A first pointer dereferencer receives a location of a portion of a first node of a data structure. The first node is to be stored in a first storage element. A first pointer is obtained from the first node of the data structure. A location of a portion of a second node of the data structure is determined based on the first pointer. The second node is to be stored in a second storage element. The location of the portion of the second node of the data structure is sent to a second pointer dereferencer that is to access the portion of the second node from the second storage element.
Abstract:
A packet-switched reservation request to be associated with a first data stream is received. A communication mode is selected. The communication mode is to be either a circuit-switched mode or a packet-switched mode. At least a portion of the first data stream is communicated in accordance with the communication mode.
Abstract:
A multicast message that is to originate from a source is received. The multicast message comprises an identifier. A plurality of directions in which the multicast message is to fork at the router are stored. A plurality of messages from the directions in which the multicast message is to fork are received. The received messages are to comprise the identifier. The plurality of messages are aggregated into an aggregate message and sent towards the source.
Abstract:
A first pointer dereferencer receives a location of a portion of a first node of a data structure. The first node is to be stored in a first storage element. A first pointer is obtained from the first node of the data structure. A location of a portion of a second node of the data structure is determined based on the first pointer. The second node is to be stored in a second storage element. The location of the portion of the second node of the data structure is sent to a second pointer dereferencer that is to access the portion of the second node from the second storage element.
Abstract:
A compute near memory (CNM) convolution accelerator enables a convolutional neural network (CNN) to use dedicated acceleration to achieve efficient in-place convolution operations with less impact on memory and energy consumption. A 2D convolution operation is reformulated as 1D row-wise convolution. The 1D row-wise convolution enables the CNM convolution accelerator to process input activations row-by-row, while using the weights one-by-one. Lightweight access circuits provide the ability to stream both weights and input rows as vectors to MAC units, which in turn enables modules of the CNM convolution accelerator to implement convolution for both [1×1] and chosen [n×n] sized filters.
Abstract:
Examples herein relate to a memory device comprising an eDRAM memory cell, the eDRAM memory cell can include a write circuit formed at least partially over a storage cell and a read circuit formed at least partially under the storage cell; a compute near memory device bonded to the memory device; a processor; and an interface from the memory device to the processor. In some examples, circuitry is included to provide an output of the memory device to emulate output read rate of an SRAM memory device comprises one or more of: a controller, a multiplexer, or a register. Bonding of a surface of the memory device can be made to a compute near memory device or other circuitry. In some examples, a layer with read circuitry can be bonded to a layer with storage cells. Any layers can be bonded together using techniques described herein.