摘要:
Techniques are described for efficiently reducing the amount of total computation in convolutional neural networks (CNNs) without affecting the output result or classification accuracy. Computation redundancy in CNNs is reduced by exploiting the computing nature of the convolution and subsequent pooling (e.g., sub-sampling) operations. In some implementations, the input features may be divided into a group of precision values and the operation(s) may be cascaded. A maximum may be identified (e.g., by 90% probability) using a small number of bits in the input features, and the full-precision convolution may then be performed on the maximum input. Accordingly, the total number of bits used to perform the convolution is reduced without affecting the output features or the final classification accuracy.
摘要:
Hardware noise-aware training for improving accuracy of in-memory computing (IMC)-based deep neural network (DNN) hardware is provided. DNNs have been very successful in large-scale recognition tasks, but they exhibit large computation and memory requirements. To address the memory bottleneck of digital DNN hardware accelerators, IMC designs have been presented to perform analog DNN computations inside the memory. Recent IMC designs have demonstrated high energy-efficiency, but this is achieved by trading off the noise margin, which can degrade the DNN inference accuracy. The present disclosure proposes hardware noise-aware DNN training to largely improve the DNN inference accuracy of IMC hardware. During DNN training, embodiments perform noise injection at the partial sum level, which matches with the crossbar structure of IMC hardware, and the injected noise data is directly based on measurements of actual IMC prototype chips.
摘要:
A resistive random-access memory (RRAM) system includes an RRAM cell. The RRAM cell includes a first select line and a second select line, a word line, a bit line, a first resistive memory device, a first switching device, a second resistive memory device, a second switching device, and a comparator. The first resistive memory device is coupled between a first access node and the bit line. The first switching device is coupled between the first select line and the first access node. The second resistive memory device is coupled between a second access node and the bit line. The second switching device is coupled between the second select line and the second access node. The comparator includes a first input coupled to the bit line, a second input, and an output.
摘要:
An apparatus and method are described for a neuromorphic processor design in which neuron timing information is duplicated on a neuromorphic core. For example, one embodiment of an apparatus comprises: a first neurosynaptic core comprising a plurality of neurons and a synapse array comprising a plurality of synapses to communicatively couple the plurality of neurons, each synapse connecting two neurons having a weight associated therewith, wherein a first neuron is to generate an output spike based on the weights of synapses over which inputs are received from the other neurons; a second neurosynaptic core also comprising a plurality of neurons and having at least one counter to maintain a count value indicative of spike timing for a second neuron, wherein a spike output of the second neuron in the second neurosynaptic core is communicatively coupled over a first synapse to the first neuron in the first neurosynaptic core; and a duplicate counter maintained within the first neurosynaptic core and synchronized with the counter from the second neurosynaptic core, the first neuron to use a first value from the duplicate counter to adjust the weight of the first synapse coupling the second neuron to the first neuron.
摘要:
An apparatus and method are described for a neuromorphic processor design in which neuron timing information is duplicated on a neuromorphic core. For example, one embodiment of an apparatus comprises: a first neurosynaptic core comprising a plurality of neurons and a synapse array comprising a plurality of synapses to communicatively couple the plurality of neurons, each synapse connecting two neurons having a weight associated therewith, wherein a first neuron is to generate an output spike based on the weights of synapses over which inputs are received from the other neurons; a second neurosynaptic core also comprising a plurality of neurons and having at least one counter to maintain a count value indicative of spike timing for a second neuron, wherein a spike output of the second neuron in the second neurosynaptic core is communicatively coupled over a first synapse to the first neuron in the first neurosynaptic core; and a duplicate counter maintained within the first neurosynaptic core and synchronized with the counter from the second neurosynaptic core, the first neuron to use a first value from the duplicate counter to adjust the weight of the first synapse coupling the second neuron to the first neuron.
摘要:
Neuromorphic computational circuitry is disclosed that includes a cross point resistive network and line control circuitry. The cross point resistive network includes variable resistive units. One set of the variable resistive units is configured to generate a correction line current on a conductive line while other sets of the variable resistive units generate resultant line currents on other conductive lines. The line control circuitry is configured to receive the line currents from the conductive lines and generate digital vector values. Each of the digital vector values is provided in accordance with a difference between the current level of a corresponding resultant line current and a current level of the correction line current. In this manner, the digital vector values are corrected by the current level of the correction line current in order to reduce errors resulting from finite on to off conductance state ratios.
摘要:
A resistive random-access memory (RRAM) system includes an RRAM cell. The RRAM cell includes a first select line and a second select line, a word line, a bit line, a first resistive memory device, a first switching device, a second resistive memory device, a second switching device, and a comparator. The first resistive memory device is coupled between a first access node and the bit line. The first switching device is coupled between the first select line and the first access node. The second resistive memory device is coupled between a second access node and the bit line. The second switching device is coupled between the second select line and the second access node. The comparator includes a first input coupled to the bit line, a second input, and an output.
摘要:
An information processing system, which includes a control system and an artificial neural network, is disclosed. The artificial neural network includes a group of neurons and a group of synapses, which includes a first portion and a second portion. The control system selects one of a group of operating modes. The group of neurons processes information. The group of synapses provide connectivity to each of the group of neurons. During a first operating mode of the group of operating modes, the first portion of the group of synapses is enabled and the second portion of the group of synapses is enabled. During a second operating mode of the group of operating modes, the first portion of the group of synapses is enabled and the second portion of the group of synapses is disabled.
摘要:
This disclosure relates generally to resistive memory systems. The resistive memory systems may be utilized to implement neuro-inspired learning algorithms with full parallelism. In one embodiment, a resistive memory system includes a cross point resistive network and switchable paths. The cross point resistive network includes variable resistive elements and conductive lines. The conductive lines are coupled to the variable resistive elements such that the conductive lines and the variable resistive elements form the cross point resistive network. The switchable paths are connected to the conductive lines so that the switchable paths are operable to selectively interconnect groups of the conductive lines such that subsets of the variable resistive elements each provide a combined variable conductance. With multiple resistive elements in the subsets, process variations in the conductances of the resistive elements average out. As such, learning algorithms may be implemented with greater precision using the cross point resistive network.
摘要:
A method includes: delaying an excursion of at least one signal a first number of clock phases when the excursion departs from a value in a first direction; and delaying the excursion of the at least one signal a second number of the clock phases when the excursion departs toward the value in a second direction. The first number of clock phases is different from the second number of clock phases. The at least one signal effects a plurality of succeeding excursions in substantial synchrony with a clocked signal presenting succeeding clock cycles having a plurality of the clock phases in each respective clock cycle.