Abstract:
A method for controlling a memory from which data is transferred to a neural network processor, and an apparatus therefor, are provided, the method including: generating prefetch information of data by using a blob descriptor and a reference prediction table after history information is input; reading the data in the memory based on the prefetch information and temporarily storing the read data in a prefetch buffer; and, after the data is transferred from the prefetch buffer to the neural network processor, accessing next data in the memory based on the prefetch information and temporarily storing the next data in the prefetch buffer.
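As a rough illustration of the flow described above, the sketch below models stride-based prefetching driven by a reference prediction table; all names (ReferencePredictionTable, PrefetchingMemoryController) and the stride-prediction rule are hypothetical, not taken from the patent.

    class ReferencePredictionTable:
        """Maps a blob id to (last_address, stride) learned from access history."""
        def __init__(self):
            self.entries = {}

        def update(self, blob_id, address):
            last = self.entries.get(blob_id)
            stride = address - last[0] if last else 0
            self.entries[blob_id] = (address, stride)

        def predict_next(self, blob_id):
            entry = self.entries.get(blob_id)
            if entry is None:
                return None
            address, stride = entry
            return address + stride

    class PrefetchingMemoryController:
        def __init__(self, memory):
            self.memory = memory          # models external memory as {address: data}
            self.table = ReferencePredictionTable()
            self.prefetch_buffer = {}     # temporary staging area for read data

        def access(self, blob_id, address):
            # Serve from the prefetch buffer when possible, else read the memory.
            data = self.prefetch_buffer.pop(address, None)
            if data is None:
                data = self.memory[address]
            self.table.update(blob_id, address)
            # Once the current data is handed over, stage the predicted next
            # address in the prefetch buffer.
            nxt = self.table.predict_next(blob_id)
            if nxt is not None and nxt != address and nxt in self.memory:
                self.prefetch_buffer[nxt] = self.memory[nxt]
            return data

    memory = {addr: f"tile-{addr}" for addr in range(0, 64, 4)}
    ctrl = PrefetchingMemoryController(memory)
    for addr in (0, 4, 8, 12):            # sequential strided accesses
        ctrl.access("blob0", addr)
    print(sorted(ctrl.prefetch_buffer))   # [16]: the next strided address is staged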
Abstract:
An embodiment of the present invention provides a quantization method for weights of a plurality of batch normalization layers, including: receiving a plurality of previously learned first weights of the plurality of batch normalization layers; obtaining first distribution information of the plurality of first weights; performing a first quantization on the plurality of first weights using the first distribution information to obtain a plurality of second weights; obtaining second distribution information of the plurality of second weights; and performing a second quantization on the plurality of second weights using the second distribution information to obtain a plurality of final weights, thereby reducing an error that may occur when quantizing the weights of the batch normalization layers.
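A minimal sketch of such a two-stage quantization is given below; the particular distribution statistics (mean and standard deviation), the clipping rule, and the 8-bit width are assumptions for illustration, not values from the patent.

    import numpy as np

    def quantize(weights, n_bits, lo, hi):
        # Uniformly quantize weights to 2**n_bits levels within [lo, hi].
        levels = 2 ** n_bits - 1
        scale = (hi - lo) / levels
        q = np.round((np.clip(weights, lo, hi) - lo) / scale)
        return q * scale + lo

    def two_stage_quantize(first_weights, n_bits=8, k=3.0):
        # First distribution information: mean and standard deviation (assumed).
        mu, sigma = first_weights.mean(), first_weights.std()
        second = quantize(first_weights, n_bits, mu - k * sigma, mu + k * sigma)
        # Second distribution information, recomputed on the quantized weights,
        # tightens the range used for the second quantization.
        lo, hi = second.min(), second.max()
        return quantize(second, n_bits, lo, hi)

    rng = np.random.default_rng(0)
    w = rng.normal(1.0, 0.2, size=1024)      # stand-in batch-norm scale weights
    final = two_stage_quantize(w)
    print(np.abs(final - w).mean())          # mean quantization error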
Abstract:
A method and apparatus for multi-level stepwise quantization of a neural network are provided. The apparatus sets a reference level by selecting a value from among the values of the parameters of the neural network, proceeding from high values equal to or greater than a predetermined value toward lower values, and performs learning based on the reference level. The setting of a reference level and the performing of learning are performed iteratively until the result of the reference-level learning satisfies a predetermined criterion and no variable parameter that is updated during learning remains among the parameters.
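The toy sketch below conveys the stepwise idea only: reference levels are chosen from the largest remaining parameter magnitudes and fixed, after which the still-variable parameters are re-learned. The "learning" step (a simple decay) and the snapping tolerance are stand-ins, not the patent's procedure.

    import numpy as np

    rng = np.random.default_rng(1)
    params = rng.normal(0.0, 1.0, size=16)
    fixed = np.zeros_like(params, dtype=bool)
    tol = 0.25

    while not fixed.all():
        # Set a reference level: the largest-magnitude still-variable value.
        ref = params[~fixed][np.argmax(np.abs(params[~fixed]))]
        # Snap variable parameters near the reference level onto it.
        near = (~fixed) & (np.abs(params - ref) <= tol)
        params[near] = ref
        fixed |= near
        # Re-learn the remaining variable parameters (toy update).
        params[~fixed] *= 0.9

    print(np.unique(params))   # the parameters collapse to a few discrete levels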
Abstract:
Provided is a design method for a compressed neural network system. The method includes generating a compressed neural network based on an original neural network model, analyzing sparse weights among the kernel parameters of the compressed neural network, calculating a maximum possible calculation throughput on a target hardware platform according to a sparse property of the sparse weights, calculating a calculation throughput with respect to external memory access on the target hardware platform according to the sparse property, and determining a design parameter of the target hardware platform by referring to the maximum possible calculation throughput and the calculation throughput with respect to memory access.
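A roofline-style sketch of weighing the two throughput bounds is shown below; the peak MAC rate, DRAM bandwidth, and bytes-per-weight figures are hypothetical, and "density" stands for the fraction of nonzero kernel weights found by the sparsity analysis.

    def attainable_throughput(num_pes, density,
                              macs_per_pe_per_s=1e9,
                              dram_bandwidth_bytes=12.8e9,
                              bytes_per_nonzero_weight=2):
        # Compute bound: zero-weight MACs are skipped, so effective
        # (dense-equivalent) MAC/s scales with 1/density.
        compute_bound = num_pes * macs_per_pe_per_s / density
        # Memory bound: only nonzero weights are fetched, so bytes per
        # dense-equivalent MAC shrink with density as well.
        memory_bound = dram_bandwidth_bytes / bytes_per_nonzero_weight / density
        return min(compute_bound, memory_bound)

    # Sweeping the PE count shows where extra compute stops paying off,
    # i.e. where the design becomes bound by external memory access.
    density = 0.1
    for num_pes in (1, 2, 4, 8, 16):
        print(num_pes, f"{attainable_throughput(num_pes, density):.3e}")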
Abstract:
When precoding information corresponding to data items of respective layers to be transmitted is received from an upper layer, an encoding apparatus of a multiple-input multiple-output (MIMO) communication system selects a precoding matrix from among a plurality of precoding matrices stored in a storage using the precoding information, and precodes the data items of the respective layers by simple operations consisting of at least one combination of addition, subtraction, selection, and inversion operations, in accordance with the kind of the selected precoding matrix and a precoding operation pattern.
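Why such simple operations can suffice: when every matrix entry lies in a unit set such as {0, 1, -1, j, -j} (as in parts of known MIMO codebooks), each entry-times-symbol product reduces to selection, sign inversion, or a real/imaginary swap. The sketch below demonstrates this; the 2x2 matrix is illustrative, not a specific codebook entry.

    def times_unit(s, unit):
        # Multiply complex symbol s by a unit in {0, 1, -1, j, -j} using only
        # selection, sign inversion, and real/imaginary swapping.
        if unit == 0:   return 0j
        if unit == 1:   return s
        if unit == -1:  return complex(-s.real, -s.imag)
        if unit == 1j:  return complex(-s.imag, s.real)
        if unit == -1j: return complex(s.imag, -s.real)
        raise ValueError("entry is not add/subtract/select/invert friendly")

    def precode(matrix, layers):
        # y = W @ x evaluated with additions and subtractions only.
        out = []
        for row in matrix:
            acc = 0j
            for entry, sym in zip(row, layers):
                acc += times_unit(sym, entry)   # accumulate partial terms
            out.append(acc)
        return out

    W = [[1, 1],
         [1j, -1j]]              # illustrative 2x2 unit-entry precoder
    x = [1 + 1j, 1 - 1j]         # one symbol per layer
    print(precode(W, x))         # matches a general matrix multiply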
Abstract:
Disclosed is a neural network computing device. The neural network computing device includes a neural network accelerator including an analog MAC, a controller that controls the neural network accelerator in one of a first mode and a second mode, and a calibrator that calibrates a gain and a DC offset of the analog MAC. The calibrator includes a memory storing weight data, calibration weight data, and calibration input data; a gain-and-offset calculator reading the calibration weight data and the calibration input data from the memory, inputting the calibration weight data and the calibration input data to the analog MAC, receiving calibration output data from the analog MAC, and calculating the gain and the DC offset of the analog MAC; and an on-device quantizer reading the weight data, receiving the gain and the DC offset, and generating quantized weight data based on the gain and the DC offset.
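A behavioral sketch of the calibration path is given below: the analog MAC is modeled as y = gain * (w . x) + offset, so two known calibration vectors suffice to recover both terms. Folding the measured gain into the quantized weights is one plausible realization, assumed here for illustration.

    import numpy as np

    rng = np.random.default_rng(2)
    TRUE_GAIN, TRUE_OFFSET = 0.93, 0.05     # hidden analog non-idealities

    def analog_mac(weights, inputs):
        return TRUE_GAIN * float(np.dot(weights, inputs)) + TRUE_OFFSET

    # Gain-and-offset calculator: drive the MAC with stored calibration data.
    cal_w = np.ones(4)
    offset = analog_mac(cal_w, np.zeros(4))              # ideal dot product = 0
    gain = (analog_mac(cal_w, np.ones(4)) - offset) / 4  # ideal dot product = 4

    # On-device quantizer: pre-divide weights by the measured gain so the
    # analog output matches the ideal result after subtracting the offset.
    weights = rng.normal(size=4)
    q_weights = np.round(weights / gain * 128) / 128     # 8-bit-style grid

    x = rng.normal(size=4)
    corrected = analog_mac(q_weights, x) - offset
    print(corrected, float(np.dot(weights, x)))          # nearly equal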
Abstract:
The neuromorphic arithmetic device comprises an input monitoring circuit that outputs a monitoring result by monitoring whether the first bits at at least one first digit of a plurality of feature data and a plurality of weight data are all zero; a partial-sum data generator that, while performing the arithmetic operation of generating a plurality of partial-sum data based on the plurality of feature data and the plurality of weight data, skips the arithmetic operation that generates first partial-sum data corresponding to the first bits, in response to the monitoring result; and a shift adder that generates the first partial-sum data with a zero value and generates result data based on the first partial-sum data generated with the zero value and second partial-sum data other than the first partial-sum data among the plurality of partial-sum data.
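A bit-serial sketch of this zero-skipping idea follows: each bit position of the weights contributes one partial sum, and when that bit is zero in every weight, the computation is skipped and a zero partial sum is supplied instead. The 4-bit width and the data are illustrative assumptions.

    BITS = 4

    def mac_with_zero_skip(features, weights):
        partial_sums = []
        skipped = []
        for bit in range(BITS):
            # Input monitoring: are the bits at this digit all zero?
            if all((w >> bit) & 1 == 0 for w in weights):
                partial_sums.append(0)       # shift adder substitutes zero
                skipped.append(bit)
                continue
            # Partial-sum generation for this digit.
            ps = sum(f for f, w in zip(features, weights) if (w >> bit) & 1)
            partial_sums.append(ps)
        # Shift adder: weight each partial sum by its digit position.
        result = sum(ps << bit for bit, ps in enumerate(partial_sums))
        return result, skipped

    features = [3, 1, 2, 5]
    weights = [4, 6, 4, 2]       # bit 0 (and bit 3) are zero in every weight
    result, skipped = mac_with_zero_skip(features, weights)
    print(result, skipped)       # 36 == sum(f*w), with digits 0 and 3 skipped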
Abstract:
Provided is an artificial neural network device including pre-synaptic neurons configured to generate a plurality of input spike signals, and a post-synaptic neuron configured to receive the plurality of input spike signals and to generate an output spike signal during a plurality of time periods, wherein the post-synaptic neuron applies different weights in the respective time periods according to their proximity to a reference time period in which the input spike signals that lead to generation of the output spike signal, from among the plurality of input spike signals, are received.
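A schematic model of the time-period weighting is sketched below: input spikes received in periods closer to the reference period (the one holding the spikes that triggered the output) are weighted more heavily. The geometric decay factor is purely an assumption.

    def weighted_contribution(spike_periods, reference_period,
                              base_weight=1.0, decay=0.5):
        # Apply a different weight per time period based on proximity to the
        # reference time period.
        total = 0.0
        for period in spike_periods:
            distance = abs(period - reference_period)
            total += base_weight * (decay ** distance)
        return total

    # Input spikes from pre-synaptic neurons, tagged by time period 0..4;
    # period 3 is taken as the reference period here.
    spikes = [0, 1, 3, 3, 4]
    print(weighted_contribution(spikes, reference_period=3))   # 2.875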
Abstract:
Provided is a convolutional neural network system. The system includes an input buffer configured to store an input feature, a parameter buffer configured to store a learning parameter, a calculation unit configured to perform a convolution layer calculation or a fully connected layer calculation by using the input feature provided from the input buffer and the learning parameter provided from the parameter buffer, and an output buffer configured to store an output feature output from the calculation unit and to output the stored output feature to the outside. The parameter buffer provides a real-valued learning parameter to the calculation unit at the time of the convolution layer calculation and provides a binary learning parameter to the calculation unit at the time of the fully connected layer calculation.
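The sketch below illustrates this mixed-precision parameter buffer: real-valued kernels are handed out for convolution layers and binarized (+1/-1) weights for fully connected layers, where binary weights reduce multiplications to additions and subtractions. Class name, shapes, and values are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(3)

    class ParameterBuffer:
        def __init__(self, conv_kernel, fc_weights):
            self.conv_kernel = conv_kernel                 # real-valued
            self.fc_weights = np.sign(fc_weights)          # binarized to +/-1
            self.fc_weights[self.fc_weights == 0] = 1

        def get(self, layer_kind):
            return self.conv_kernel if layer_kind == "conv" else self.fc_weights

    buf = ParameterBuffer(rng.normal(size=(3, 3)), rng.normal(size=(8, 4)))

    feature = rng.normal(size=(3, 3))
    conv_out = float(np.sum(feature * buf.get("conv")))    # one convolution tap

    flat = rng.normal(size=8)
    fc_out = buf.get("fc").T @ flat     # binary weights: only adds/subtracts
    print(conv_out, fc_out.shape)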
Abstract:
Provided is a convolutional neural network system including an image database configured to store first image data, a machine learning device configured to receive the first image data from the image database and generate synapse data of a convolutional neural network including a plurality of layers for image identification based on the first image data, a synapse data compressor configured to compress the synapse data based on sparsity of the synapse data, and an image identification device configured to store the compressed synapse data and perform image identification on second image data without decompression of the compressed synapse data.
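A minimal sketch of sparsity-based compression that inference can consume directly follows: synapse data is stored as per-row (index, value) pairs (a CSR-like form, assumed here), and the matrix-vector product iterates over nonzeros without ever rebuilding the dense matrix. The format details are assumptions for illustration.

    import numpy as np

    def compress(dense):
        # Per-row lists of (column index, value) for nonzero synapses.
        return [[(j, v) for j, v in enumerate(row) if v != 0.0] for row in dense]

    def forward(compressed, x):
        # Matrix-vector product evaluated on the compressed form as-is,
        # i.e. without decompressing back to a dense matrix.
        return np.array([sum(v * x[j] for j, v in row) for row in compressed])

    rng = np.random.default_rng(4)
    dense = rng.normal(size=(4, 8))
    dense[np.abs(dense) < 1.0] = 0.0           # sparsify (pruned synapses)

    csr = compress(dense)
    x = rng.normal(size=8)
    print(np.allclose(forward(csr, x), dense @ x))   # True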