SYSTEMS AND METHODS FOR ACCELERATED NEURAL-NETWORK CONVOLUTION AND TRAINING
Abstract:
An application-specific integrated circuit for an artificial neural network is integrated with a high-bandwidth memory. The neural network includes a systolic array of interconnected processing elements, including upstream processing elements and downstream processing elements. Each processing element includes input/output port pairs for concurrent forward and back propagation. The processing elements can be used for convolution, in which case the input/output port pairs can support the fast and efficient scanning of kernels relative to activations.
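As an illustration of the systolic dataflow the abstract refers to, the following is a minimal software sketch of a one-dimensional, weight-stationary systolic convolution. It is a textbook arrangement and not the claimed circuit: it models only the forward path, and the function name `systolic_conv1d`, the two-stage activation delay, and the single partial-sum register per processing element are assumptions made for this example. Each processing element holds one kernel weight; activations and partial sums both flow from upstream to downstream elements, with activations moving at half the speed of the partial sums so that each partial sum meets every kernel weight exactly once.

```python
def systolic_conv1d(activations, kernel):
    """Simulate a 1-D weight-stationary systolic array cycle by cycle.

    Hypothetical sketch: PE k holds kernel[k]. Activations pass through
    two delay registers per PE, partial sums through one, so the last PE
    emits y[n] = sum_k kernel[k] * activations[n - k] (full convolution).
    """
    num_pes = len(kernel)
    num_inputs = len(activations)

    # Per-PE state: a two-stage activation delay line and one partial-sum register.
    x_delay = [[0, 0] for _ in range(num_pes)]
    y_reg = [0] * num_pes

    outputs = []
    total_cycles = num_inputs + 2 * num_pes - 2
    for t in range(total_cycles):
        # Values arriving at each PE this cycle, read from the previous state.
        x_in = [activations[t] if t < num_inputs else 0]
        y_in = [0]
        for k in range(1, num_pes):
            x_in.append(x_delay[k - 1][1])  # upstream activation, two cycles old
            y_in.append(y_reg[k - 1])       # upstream partial sum, one cycle old

        # Every PE performs one multiply-accumulate per cycle.
        y_out = [y_in[k] + kernel[k] * x_in[k] for k in range(num_pes)]

        # Clock edge: shift the delay lines and latch the partial sums.
        for k in range(num_pes):
            x_delay[k] = [x_in[k], x_delay[k][0]]
            y_reg[k] = y_out[k]

        # The first valid result leaves the last PE at cycle num_pes - 1.
        if t >= num_pes - 1:
            outputs.append(y_out[-1])
    return outputs
```

For example, under these assumptions `systolic_conv1d([1, 2, 3], [1, 10])` yields the full convolution of the two sequences, one output per cycle once the pipeline has filled.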