High performance portable convolutional neural network library on GP-GPUs
Abstract:
Systems and methods are disclosed for speeding up a computer having a graphics processing unit (GPU) and a general purpose processor (GP-GPU) by decoupling a convolution process for a first matrix into a row part and a column part; expanding the row part into a second matrix; performing matrix multiplication using the second matrix and a filter matrix; and performing reduction on an output matrix.
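The four steps named in the abstract can be sketched in plain Python as follows. This is only one possible reading of the abstract, not the patented implementation: it assumes a single-channel 2D input, a "valid" (no padding) cross-correlation as commonly used in CNNs, and that the "row part" means sliding windows taken along each input row. The function names `expand_rows`, `matmul`, `conv2d_decoupled` and `conv2d_direct` are hypothetical and chosen for illustration; on a GPU the `matmul` step would be handed to an optimized GEMM routine.

```python
def expand_rows(x, kw):
    """Expand the row part: unroll every input row into sliding windows of width kw.

    The result (the "second matrix") has one row per (input row, window start) pair.
    """
    return [row[j:j + kw] for row in x for j in range(len(row) - kw + 1)]


def matmul(a, b):
    """Plain dense matrix multiplication of a (m x n) and b (n x p)."""
    return [[sum(u * v for u, v in zip(a_row, b_col)) for b_col in zip(*b)]
            for a_row in a]


def conv2d_decoupled(x, k):
    """Assumed reading of the abstract: expand rows, multiply, then reduce."""
    h, w = len(x), len(x[0])
    kh, kw = len(k), len(k[0])
    out_h, out_w = h - kh + 1, w - kw + 1

    expanded = expand_rows(x, kw)            # shape: (h * out_w) x kw
    filt = [list(col) for col in zip(*k)]    # shape: kw x kh, column r = filter row r
    partial = matmul(expanded, filt)         # shape: (h * out_w) x kh

    # Reduction over the column part: output (i, j) sums the partial result of
    # filter row r taken from input row i + r.
    return [[sum(partial[(i + r) * out_w + j][r] for r in range(kh))
             for j in range(out_w)]
            for i in range(out_h)]


def conv2d_direct(x, k):
    """Reference implementation used only to check the decoupled version."""
    kh, kw = len(k), len(k[0])
    return [[sum(x[i + r][j + c] * k[r][c] for r in range(kh) for c in range(kw))
             for j in range(len(x[0]) - kw + 1)]
            for i in range(len(x) - kh + 1)]


x = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
k = [[1, 0],
     [0, -1]]
result = conv2d_decoupled(x, k)
```

Under these assumptions the expanded matrix grows by a factor of roughly the filter width only, whereas a full two-dimensional unrolling would grow by the filter width times the filter height. That smaller intermediate is the plausible motivation for splitting the convolution into a row part and a column part.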