Invention Application
US20170046307A1 RUNTIME OF CUBLAS MATRIX MULTIPLICATION ON GPU (Status: In force)

Abstract:
Methods for improving matrix multiplication runtimes are provided. A method includes determining, by a GPU, optimal partitions for matrix-by-matrix multiplication of two factor matrices having sizes known a priori. The determining step includes performing offline a plurality of matrix-by-matrix multiplication executions, each for a respective different combination of two-way partitions across a plurality of partition sizes. The determining step further includes determining offline a respective performance number for each of the executions based on runtime. The determining step also includes recursively repeating offline said performing and determining steps until the respective performance number ceases to improve for the best-performing combinations of the two-way partitions, and saving the best-performing combinations of the two-way partitions as the optimal partitions. The method further includes performing online, by the GPU, the matrix-by-matrix multiplication of the two factor matrices using calls for a given one of the best-performing combinations of the two-way partitions.
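
The offline search described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed method itself, and it makes several assumptions that the abstract does not state: only the rows of the first factor matrix are partitioned, the refinement is greedy (one additional two-way split per round), and a plain-Python `matmul` stands in for a single GPU GEMM call such as a cuBLAS call. The names `matmul`, `partitioned_matmul`, `measure`, and `tune_partitions` are hypothetical.

```python
import time


def matmul(a, b):
    """Plain-Python stand-in for one GEMM call (assumption: a real system would call the GPU)."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def partitioned_matmul(a, b, cuts, gemm=matmul):
    """Compute a @ b with the rows of `a` split at `cuts`, issuing one gemm call per slice."""
    bounds = [0, *cuts, len(a)]
    out = []
    for lo, hi in zip(bounds, bounds[1:]):
        out.extend(gemm(a[lo:hi], b))
    return out


def measure(a, b, cuts, repeats=3):
    """Performance number based on runtime: best wall-clock time over a few runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        partitioned_matmul(a, b, cuts)
        best = min(best, time.perf_counter() - start)
    return best


def tune_partitions(a, b, sizes, cost=measure):
    """Offline search: keep applying the best two-way split of an existing slice
    until the performance number stops improving (greedy variant, an assumption)."""
    cuts = ()
    best = cost(a, b, cuts)
    while True:
        candidates = []
        bounds = [0, *cuts, len(a)]
        for lo, hi in zip(bounds, bounds[1:]):
            for size in sizes:
                if lo + size < hi:  # the split must leave two non-empty parts
                    trial = tuple(sorted(cuts + (lo + size,)))
                    candidates.append((cost(a, b, trial), trial))
        if not candidates:
            break
        trial_cost, trial = min(candidates)
        if trial_cost >= best:  # no further improvement: stop refining
            break
        best, cuts = trial_cost, trial
    return cuts, best
```

In this sketch, the offline phase would call `tune_partitions(a, b, sizes)` once for matrices of the known sizes and save the returned cut points; the online phase would then call `partitioned_matmul(a, b, saved_cuts)` for each multiplication. The `cost` parameter is injectable so that a different performance measure can be substituted without changing the search.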