-
1.
Publication No.: US20240004954A1
Publication date: 2024-01-04
Application No.: US17978439
Filing date: 2022-11-01
Inventors: Zhaoyang DU, Yijin GUAN, Dimin NIU, Hongzhong ZHENG
CPC classification: G06F17/16, G06F7/32, G06F7/4876
Abstract: This application describes a hardware acceleration design for improving SpGEMM efficiency. An exemplary method may include: obtaining a first sparse matrix and a second sparse matrix for performing SpGEMM; allocating a pair of buffers respectively pointed to by a first pointer and a second pointer; for each first row in the first sparse matrix that comprises a plurality of non-zero elements, identifying a plurality of second rows in the second sparse matrix that correspond to the plurality of non-zero elements; obtaining a plurality of intermediate lists computed based on each of the plurality of non-zero elements in the first row and one of the plurality of second rows that corresponds to the non-zero element; performing accumulation of the intermediate lists using the pair of buffers; and migrating the one final merged list to a system memory as a row of an output matrix of the SpGEMM.
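The row-wise accumulation described in this abstract can be sketched in software as follows. This is a minimal illustration, not the patented hardware design: the dictionary-based sparse format, the function names `merge_into`, `spgemm_row` and `spgemm`, and the choice to swap the two buffer pointers after every merge are all assumptions made for the example.

```python
def merge_into(out, acc, new):
    """Merge two column-sorted (col, val) lists into `out`, summing equal columns."""
    out.clear()
    i = j = 0
    while i < len(acc) and j < len(new):
        (ca, va), (cb, vb) = acc[i], new[j]
        if ca == cb:
            out.append((ca, va + vb))
            i += 1
            j += 1
        elif ca < cb:
            out.append((ca, va))
            i += 1
        else:
            out.append((cb, vb))
            j += 1
    out.extend(acc[i:])
    out.extend(new[j:])


def spgemm_row(a_row, b_rows):
    """Compute one output row.

    a_row:  list of (k, a_ik) non-zeros of one row of the first matrix.
    b_rows: dict mapping row index k to a column-sorted list of (j, b_kj).
    """
    buffers = [[], []]   # the pair of buffers (assumed to be plain lists here)
    src, dst = 0, 1      # stand-ins for the first and second pointer
    for k, a_val in a_row:
        # Intermediate list: one non-zero of A scaled against the matching row of B.
        intermediate = [(j, a_val * b_val) for j, b_val in b_rows.get(k, [])]
        # Accumulate: read from the source buffer, write into the other one.
        merge_into(buffers[dst], buffers[src], intermediate)
        src, dst = dst, src   # swap roles instead of copying data
    # The source buffer now holds the final merged list for this row.
    return list(buffers[src])


def spgemm(a, b):
    """Row-by-row SpGEMM; each returned list is one row of the output matrix."""
    return {i: spgemm_row(row, b) for i, row in a.items()}
```

Calling `spgemm(a, b)` with both matrices given as `{row: [(col, val), ...]}` dictionaries returns the output matrix in the same layout. Swapping the two indices after each merge avoids copying the accumulated list back, which is one plausible reason for using a buffer pair.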
-
2.
Publication No.: US20240184848A1
Publication date: 2024-06-06
Application No.: US18309826
Filing date: 2023-04-30
Inventors: Zhaoyang DU, Yijin GUAN, Dimin NIU, Tianchan GUAN, Hongzhong ZHENG
CPC classification: G06F17/16, G06F7/4876
Abstract: This application describes an accelerator, a computer system, and a method for memory allocation in sparse matrix-matrix multiplications (spGEMM). An example method may include: computing a number of floating point multiplication operations (FLOP) to be performed to generate each row in the output matrix; determining an estimated compression ratio based on a plurality of first rows sampled from the first sparse matrix and a plurality of corresponding second rows from the second sparse matrix; determining an estimated number of non-zero data (NNZ) in each row of the to-be-generated output matrix; constructing a plurality of hash tables for the rows in the to-be-generated output matrix based on the estimated NNZ corresponding to each row; performing symbolic computations between the first and second sparse matrices by using the hash tables to determine actual NNZs in the to-be-generated output matrix; and allocating a memory space for the output matrix based on the actual NNZs.
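The steps listed in this abstract can be sketched as a small pipeline. The sketch below is an assumption-laden illustration rather than the claimed method: it takes the compression ratio to be FLOP divided by NNZ over the sampled rows, sizes each hash table at the next power of two above twice the estimated NNZ, uses linear probing, and grows a table if the estimate turns out too low. All function names are hypothetical, and matrices are given as sparsity patterns only (lists of column indices per row).

```python
import math
import random

EMPTY = -1


def flop_per_row(a_cols, b_cols):
    """Multiplications needed for each output row (one per pair a_ik, b_kj)."""
    return [sum(len(b_cols[k]) for k in row) for row in a_cols]


def estimate_compression_ratio(a_cols, b_cols, flops, sample_size, seed=0):
    """Assumed definition: sampled FLOP / sampled NNZ of the output rows."""
    rng = random.Random(seed)
    rows = rng.sample(range(len(a_cols)), min(sample_size, len(a_cols)))
    sampled_flop = sum(flops[i] for i in rows)
    sampled_nnz = sum(len({j for k in a_cols[i] for j in b_cols[k]}) for i in rows)
    return sampled_flop / sampled_nnz if sampled_nnz else 1.0


def estimate_row_nnz(flops, ratio):
    return [math.ceil(f / ratio) for f in flops]


def _insert(table, col):
    """Linear-probing insert; returns True if `col` was not present before."""
    mask = len(table) - 1
    slot = col & mask
    while table[slot] != EMPTY and table[slot] != col:
        slot = (slot + 1) & mask
    if table[slot] == EMPTY:
        table[slot] = col
        return True
    return False


def symbolic_row_nnz(row, b_cols, est_nnz):
    """Count distinct output columns of one row using a hash table."""
    capacity = 1
    while capacity < max(1, 2 * est_nnz):
        capacity *= 2
    table = [EMPTY] * capacity
    count = 0
    for k in row:
        for j in b_cols[k]:
            if count * 2 >= len(table):   # estimate too low: grow and rehash
                old = [c for c in table if c != EMPTY]
                table = [EMPTY] * (2 * len(table))
                for c in old:
                    _insert(table, c)
            if _insert(table, j):
                count += 1
    return count


def allocate_output(actual_nnz):
    """Build a CSR row pointer and allocate index/value storage of exact size."""
    row_ptr = [0]
    for n in actual_nnz:
        row_ptr.append(row_ptr[-1] + n)
    total = row_ptr[-1]
    return row_ptr, [0] * total, [0.0] * total
```

In use, the functions are chained in the order of the abstract: `flop_per_row`, `estimate_compression_ratio`, `estimate_row_nnz`, one `symbolic_row_nnz` call per row, then `allocate_output`. The estimate only affects hash table sizing; the final allocation uses the exact counts from the symbolic pass.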
-
3.
Publication No.: US20240004955A1
Publication date: 2024-01-04
Application No.: US17984230
Filing date: 2022-11-09
Inventors: Zhaoyang DU, Yijin GUAN, Dimin NIU, Hongzhong ZHENG
CPC classification: G06F17/16, G06F7/4876, G06F9/5016
Abstract: This application describes an accelerator, a computer system, and a method for memory optimization in sparse matrix-matrix multiplications (spGEMM). The memory optimization includes accurate memory pre-allocation for a to-be-generated output matrix of spGEMM between two sparse matrices. An exemplary method may include: sampling a plurality of first rows in a first sparse matrix; identifying, based on indices of non-zero data in the plurality of first rows, a plurality of second rows in a second sparse matrix; performing symbolic multiplication operations between the non-zero data in the plurality of first and second rows; determining an estimated compression ratio of the output matrix; determining an estimated mean row size for each row in the output matrix based on the estimated compression ratio; and allocating, according to the estimated mean row size and a total number of rows of the output matrix, a memory space in a hardware memory.
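A sampling-based pre-allocation of this kind might look like the sketch below. Several details are assumptions, since the abstract does not spell them out: the compression ratio is taken as FLOP divided by NNZ over the sampled rows, the mean row size is taken as the mean FLOP per row divided by that ratio, and the "hardware memory" is modeled with Python `array` objects. The function names are hypothetical, and matrices are given as sparsity patterns (lists of column indices per row).

```python
import math
import random
from array import array


def estimate_mean_row_size(a_cols, b_cols, sample_size, seed=0):
    """Return (estimated compression ratio, estimated mean output row size)."""
    rng = random.Random(seed)
    n_rows = len(a_cols)
    sampled = rng.sample(range(n_rows), min(sample_size, n_rows))

    sampled_flop = 0
    sampled_nnz = 0
    for i in sampled:
        # Second rows are selected by the column indices of the sampled first row.
        second_rows = [b_cols[k] for k in a_cols[i]]
        # Symbolic multiplication: only the sparsity pattern is combined.
        sampled_flop += sum(len(r) for r in second_rows)
        sampled_nnz += len(set().union(*second_rows))

    ratio = sampled_flop / sampled_nnz if sampled_nnz else 1.0

    # Assumption: mean row size = mean FLOP per row / compression ratio.
    total_flop = sum(len(b_cols[k]) for row in a_cols for k in row)
    mean_row_size = math.ceil(total_flop / n_rows / ratio) if n_rows else 0
    return ratio, mean_row_size


def preallocate(mean_row_size, n_rows):
    """Reserve column-index and value storage for the whole output matrix."""
    capacity = mean_row_size * n_rows
    col_idx = array("i", [0]) * capacity
    values = array("d", [0.0]) * capacity
    return col_idx, values
```

Because the allocation rests on an estimate, a real implementation would also need a fallback for rows that exceed the reserved space; that part is outside what the abstract describes and is not shown here.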
-