SMART THREADING IN MATRIX MULTIPLICATION
    Published patent application

    Publication No.: US20240320293A1

    Publication date: 2024-09-26

    Application No.: US18125454

    Filing date: 2023-03-23

    CPC classification number: G06F17/16 G06F9/4881

    Abstract: Techniques are described in which an estimated optimal thread quantity for matrix multiplication is determined and implemented based on dimensions of the input matrices being multiplied and one or more kernel parameters that vary based on processor architecture. An efficient factorization of the estimated optimal thread quantity is based on a number of blocks along a first dimension of a first input matrix, and a number of blocks along a dimension n of a second input matrix B, with both numbers being based on the kernel parameters. In certain embodiments, a command processor of a parallel processor determines an estimated optimal thread quantity for performing a matrix multiplication command responsive to receiving the matrix multiplication command, and then schedules that estimated optimal thread quantity of kernel threads to execute the matrix multiplication command in parallel.
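
As a rough illustration of the idea in the abstract, the sketch below derives block counts from the matrix dimensions and two kernel parameters, caps the thread quantity at the available hardware threads, and then picks a two-way factorization of that quantity. The abstract gives no formulas, so everything here is an assumption: the names `block_counts`, `estimate_threads`, and `factorize`, the block sizes `mc` and `nc`, the cap `max_threads`, and the cost function used to choose the factorization are all hypothetical and are not taken from the patent.

```python
from math import ceil


def block_counts(m, n, mc, nc):
    """Number of blocks along dimension m of A and dimension n of B.

    mc and nc are hypothetical kernel block sizes that would vary
    with processor architecture.
    """
    return ceil(m / mc), ceil(n / nc)


def estimate_threads(m, n, mc, nc, max_threads):
    """Assumed heuristic: one thread per block, capped at max_threads."""
    bm, bn = block_counts(m, n, mc, nc)
    return max(1, min(max_threads, bm * bn))


def factorize(threads, bm, bn):
    """Pick (tm, tn) with tm * tn == threads.

    Assumed cost: the largest number of blocks any single thread
    must process. Ties keep the first pair found (smallest tm).
    """
    best = None
    for tm in range(1, threads + 1):
        if threads % tm:
            continue  # tm must divide the thread quantity exactly
        tn = threads // tm
        cost = ceil(bm / tm) * ceil(bn / tn)
        if best is None or cost < best[0]:
            best = (cost, tm, tn)
    return best[1], best[2]


if __name__ == "__main__":
    m, n = 1024, 256           # rows of A, columns of B
    mc, nc = 128, 64           # hypothetical kernel block sizes
    bm, bn = block_counts(m, n, mc, nc)
    threads = estimate_threads(m, n, mc, nc, max_threads=16)
    print(bm, bn, threads, factorize(threads, bm, bn))
```

With these example values, A is split into 8 blocks along m and B into 4 blocks along n, so 16 threads arranged as a 4 x 4 grid leave each thread with two blocks. A small problem (for example m = n = 100 with the same block sizes) yields only 2 blocks, so the sketch would request 2 threads rather than oversubscribing.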
