DISTRIBUTING MATRIX MULTIPLICATION PROCESSING AMONG PROCESSING NODES

    Publication number: US20210374208A1

    Publication date: 2021-12-02

    Application number: US16886189

    Filing date: 2020-05-28

    Inventor: Aaron M. Collier

    Abstract: Based on a predetermined number of available processor sockets, a plurality of candidate matrix decompositions are identified, which correspond to a multiplication of matrices. Based on a first comparative relationship of a variation of first sizes of the plurality of candidate matrix decompositions along a first dimension and a second comparative relationship of a variation of second sizes of the plurality of candidate matrix decomposition sizes along a second dimension, a given candidate matrix decomposition is selected. Processing of the multiplication among the processor sockets is distributed based on the given candidate matrix decomposition.
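The selection step in this abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not define the "comparative relationship", so the sketch assumes a simple stand-in criterion (pick the socket grid whose sub-blocks are closest to square). The function names `candidate_decompositions`, `select_decomposition`, and `tiles` are hypothetical and chosen only for illustration.

```python
from math import ceil


def candidate_decompositions(num_sockets):
    """Return every (p_rows, p_cols) grid with p_rows * p_cols == num_sockets."""
    return [
        (p, num_sockets // p)
        for p in range(1, num_sockets + 1)
        if num_sockets % p == 0
    ]


def select_decomposition(m, n, num_sockets):
    """Pick one candidate grid for an m x n result matrix.

    ASSUMPTION: the criterion below (smallest difference between the block
    size along the first and second dimensions) is a stand-in; the abstract
    does not state the actual comparison.
    """
    def imbalance(candidate):
        p_rows, p_cols = candidate
        block_m = ceil(m / p_rows)  # block size along the first dimension
        block_n = ceil(n / p_cols)  # block size along the second dimension
        return abs(block_m - block_n)

    return min(candidate_decompositions(num_sockets), key=imbalance)


def tiles(m, n, decomposition):
    """Split the m x n index space into one tile per socket.

    Each tile is (row_start, row_end, col_start, col_end), end-exclusive.
    """
    p_rows, p_cols = decomposition
    row_bounds = [i * m // p_rows for i in range(p_rows + 1)]
    col_bounds = [j * n // p_cols for j in range(p_cols + 1)]
    return [
        (row_bounds[i], row_bounds[i + 1], col_bounds[j], col_bounds[j + 1])
        for i in range(p_rows)
        for j in range(p_cols)
    ]


if __name__ == "__main__":
    grid = select_decomposition(4000, 1000, 4)
    print(grid)
    print(tiles(4000, 1000, grid))
```

Under the assumed criterion, a tall 4000 x 1000 result on four sockets is split only along the first dimension, because that yields square 1000 x 1000 blocks. A different criterion would change which candidate wins but not the overall shape of the procedure: enumerate, compare, then distribute.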

    Distributing matrix multiplication processing among processing nodes

    Publication number: US12061666B2

    Publication date: 2024-08-13

    Application number: US18189625

    Filing date: 2023-03-24

    Inventor: Aaron M. Collier

    CPC classification number: G06F17/16 G06F9/5066 G06F9/544

    Abstract: Based on a predetermined number of available processor sockets, a plurality of candidate matrix decompositions are identified, which correspond to a multiplication of matrices. Based on a first comparative relationship of a variation of first sizes of the plurality of candidate matrix decompositions along a first dimension and a second comparative relationship of a variation of second sizes of the plurality of candidate matrix decomposition sizes along a second dimension, a given candidate matrix decomposition is selected. Processing of the multiplication among the processor sockets is distributed based on the given candidate matrix decomposition.

    Assigning processing threads for matrix-matrix multiplication

    Publication number: US11989257B2

    Publication date: 2024-05-21

    Application number: US17083373

    Filing date: 2020-10-29

    Inventor: Aaron M. Collier

    CPC classification number: G06F17/16

    Abstract: An apparatus includes a processor and a memory to store instructions. The instructions, when executed by the processor, cause the processor to perform threading of a first matrix along a first dimension of the first matrix and a second dimension of the matrix. The threading represents block sizes of the first matrix to assign to process threads of a multiplication algorithm to determine a third matrix that represents a product of the first matrix and a second matrix. The block sizes include a first block size along the first dimension and a second block size along the second dimension. The second matrix shares the second dimension with the first matrix. The instructions, when executed by the processor, cause the processor to provide data to the multiplication algorithm, which represents the first block size and the second block size.
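As a rough illustration of the idea in this abstract, the sketch below derives a block size along each of the two dimensions of the first matrix and passes both to a blocked multiplication routine. It is a sketch under stated assumptions, not the claimed apparatus: the split into `threads_m` and `threads_k` threads, the even division of each dimension, and the names `block_sizes` and `blocked_matmul` are all hypothetical. The loop runs sequentially; each (row block, inner block) pair stands for the unit of work one thread would receive.

```python
from math import ceil


def block_sizes(m, k, threads_m, threads_k):
    """Block sizes of an m x k first matrix along its two dimensions.

    ASSUMPTION: each dimension is divided evenly among the threads
    assigned to it; the abstract does not say how sizes are chosen.
    """
    return ceil(m / threads_m), ceil(k / threads_k)


def blocked_matmul(a, b, block_m, block_k):
    """Compute c = a * b, walking a in block_m x block_k blocks.

    a is m x k and b is k x n (b shares the k dimension with a).
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i0 in range(0, m, block_m):
        for k0 in range(0, k, block_k):
            # One (i0, k0) block is one thread's unit of work. Blocks that
            # share i0 update the same rows of c, so a real multithreaded
            # version would need a reduction or synchronization here.
            for i in range(i0, min(i0 + block_m, m)):
                for kk in range(k0, min(k0 + block_k, k)):
                    a_ik = a[i][kk]
                    for j in range(n):
                        c[i][j] += a_ik * b[kk][j]
    return c


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    bm, bk = block_sizes(len(a), len(b), threads_m=2, threads_k=2)
    print(blocked_matmul(a, b, bm, bk))
```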

    DISTRIBUTING MATRIX MULTIPLICATION PROCESSING AMONG PROCESSING NODES

    Publication number: US20230281271A1

    Publication date: 2023-09-07

    Application number: US18189625

    Filing date: 2023-03-24

    Inventor: Aaron M. Collier

    CPC classification number: G06F17/16 G06F9/544 G06F9/5066

    Abstract: Based on a predetermined number of available processor sockets, a plurality of candidate matrix decompositions are identified, which correspond to a multiplication of matrices. Based on a first comparative relationship of a variation of first sizes of the plurality of candidate matrix decompositions along a first dimension and a second comparative relationship of a variation of second sizes of the plurality of candidate matrix decomposition sizes along a second dimension, a given candidate matrix decomposition is selected. Processing of the multiplication among the processor sockets is distributed based on the given candidate matrix decomposition.

    Distributing matrix multiplication processing among processing nodes

    Publication number: US11640443B2

    Publication date: 2023-05-02

    Application number: US16886189

    Filing date: 2020-05-28

    Inventor: Aaron M. Collier

    Abstract: Based on a predetermined number of available processor sockets, a plurality of candidate matrix decompositions are identified, which correspond to a multiplication of matrices. Based on a first comparative relationship of a variation of first sizes of the plurality of candidate matrix decompositions along a first dimension and a second comparative relationship of a variation of second sizes of the plurality of candidate matrix decomposition sizes along a second dimension, a given candidate matrix decomposition is selected. Processing of the multiplication among the processor sockets is distributed based on the given candidate matrix decomposition.

    DISTRIBUTING MATRIX MULTIPLICATION PROCESSING AMONG PROCESSING NODES

    Publication number: US20240320300A1

    Publication date: 2024-09-26

    Application number: US18734123

    Filing date: 2024-06-05

    Inventor: Aaron M. Collier

    CPC classification number: G06F17/16 G06F9/5066 G06F9/544

    Abstract: Based on a predetermined number of available processor sockets, a plurality of candidate matrix decompositions are identified, which correspond to a multiplication of matrices. Based on a first comparative relationship of a variation of first sizes of the plurality of candidate matrix decompositions along a first dimension and a second comparative relationship of a variation of second sizes of the plurality of candidate matrix decomposition sizes along a second dimension, a given candidate matrix decomposition is selected. Processing of the multiplication among the processor sockets is distributed based on the given candidate matrix decomposition.

    ASSIGNING PROCESSING THREADS FOR MATRIX-MATRIX MULTIPLICATION

    Publication number: US20220138281A1

    Publication date: 2022-05-05

    Application number: US17083373

    Filing date: 2020-10-29

    Inventor: Aaron M. Collier

    Abstract: An apparatus includes a processor and a memory to store instructions. The instructions, when executed by the processor, cause the processor to perform threading of a first matrix along a first dimension of the first matrix and a second dimension of the matrix. The threading represents block sizes of the first matrix to assign to process threads of a multiplication algorithm to determine a third matrix that represents a product of the first matrix and a second matrix. The block sizes include a first block size along the first dimension and a second block size along the second dimension. The second matrix shares the second dimension with the first matrix. The instructions, when executed by the processor, cause the processor to provide data to the multiplication algorithm, which represents the first block size and the second block size.
