High performance method and system for performing fault tolerant matrix multiplication

    Publication No.: US11080131B2

    Publication date: 2021-08-03

    Application No.: US16010716

    Filing date: 2018-06-18

    Abstract: A computer-implemented method for performing a fault-tolerant numerical linear algebra computation task consisting of calculation steps that include at least classic or fast matrix multiplication. A controller splits the task among P processors, which operate in parallel. Additional processors are assigned according to execution and resource parameters, which are also used to select either a slice-coded recovery algorithm or a posterior-recovery algorithm for executing the task. Pipelined-reduce operations generate error correcting codes that protect the input blocks and outer products from faults. Upon detecting faults in one or more processors, if the slice-coded recovery algorithm has been selected, it is executed to recover lost input blocks and outer products. If the posterior-recovery algorithm has been selected, error correcting codes are used to recover lost input blocks and, after the last step, the outer products that correspond to faulty processors are recalculated. When fast multiplication is needed, l DFS down-recursion steps are performed iteratively by the P processors and by the additional processors r times, for which the error correction codes remain valid; after each r times, the error correction codes are recalculated for the next r times. Each of the P processors then performs a local block multiplication between a pair of blocks while recalculating a new error correction code. The output matrix is then created by iteratively performing d BFS up-recursion decoding steps on the multiplication product r times; the error correction codes are valid only for those r times and are recalculated after each group of r times for the next r times. At the end of all iterations, the blocks to be decoded are obtained together with a code block held by the additional code processors, such that each processor holds a pair of blocks.
Upon detecting faults in one or more processors, a recovery algorithm is executed to recover lost input blocks and multiplication results that correspond to faulty processors, or to correct miscalculations of the processors by recalculation.
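The idea of protecting input blocks with an error correcting code held by an additional processor can be illustrated with a minimal sketch. It assumes the simplest possible code, a plain sum (checksum) of the P blocks, which tolerates one faulty processor; the abstract does not specify the actual code, and the names `encode` and `recover` are hypothetical. The reduce here is sequential, not pipelined.

```python
from functools import reduce

def add_blocks(x, y):
    """Element-wise sum of two equally sized blocks (lists of rows)."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def sub_blocks(x, y):
    """Element-wise difference of two equally sized blocks."""
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def encode(blocks):
    """Checksum block for one additional (code) processor: sum of all blocks.

    Assumption: a sum code, chosen only for illustration.
    """
    return reduce(add_blocks, blocks)

def recover(blocks, code, lost):
    """Rebuild the block of faulty processor `lost` from the code and survivors."""
    survivors = [b for i, b in enumerate(blocks) if i != lost]
    return reduce(sub_blocks, survivors, code)

# Three processors, each holding one 2x2 input block.
blocks = [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]]
code = encode(blocks)

# Processor 1 fails and its block is lost.
damaged = list(blocks)
damaged[1] = None
restored = recover(damaged, code, lost=1)
```

With integer blocks the recovery is exact. Tolerating more than one simultaneous fault would need additional code processors and a stronger code than a single sum.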

    Fast matrix multiplication and linear algebra by alternative basis

    Publication No.: US10387534B2

    Publication date: 2019-08-20

    Application No.: US15823776

    Filing date: 2017-11-28

    Abstract: A computerized method comprising operating one or more hardware processors for receiving a first matrix and a second matrix. The hardware processor(s) are operated for determining a basis transformation, wherein the basis transformation is invertible to an inverted basis transformation. The hardware processor(s) are operated for computing an alternative basis first matrix by multiplying the first matrix by the basis transformation. The hardware processor(s) are operated for computing an alternative basis second matrix by multiplying the second matrix by the basis transformation. The hardware processor(s) are operated for performing a matrix multiplication of the alternative basis first matrix and the alternative basis second matrix, thereby producing an alternative basis multiplied matrix. The hardware processor(s) are operated for computing a multiplied matrix by multiplying the alternative basis multiplied matrix by the inverted basis transformation.
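The transform, multiply, inverse-transform pipeline can be sketched as follows. The abstract does not state which basis transformation or which multiplication algorithm is used in the alternative basis, so this sketch makes a loud assumption: the transformation is a similarity transform X -> T^-1 X T with a hypothetical integer matrix `T`, for which ordinary multiplication in the alternative basis is preserved. It illustrates the structure of the method only, not the patented algorithm or its performance benefit.

```python
def matmul(x, y):
    """Classic matrix product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

# Assumed invertible basis transformation and its inverse (illustrative only).
T = [[1, 1], [0, 1]]
T_INV = [[1, -1], [0, 1]]

def to_alt_basis(m):
    """Apply the assumed basis transformation: T^-1 * m * T."""
    return matmul(matmul(T_INV, m), T)

def from_alt_basis(m):
    """Apply the inverted basis transformation: T * m * T^-1."""
    return matmul(matmul(T, m), T_INV)

def multiply_via_alt_basis(a, b):
    a_alt = to_alt_basis(a)        # alternative basis first matrix
    b_alt = to_alt_basis(b)        # alternative basis second matrix
    c_alt = matmul(a_alt, b_alt)   # alternative basis multiplied matrix
    return from_alt_basis(c_alt)   # multiplied matrix in the original basis

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
result = multiply_via_alt_basis(A, B)
```

Because `T` and its inverse are integer matrices, `result` matches the direct product `matmul(A, B)` exactly.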

    HIGH PERFORMANCE METHOD AND SYSTEM FOR PERFORMING FAULT TOLERANT MATRIX MULTIPLICATION

    Publication No.: US20180365099A1

    Publication date: 2018-12-20

    Application No.: US16010716

    Filing date: 2018-06-18

    Abstract: A computer-implemented method for performing a fault-tolerant numerical linear algebra computation task consisting of calculation steps that include at least classic or fast matrix multiplication. A controller splits the task among P processors, which operate in parallel. Additional processors are assigned according to execution and resource parameters, which are also used to select either a slice-coded recovery algorithm or a posterior-recovery algorithm for executing the task. Pipelined-reduce operations generate error correcting codes that protect the input blocks and outer products from faults. Upon detecting faults in one or more processors, if the slice-coded recovery algorithm has been selected, it is executed to recover lost input blocks and outer products. If the posterior-recovery algorithm has been selected, error correcting codes are used to recover lost input blocks and, after the last step, the outer products that correspond to faulty processors are recalculated. When fast multiplication is needed, l DFS down-recursion steps are performed iteratively by the P processors and by the additional processors r times, for which the error correction codes remain valid; after each r times, the error correction codes are recalculated for the next r times. Each of the P processors then performs a local block multiplication between a pair of blocks while recalculating a new error correction code. The output matrix is then created by iteratively performing d BFS up-recursion decoding steps on the multiplication product r times; the error correction codes are valid only for those r times and are recalculated after each group of r times for the next r times. At the end of all iterations, the blocks to be decoded are obtained together with a code block held by the additional code processors, such that each processor holds a pair of blocks.
Upon detecting faults in one or more processors, a recovery algorithm is executed to recover lost input blocks and multiplication results that correspond to faulty processors, or to correct miscalculations of the processors by recalculation.
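The posterior-recovery path for classic multiplication can be sketched with the outer-product formulation, where C is the sum over k of (column k of A) times (row k of B). The sketch assumes each processor k computes one outer product, that the inputs of faulty processors have already been recovered, and that lost outer products are simply recomputed after the last step. The function name and the `faulty` parameter are hypothetical, and everything runs sequentially in one process.

```python
def outer(col, row):
    """Outer product of a column vector and a row vector."""
    return [[a * b for b in row] for a in col]

def add(x, y):
    """Element-wise sum of two equally sized matrices."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def column(m, k):
    """Column k of matrix m as a list."""
    return [r[k] for r in m]

def multiply_with_posterior_recovery(a, b, faulty):
    """Compute a*b as a sum of outer products; `faulty` is a set of processor ids."""
    n = len(b)  # inner dimension: one outer product per simulated processor
    partial = {}
    for k in range(n):
        if k in faulty:
            continue  # this processor's outer product is lost
        partial[k] = outer(column(a, k), b[k])
    # Posterior recovery: after the last step, recalculate the lost outer products
    # (assumes the corresponding input column and row were already recovered).
    for k in faulty:
        partial[k] = outer(column(a, k), b[k])
    total = [[0] * len(b[0]) for _ in a]
    for k in range(n):
        total = add(total, partial[k])
    return total

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
product = multiply_with_posterior_recovery(A, B, faulty={1})
```

Deferring the recalculation to the end costs extra work only for the faulty processors, whereas a coded approach pays for redundancy up front; which is preferable depends on the execution and resource parameters the abstract mentions.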