-
Publication No.: US20210390367A1
Publication date: 2021-12-16
Application No.: US16901542
Filing date: 2020-06-15
Applicant: Arm Limited
Inventor: Zhi-Gang Liu, Paul Nicholas Whatmough, Matthew Mattina
Abstract: The present disclosure advantageously provides a matrix expansion unit that includes an input data selector, a first register set, a second register set, and an output data selector. The input data selector is configured to receive first matrix data in a columnwise format. The first register set is coupled to the input data selector, and includes a plurality of data selectors and a plurality of registers arranged in a first shift loop. The second register set is coupled to the input data selector, and includes a plurality of data selectors and a plurality of registers arranged in a second shift loop. The output data selector is coupled to the first register set and the second register set, and is configured to output second matrix data in a rowwise format.
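The columnwise-in, rowwise-out behavior can be illustrated with a small behavioral model. This is only a sketch under assumptions and not the claimed circuit: the names `RegisterSet`, `load_columnwise`, `drain_rowwise`, and `expand` are hypothetical, each shift loop is modelled as a rotating `collections.deque`, and the two register sets are simply used alternately (in hardware, one set would presumably be loaded while the other is drained).

```python
from collections import deque


class RegisterSet:
    """Hypothetical model: a ring of registers (the 'shift loop') as a deque."""

    def __init__(self):
        self.ring = deque()
        self.rows = 0
        self.cols = 0

    def load_columnwise(self, columns):
        # Store the matrix one column after another (column-major order).
        self.cols = len(columns)
        self.rows = len(columns[0])
        self.ring = deque(x for col in columns for x in col)

    def drain_rowwise(self):
        # Tap the head register, then rotate by one column height so the
        # next element of the same row reaches the head. After a full row
        # the ring is back at its start, so one extra step selects the next row.
        out = []
        for _ in range(self.rows):
            row = []
            for _ in range(self.cols):
                row.append(self.ring[0])
                self.ring.rotate(-self.rows)
            out.append(row)
            self.ring.rotate(-1)
        return out


def expand(matrices_columnwise):
    """Alternate between two register sets, one matrix at a time."""
    sets = [RegisterSet(), RegisterSet()]
    results = []
    for i, columns in enumerate(matrices_columnwise):
        active = sets[i % 2]              # stands in for the input data selector
        active.load_columnwise(columns)
        results.append(active.drain_rowwise())  # stands in for the output data selector
    return results
```

For example, passing the two columns `[1, 2, 3]` and `[4, 5, 6]` yields the three rows `[1, 4]`, `[2, 5]`, `[3, 6]`.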
-
Publication No.: US20210374508A1
Publication date: 2021-12-02
Application No.: US16885704
Filing date: 2020-05-28
Applicant: Arm Limited
Inventor: Paul Nicholas Whatmough, Zhi-Gang Liu, Matthew Mattina
Abstract: The present disclosure advantageously provides a pipelined accumulator that includes a data selector configured to receive a sequence of operands to be summed, an input register coupled to the data selector, an output register, coupled to the data selector, configured to store a sequence of partial sums and output a final sum, and a multi-stage add module coupled to the input register and the output register. The multi-stage add module is configured to store a sequence of partial sums and a final sum in a redundant format, and perform back-to-back accumulation into the output register.
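One common redundant number format is carry-save, in which a running total is held as a separate sum word and carry word, and the carries are only resolved once at the end. The abstract does not say which redundant format is used, so the sketch below is an assumption for illustration only; `carry_save_add` and `accumulate` are hypothetical names, and the operands are assumed to be non-negative integers.

```python
def carry_save_add(s, c, x):
    """3:2 compression: fold operand x into the (sum, carry) pair.

    No carry propagates across bit positions here, which is why this
    step can be performed back to back in hardware.
    """
    new_s = s ^ c ^ x                              # bitwise sum without carries
    new_c = ((s & c) | (s & x) | (c & x)) << 1     # majority bits, shifted left
    return new_s, new_c


def accumulate(operands):
    """Sum a sequence of non-negative integers using carry-save partial sums."""
    s, c = 0, 0
    for x in operands:
        s, c = carry_save_add(s, c, x)   # partial sum stays in redundant form
    return s + c                         # single carry-propagating add at the end
```

The invariant is that `s + c` always equals the sum of the operands consumed so far, so the final addition recovers the ordinary binary result.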
-
Publication No.: US11120101B2
Publication date: 2021-09-14
Application No.: US16585265
Filing date: 2019-09-27
Applicant: Arm Limited
Inventor: Zhi-Gang Liu, Matthew Mattina, Paul Nicholas Whatmough
Abstract: The present disclosure advantageously provides a system and method for efficiently multiplying matrices with elements that have a value of 0. A bitmap is generated for each matrix. Each bitmap includes a bit position for each matrix element. The value of each bit is set to 0 when the value of the corresponding matrix element is 0, and to 1 when the value of the corresponding matrix element is not 0. Each matrix is compressed into a compressed matrix, which will have fewer elements with a value of 0 than the original matrix. Each bitmap is then adjusted based on the corresponding compressed matrix. The compressed matrices are then multiplied to generate an output matrix. For each element i,j in the output matrix, a dot product of the ith row of the first compressed matrix and the jth column of the second compressed matrix is calculated based on the bitmaps.
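The idea of a bitmap-guided dot product can be sketched in a few lines. This is a simplified illustration rather than the claimed method: it assumes every row and column is compressed all the way down to its non-zero values (the abstract only requires fewer zeros), it keeps one bitmap per row or column instead of one per matrix, and the names `compress`, `sparse_dot`, and `sparse_matmul` are hypothetical.

```python
def compress(vec):
    """Return (bitmap, values): bit k of bitmap is 1 iff vec[k] != 0."""
    bitmap = 0
    values = []
    for k, x in enumerate(vec):
        if x != 0:
            bitmap |= 1 << k
            values.append(x)
    return bitmap, values


def sparse_dot(a, b):
    """Dot product of two compressed vectors, visiting only shared non-zeros."""
    bm_a, val_a = a
    bm_b, val_b = b
    common = bm_a & bm_b                  # positions where both are non-zero
    total = 0
    while common:
        low = common & -common            # lowest set bit still pending
        below = low - 1                   # mask of all lower positions
        ia = bin(bm_a & below).count("1") # index into a's compressed values
        ib = bin(bm_b & below).count("1") # index into b's compressed values
        total += val_a[ia] * val_b[ib]
        common ^= low
    return total


def sparse_matmul(A, B):
    """Multiply A (list of rows) by B (list of rows) via compressed vectors."""
    rows = [compress(r) for r in A]
    cols = [compress(c) for c in zip(*B)]
    return [[sparse_dot(r, c) for c in cols] for r in rows]
```

Only positions where both bitmaps hold a 1 contribute a multiplication, so products involving a zero element are skipped entirely.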
-
-