Abstract:
A vector-matrix multiplier unit fully utilizes a 128x128b data path for operand sizes from 8 to 128b and operand types including signed, unsigned, or complex, and fixed-point, floating-point, polynomial, or Galois-field, while maintaining full internal precision. The present disclosure provides a system and method for improving the performance of general-purpose processors by implementing a functional unit that computes the product of a matrix operand with a vector operand, producing a vector result. The functional unit fully utilizes the resources of a 128b by 128b multiplier regardless of the operand size, because the number of elements in the matrix and vector operands increases as the operand size is reduced. The unit performs both fixed-point and floating-point multiplications and additions with the highest possible intermediate accuracy using modest resources.
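The utilization claim can be checked with a little arithmetic. The sketch below is illustrative only and rests on assumptions not stated in the abstract: operand sizes are powers of two (8, 16, 32, 64, 128 bits), the vector operand fills one 128-bit word, and the matrix operand is square with one row per vector element. The function names are hypothetical. Under those assumptions, a vector of 128/w elements times a (128/w) by (128/w) matrix needs (128/w)^2 partial products of w by w bits each, which is 128 x 128 multiplier bits at every operand size.

```python
DATA_PATH_BITS = 128
# Assumed operand sizes: powers of two between 8 and 128 bits.
OPERAND_SIZES = (8, 16, 32, 64, 128)


def elements_per_vector(operand_bits):
    """Number of operand_bits-wide elements that fit in one 128-bit vector."""
    if operand_bits not in OPERAND_SIZES:
        raise ValueError(f"unsupported operand size: {operand_bits}")
    return DATA_PATH_BITS // operand_bits


def multiplier_bits_used(operand_bits):
    """Multiplier-array bits consumed by one vector-matrix product.

    Assumes a square matrix with n = 128 / operand_bits rows and columns,
    so there are n * n element products of operand_bits x operand_bits each.
    """
    n = elements_per_vector(operand_bits)
    return n * n * operand_bits * operand_bits


def vector_matrix_multiply(vector, matrix):
    """Exact integer product vector * matrix (row vector times matrix).

    Python integers are unbounded, so every sum of products is kept at
    full precision, mirroring the "full internal precision" idea for the
    fixed-point case only.
    """
    n = len(vector)
    if any(len(row) != n for row in matrix) or len(matrix) != n:
        raise ValueError("matrix must be n x n for an n-element vector")
    return [sum(vector[i] * matrix[i][j] for i in range(n)) for j in range(n)]


if __name__ == "__main__":
    for w in OPERAND_SIZES:
        print(w, elements_per_vector(w), multiplier_bits_used(w))
```

Halving the operand size doubles the element count per vector and quadruples the number of element products, while each product needs a quarter of the multiplier area, so the total stays constant.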