COMPUTER PROCESSOR FOR HIGHER PRECISION COMPUTATIONS USING A MIXED-PRECISION DECOMPOSITION OF OPERATIONS
Abstract:
Embodiments detailed herein relate to arithmetic operations on floating-point values. An exemplary processor includes a plurality of cores to execute instructions. Each core of the plurality of cores comprises: a Level-1 (L1) instruction cache to store the instructions and an L1 data cache to store corresponding data; a plurality of vector registers to store a plurality of packed data elements, including single-precision floating-point data elements and reduced-precision floating-point data elements having fewer mantissa bits than the single-precision floating-point data elements and the same number of exponent bits as the single-precision floating-point data elements; and execution circuitry to execute an instruction to generate a dot product with a first pair of the reduced-precision floating-point data elements and a corresponding second pair of the reduced-precision floating-point data elements. The execution circuitry is to: generate a plurality of single-precision floating-point products corresponding to the first pair and the second pair of the reduced-precision floating-point data elements; and accumulate the plurality of single-precision floating-point products to generate a single-precision floating-point result data element.
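The dot-product behavior described above can be illustrated in software. The following is a minimal sketch, not the claimed circuitry. It assumes a 16-bit reduced-precision layout (1 sign bit, 8 exponent bits, 7 mantissa bits, similar to bfloat16), which the abstract does not specify, and it assumes truncation when narrowing. The helper names `to_reduced`, `from_reduced`, `round_f32`, and `dot_pair` are hypothetical.

```python
import struct


def f32_bits(x):
    """Return the 32-bit pattern of x after rounding it to single precision."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(b):
    """Interpret a 32-bit pattern as a single-precision value."""
    return struct.unpack("<f", struct.pack("<I", b & 0xFFFFFFFF))[0]


def to_reduced(x):
    """Narrow to the assumed 16-bit format by keeping the top 16 bits.

    This keeps the sign and all 8 exponent bits and drops the low 16
    mantissa bits (truncation, an assumption made for simplicity).
    """
    return f32_bits(x) >> 16


def from_reduced(h):
    """Widen a 16-bit reduced-precision pattern back to single precision."""
    return bits_f32(h << 16)


def round_f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return bits_f32(f32_bits(x))


def dot_pair(a_pair, b_pair, acc=0.0):
    """Multiply corresponding reduced-precision elements and accumulate.

    Each product and each partial sum is held as a single-precision value,
    mirroring the "single-precision products, single-precision result"
    wording of the abstract.
    """
    for a, b in zip(a_pair, b_pair):
        prod = round_f32(from_reduced(a) * from_reduced(b))
        acc = round_f32(acc + prod)
    return acc


# Usage: the first pair (1.5, 2.0) dotted with the second pair (4.0, 0.5).
first = [to_reduced(1.5), to_reduced(2.0)]
second = [to_reduced(4.0), to_reduced(0.5)]
result = dot_pair(first, second)
```

Because the assumed format keeps the full single-precision exponent, widening is a plain 16-bit shift, and the product of two such values (8 significant bits each) fits in a single-precision mantissa without rounding error, barring overflow or underflow.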