Method and apparatus for efficient binary and ternary support in fused multiply-add (FMA) circuits

    公开(公告)号:US11366636B2

    公开(公告)日:2022-06-21

    申请号:US16919022

    申请日:2020-07-01

    Abstract: An apparatus and method for efficiently performing a multiply add or multiply accumulate operation. For example, one embodiment of a processor comprises: a decoder to decode an instruction specifying an operation, the instruction comprising a first operand identifying a multiplier and a second operand identifying a multiplicand; and fused multiply-add (FMA) execution circuitry comprising first multiplication circuitry to perform a multiplication using the multiplicand and multiplier to generate a result for multipliers and multiplicands falling within a first precision range, and second multiplication circuitry to be used instead of the first multiplication circuitry for multipliers and multiplicands falling within a second precision range.

    METHOD AND APPARATUS FOR EFFICIENT BINARY AND TERNARY SUPPORT IN FUSED MULTIPLY-ADD (FMA) CIRCUITS

    公开(公告)号:US20220342641A1

    公开(公告)日:2022-10-27

    申请号:US17839905

    申请日:2022-06-14

    Abstract: An apparatus and method for efficiently performing a multiply add or multiply accumulate operation. For example, one embodiment of a processor comprises: a decoder to decode an instruction specifying an operation, the instruction comprising a first operand identifying a multiplier and a second operand identifying a multiplicand; and fused multiply-add (FMA) execution circuitry comprising first multiplication circuitry to perform a multiplication using the multiplicand and multiplier to generate a result for multipliers and multiplicands falling within a first precision range, and second multiplication circuitry to be used instead of the first multiplication circuitry for multipliers and multiplicands falling within a second precision range.

    Apparatuses, methods, and systems for fast fourier transform configuration and computation instructions

    公开(公告)号:US10942985B2

    公开(公告)日:2021-03-09

    申请号:US16236464

    申请日:2018-12-29

    Abstract: Systems, methods, and apparatuses relating to performing fast Fourier transform (FFT) configuration and computation operations are described. In one embodiment, a processor includes a matrix operations accelerator circuit that includes a two-dimensional grid of processing element circuits; a first plurality of registers that represents a first two-dimensional matrix coupled to the matrix operations accelerator circuit; a second plurality of registers that represents a second two-dimensional matrix coupled to the matrix operations accelerator circuit; a decoder, of a core coupled to the matrix operations accelerator circuit, to decode a single instruction into a decoded single instruction; and an execution circuit of the core to execute the decoded single instruction to cause the two-dimensional grid of processing element circuits to operate on a first packed data input value and a first complex twiddle factor value to produce a first result and a second result.

Patent Agency Ranking