APPARATUS AND METHOD FOR INSTRUCTION-BASED FLOP ACCOUNTING

    公开(公告)号:US20180189065A1

    公开(公告)日:2018-07-05

    申请号:US15396345

    申请日:2016-12-30

    CPC classification number: G06F9/30014 G06F9/30065 G06F9/30072 G06F11/30

    Abstract: An apparatus and method are described for floating point operation (FLOP) accounting. For example, one embodiment of a processor comprises: an instruction fetch unit to fetch instructions from system memory, the instructions including at least one masked vector floating point instruction to perform operations on a plurality of floating point data elements; a mask register to store a mask value associated with the masked vector floating point instruction; a decoder to decode the masked vector floating point instruction; and floating point operations (FLOP) accounting circuitry to read the mask register to determine a number of floating point operations to be performed during execution of the masked vector floating point instruction.

    APPARATUS AND METHOD FOR LOOP FLATTENING AND REDUCTION IN A SINGLE INSTRUCTION MULTIPLE DATA (SIMD) PIPELINE

    公开(公告)号:US20220100509A1

    公开(公告)日:2022-03-31

    申请号:US17493667

    申请日:2021-10-04

    Abstract: An apparatus and method for loop flattening and reduction in a SIMD pipeline including broadcast, move, and reduction instructions. For example, one embodiment of a processor comprises: a decoder to decode a broadcast instruction to generate a decoded broadcast instruction identifying a plurality of operations, the broadcast instruction including an opcode, first and second source operands, and at least one destination operand, the broadcast instruction having a split value associated therewith; a first source register associated with the first source operand to store a first plurality of packed data elements; a second source register associated with the second source operand to store a second plurality of packed data elements; execution circuitry to execute the operations of the decoded broadcast instruction, the execution circuitry to copy a first number of contiguous data elements from the first source register to a first set of contiguous data element locations in a destination register specified by the destination operand, the execution circuitry to further copy a second number of contiguous data elements from the second source register to a second set of contiguous data element locations in the destination register, wherein the execution circuitry is to determine the first number and the second number in accordance with the split value associated with the broadcast instruction.

    APPARATUSES, METHODS, AND SYSTEMS FOR MIXING VECTOR OPERATIONS

    公开(公告)号:US20180088946A1

    公开(公告)日:2018-03-29

    申请号:US15277963

    申请日:2016-09-27

    Abstract: Systems, methods, and apparatuses relating to mixing vector operations are described. In one embodiment, a processor includes a decoder to decode an instruction; and an execution unit to execute the decoded instruction to: receive a first input operand of a first data vector, a second input operand of a second data vector, and a third input operand of a control value vector, perform a first operation on data in a same element position of the first and second data vectors for each same element position of the control value vector having a first control value, perform a second, different operation on data in a same element position of the first and second data vectors for each same element position of the control value vector having a second, different control value, and output results from each first operation and each second operation into each corresponding element position in an output vector.

Patent Agency Ranking