Apparatus and method for vector multiply and accumulate of signed doublewords

    Publication No.: US10514923B2

    Publication date: 2019-12-24

    Application No.: US15850180

    Filing date: 2017-12-21

    Abstract: An apparatus and method for performing signed multiplication of packed signed doublewords and accumulation with a signed quadword. For example, one embodiment of a processor comprises: a first source register to store a first plurality of packed signed doubleword data elements; a second source register to store a second plurality of packed signed doubleword data elements; a third source register to store a plurality of packed signed quadword data elements; execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to multiply first and second packed signed doubleword data elements from the first source register with third and fourth packed signed doubleword data elements from the second source register, respectively, to generate first and second temporary signed quadword products, the multiplier circuitry to select the first, second, third, and fourth signed doubleword data elements based on the opcode of the instruction; accumulation circuitry to combine the first temporary signed quadword product with a first packed signed quadword value read from the third source register to generate a first accumulated signed quadword result and to combine the second temporary signed quadword product with a second packed signed quadword value read from the third source register to generate a second accumulated signed quadword result; a destination register or the third source register to store the first accumulated signed quadword result in a first signed quadword data element position and to store the second accumulated signed quadword result in a second signed quadword data element position.
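The data flow in this abstract can be sketched in a few lines of Python. This is a minimal model, not the patented circuitry: the function names, the `use_odd` flag standing in for the opcode-based element selection, and the choice between wrapping and saturating on overflow are all assumptions, since the abstract does not specify them.

```python
INT64_MIN, INT64_MAX = -(1 << 63), (1 << 63) - 1


def wrap_int64(value):
    """Reduce an integer to signed 64-bit two's complement."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >> 63 else value


def saturate_int64(value):
    """Clamp an integer to the signed 64-bit range."""
    return max(INT64_MIN, min(INT64_MAX, value))


def mac_signed_dwords(src1, src2, acc, use_odd=False, saturate=False):
    """Multiply selected signed doublewords and accumulate into quadwords.

    src1, src2: four signed 32-bit integers each (hypothetical 128-bit registers).
    acc: two signed 64-bit integers (the third source register).
    use_odd: assumed stand-in for the opcode; picks elements 1 and 3
             instead of elements 0 and 2.
    """
    offset = 1 if use_odd else 0
    finish = saturate_int64 if saturate else wrap_int64
    result = []
    for lane in range(2):
        # Temporary signed quadword product of two signed doublewords.
        product = src1[2 * lane + offset] * src2[2 * lane + offset]
        # Combine with the quadword read from the accumulator register.
        result.append(finish(product + acc[lane]))
    return result


print(mac_signed_dwords([2, 3, -4, 5], [10, 100, 7, -1], [1, 1000]))
# prints [21, 972]
```

A 32-bit by 32-bit signed product always fits in 64 bits, so overflow can only occur in the accumulation step, which is why the overflow policy is applied after the addition.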

    FLOATING POINT TO FIXED POINT CONVERSION
    Invention application

    Publication No.: US20190199370A1

    Publication date: 2019-06-27

    Application No.: US16291231

    Filing date: 2019-03-04

    Abstract: Embodiments of an instruction, its operation, and executional support for the instruction are described. In some embodiments, a processor comprises decode circuitry to decode an instruction having fields for an opcode, a packed data source operand identifier, and a packed data destination operand identifier; and execution circuitry to execute the decoded instruction to convert a single precision floating point data element of a least significant packed data element position of the identified packed data source operand to a fixed-point representation, store the fixed-point representation as a 32-bit integer and a 32-bit integer exponent in the two least significant packed data element positions of the identified packed data destination operand, and zero all remaining packed data elements of the identified packed data destination operand.
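One possible reading of this conversion is sketched below in Python. The abstract does not say how the integer and the exponent are scaled, so this sketch assumes the integer holds the signed 24-bit significand and the exponent is the power of two that restores the value. The function name, the four-element register width, and the handling of zero and non-finite inputs are also assumptions.

```python
import math
import struct


def float_to_fixed(src, lanes=4):
    """Convert the lowest element of src to an (integer, exponent) pair.

    Returns a hypothetical destination register of `lanes` 32-bit elements:
    element 0 is the signed significand, element 1 is the power-of-two
    exponent, and all remaining elements are zeroed.
    """
    # Round the input to single precision, as the source element would be.
    value = struct.unpack("<f", struct.pack("<f", src[0]))[0]
    if not math.isfinite(value):
        raise ValueError("infinities and NaNs are not handled in this sketch")
    fraction, exp2 = math.frexp(value)      # value == fraction * 2**exp2
    mantissa = int(fraction * (1 << 24))    # exact: single precision has 24 significant bits
    exponent = exp2 - 24 if mantissa else 0
    return [mantissa, exponent] + [0] * (lanes - 2)


print(float_to_fixed([1.0, 9.0, 9.0, 9.0]))
# prints [8388608, -23, 0, 0]
```

Under this assumed encoding the original value is recovered as `mantissa * 2**exponent`, for example `8388608 * 2**-23 == 1.0`.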

    APPARATUS AND METHOD FOR VECTOR HORIZONTAL ADD OF SIGNED/UNSIGNED WORDS AND DOUBLEWORDS

    Publication No.: US20190196823A1

    Publication date: 2019-06-27

    Application No.: US15850131

    Filing date: 2017-12-21

    Abstract: An apparatus and method for performing a packed horizontal addition of words and doublewords. For example, one embodiment of a processor comprises: a decoder to decode a packed horizontal add instruction to generate a decoded packed horizontal add instruction, the packed horizontal add instruction including an opcode and operands identifying a plurality of packed words; a source register to store a first plurality of packed words; execution circuitry to execute the decoded instruction, the execution circuitry comprising: operand selection circuitry to identify first and second packed words from the source register in accordance with the operand and opcode of the packed horizontal add instruction; adder circuitry to add the first and second packed words to generate a temporary sum; a temporary storage of at least 17 bits to store the temporary sum; saturation circuitry to saturate the temporary sum if necessary to generate a final result; a destination register to store the final result as a packed result word in a designated data element position.
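The add-then-saturate step described here can be illustrated with a short Python sketch. It assumes that adjacent pairs of words are summed and that the results are returned in order; the function name and the `signed` flag are hypothetical, and the actual pairing is determined by the operand and opcode in the patent.

```python
def horizontal_add_words(src, signed=True):
    """Add adjacent 16-bit words and saturate each sum to 16 bits.

    src: an even-length list of 16-bit values (signed or unsigned).
    Returns one saturated result word per input pair.
    """
    if signed:
        low, high = -(1 << 15), (1 << 15) - 1
    else:
        low, high = 0, (1 << 16) - 1
    result = []
    for i in range(0, len(src), 2):
        # The sum of two 16-bit words needs up to 17 bits, hence the
        # "at least 17 bits" temporary storage in the abstract.
        temp = src[i] + src[i + 1]
        # Saturate only if the temporary sum is out of range.
        result.append(max(low, min(high, temp)))
    return result


print(horizontal_add_words([30000, 10000, -5, 2]))
# prints [32767, -3]
```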

    Apparatus and method for shifting quadwords and extracting packed words

    Publication No.: US10318298B2

    Publication date: 2019-06-11

    Application No.: US15721382

    Filing date: 2017-09-29

    Abstract: An apparatus and method for performing left-shifting operations on packed quadword data. For example, one embodiment of a processor comprises: a decoder to decode a left-shift instruction to generate a decoded left-shift instruction; a first source register to store a plurality of packed quadword data elements; execution circuitry to execute the decoded left-shift instruction, the execution circuitry comprising shift circuitry to left-shift at least first and second packed quadword data elements from first and second packed quadword data element locations, respectively, in the first source register by an amount specified in an immediate value or in a control value in a second source register, to generate first and second left-shifted quadwords; the execution circuitry to cause selection of 16 most significant bits of the first and second left-shifted quadwords to be written to 16 least significant bit regions of first and second quadword data element locations, respectively, of a destination register; and the destination register to store the specified set of the 16 most significant bits of the first and second left-shifted quadwords.
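The shift-and-extract operation can be modeled as follows. This is a rough Python sketch under stated assumptions: quadwords are treated as unsigned 64-bit integers, the shift amount is restricted to 0 through 63, and the upper 48 bits of each destination quadword are zeroed. The abstract does not state what happens to those upper bits or to larger shift amounts, and the function name is hypothetical.

```python
MASK64 = (1 << 64) - 1


def shift_left_extract_high_words(src, shift):
    """Left-shift each quadword and keep its 16 most significant bits.

    src: list of unsigned 64-bit integers (packed quadwords).
    shift: shift amount, taken from an immediate or a control register.
    Returns destination quadwords whose low 16 bits hold the extracted word.
    """
    if not 0 <= shift < 64:
        raise ValueError("this sketch only models shift amounts 0..63")
    dest = []
    for quadword in src:
        shifted = (quadword << shift) & MASK64   # bits shifted past bit 63 are lost
        dest.append(shifted >> 48)               # top 16 bits land in bits 15..0
    return dest


print([hex(q) for q in shift_left_extract_high_words([0x0123456789ABCDEF, 0xFFFF000000000000], 4)])
# prints ['0x1234', '0xfff0']
```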

    APPARATUS AND METHOD FOR COMPLEX MULTIPLY AND ACCUMULATE

    Publication No.: US20190163472A1

    Publication date: 2019-05-30

    Application No.: US15824324

    Filing date: 2017-11-28

    Abstract: An embodiment of the invention is a processor including execution circuitry to calculate, in response to a decoded instruction, a result of a complex multiply-accumulate of a first complex number, a second complex number, and a third complex number. The calculation includes a first operation to calculate a first term of a real component of the result and a first term of the imaginary component of the result. The calculation also includes a second operation to calculate a second term of the real component of the result and a second term of the imaginary component of the result. The processor also includes a decoder to decode an instruction to generate the decoded instruction and a first source register, a second source register, and a source and destination register to provide the first complex number, the second complex number, and the third complex number, respectively.
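The two-operation split can be sketched in Python using the identity (a + bi)(c + di) = (ac - bd) + (ad + bc)i. This is only one plausible decomposition: which product terms belong to the first and second operations, and the point at which the third complex number is added, are assumptions, and the function names are hypothetical. Complex numbers are represented as (real, imaginary) tuples.

```python
def cmac_first(a, b, acc):
    """First operation: add the terms that use the real part of a."""
    a_re, _ = a
    b_re, b_im = b
    acc_re, acc_im = acc
    return (acc_re + a_re * b_re, acc_im + a_re * b_im)


def cmac_second(a, b, acc):
    """Second operation: add the terms that use the imaginary part of a."""
    _, a_im = a
    b_re, b_im = b
    acc_re, acc_im = acc
    return (acc_re - a_im * b_im, acc_im + a_im * b_re)


def complex_mac(a, b, acc):
    """Compute acc + a * b by chaining the two operations."""
    return cmac_second(a, b, cmac_first(a, b, acc))


print(complex_mac((1, 2), (3, 4), (5, 6)))
# prints (0, 16)
```

Chaining the operations through the accumulator mirrors the abstract's combined source and destination register, which supplies the third complex number and receives the result.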

    Floating point to fixed point conversion

    Publication No.: US10224954B1

    Publication date: 2019-03-05

    Application No.: US15721573

    Filing date: 2017-09-29

    Abstract: Embodiments of an instruction, its operation, and executional support for the instruction are described. In some embodiments, a processor comprises decode circuitry to decode an instruction having fields for an opcode, a packed data source operand identifier, and a packed data destination operand identifier; and execution circuitry to execute the decoded instruction to convert a single precision floating point data element of a least significant packed data element position of the identified packed data source operand to a fixed-point representation, store the fixed-point representation as a 32-bit integer and a 32-bit integer exponent in the two least significant packed data element positions of the identified packed data destination operand, and zero all remaining packed data elements of the identified packed data destination operand.
