Apparatus and method for complex by complex conjugate multiplication

    Publication No.: US11256504B2

    Publication date: 2022-02-22

    Application No.: US15721448

    Filing date: 2017-09-29

    Abstract: An apparatus and method for multiplying packed real and imaginary components of complex numbers are described. A processor embodiment includes: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed real and imaginary data elements; a second source register to store a second plurality of packed real and imaginary data elements; and execution circuitry to execute the decoded instruction. The execution circuitry includes: multiplier circuitry to select real and imaginary data elements in the first source register and second source register, multiply each selected imaginary data element in the first source register with a selected real data element in the second source register, and multiply each selected real data element in the first source register with a selected imaginary data element in the second source register to generate a plurality of imaginary products; adder circuitry to add a first subset of the plurality of imaginary products and subtract a second subset of the plurality of imaginary products to generate a first temporary result, and to add a third subset of the plurality of imaginary products and subtract a fourth subset of the plurality of imaginary products to generate a second temporary result; and accumulation circuitry to combine the first temporary result with first data from a destination register to generate a first final result, combine the second temporary result with second data from the destination register to generate a second final result, and store the first final result and second final result back in the destination register.
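
The arithmetic in this abstract can be sketched in a few lines of Python. For x = a + bi and y = c + di, the imaginary part of x times the conjugate of y is b*c - a*d, which matches the "add one subset, subtract another" description. The function name, the four-pairs-per-register layout, and the grouping of pairs 0-1 and 2-3 into the two temporary results are assumptions made for illustration; the abstract does not fix them, and no rounding or saturation is modelled.

```python
def conj_mul_imag_accumulate(src1, src2, dest):
    """Hypothetical model: src1 and src2 each hold four (real, imag) integer
    pairs; dest holds two accumulator values."""
    # imag(src1) * real(src2): the products that are added.
    added = [b * c for (a, b), (c, d) in zip(src1, src2)]
    # real(src1) * imag(src2): the products that are subtracted.
    subtracted = [a * d for (a, b), (c, d) in zip(src1, src2)]
    # Assumed grouping: pairs 0-1 feed the first temporary result,
    # pairs 2-3 feed the second.
    temp1 = sum(added[:2]) - sum(subtracted[:2])
    temp2 = sum(added[2:]) - sum(subtracted[2:])
    # Accumulate into the destination values.
    return [dest[0] + temp1, dest[1] + temp2]
```

For example, with `src1 = [(1, 2), (3, 4), (5, 6), (7, 8)]`, `src2 = [(1, 1), (2, -1), (0, 3), (4, 0)]` and `dest = [10, 20]`, the two temporary results are 12 and 17, so the function returns `[22, 37]`.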

    Apparatus and method for performing dual signed and unsigned multiplication of packed data elements

    Publication No.: US10977039B2

    Publication date: 2021-04-13

    Application No.: US16672203

    Filing date: 2019-11-01

    Abstract: An apparatus and method for performing dual concurrent multiplications of packed data elements. For example, one embodiment of a processor comprises: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed doubleword data elements; a second source register to store a second plurality of packed doubleword data elements; and execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to multiply a first doubleword data element from the first source register with a second doubleword data element from the second source register to generate a first quadword product and to concurrently multiply a third doubleword data element from the first source register with a fourth doubleword data element from the second source register to generate a second quadword product; and a destination register to store the first quadword product and the second quadword product as first and second packed quadword data elements.
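
A minimal Python sketch of the dual doubleword multiply follows. It treats each source register as four 32-bit lanes given as raw bit patterns and returns two 64-bit products, also as raw bit patterns. The function names, the choice of lanes 0 and 2, and the `signed` flag are assumptions for illustration; the abstract does not say which doublewords are selected.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed32(x):
    """Reinterpret a 32-bit pattern as a two's-complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def dual_mul_dwords(src1, src2, signed=True):
    """Hypothetical model: multiply two doubleword pairs into two quadwords."""
    products = []
    for lane in (0, 2):  # lane selection is an assumption, not from the abstract
        a, b = src1[lane] & MASK32, src2[lane] & MASK32
        if signed:
            a, b = to_signed32(a), to_signed32(b)
        # A 32x32-bit product always fits in 64 bits; the mask only
        # converts a negative result to its two's-complement pattern.
        products.append((a * b) & MASK64)
    return products
```

With `src1 = [0xFFFFFFFF, 0, 2, 0]` and `src2 = [2, 0, 3, 0]`, the signed variant yields `[0xFFFFFFFFFFFFFFFE, 6]` (that is, -2 and 6), while the unsigned variant yields `[0x1FFFFFFFE, 6]`.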

    Systems, apparatuses and methods for dual complex by complex conjugate multiply of signed words

    Publication No.: US10664277B2

    Publication date: 2020-05-26

    Application No.: US15721313

    Filing date: 2017-09-29

    Abstract: Embodiments of systems, apparatuses, and methods for dual complex number by complex conjugate multiplication in a processor are described. For example, execution circuitry executes a decoded instruction to multiplex data values from a plurality of packed data element positions in the first and second packed data source operands to at least one multiplier circuit, the first and second packed data source operands including a plurality of pairs of complex numbers, each pair of complex numbers including data values at shared packed data element positions in the first and second packed data source operands; calculate a real part and an imaginary part of a product of a first complex number and a complex conjugate of a second complex number; and store the real result to a first packed data element position in the destination operand and store the imaginary result to a second packed data element position in the destination operand.
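
As a rough illustration of the computation described here: for x = a + bi and y = c + di, the product of x and the conjugate of y has real part a*c + b*d and imaginary part b*c - a*d. The sketch below applies that to two complex pairs at once. The function name and the `[re0, im0, re1, im1]` layout are assumptions, and any scaling, rounding, or saturation a real signed-word instruction might apply is left out because the abstract does not describe it.

```python
def dual_conj_mul(src1, src2):
    """Hypothetical model: src1 and src2 are [re0, im0, re1, im1] integers.
    Returns [re0, im0, re1, im1] of src1[k] * conj(src2[k]) for k = 0, 1."""
    dest = []
    for k in (0, 2):
        a, b = src1[k], src1[k + 1]  # first complex number:  a + bi
        c, d = src2[k], src2[k + 1]  # second complex number: c + di
        dest.append(a * c + b * d)   # real part of (a + bi)(c - di)
        dest.append(b * c - a * d)   # imaginary part of (a + bi)(c - di)
    return dest
```

For instance, `dual_conj_mul([1, 2, 3, 4], [1, 1, 2, -1])` returns `[3, 1, 2, 11]`, which agrees with (1 + 2i)(1 - i) = 3 + i and (3 + 4i)(2 + i) = 2 + 11i.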

    Apparatus and method for complex by complex conjugate multiplication

    Publication No.: US11755323B2

    Publication date: 2023-09-12

    Application No.: US17672504

    Filing date: 2022-02-15

    CPC classification number: G06F9/30036 G06F9/3001 G06F9/30105

    Abstract: An apparatus and method for multiplying packed real and imaginary components of complex numbers are described. A processor embodiment includes: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed real and imaginary data elements; a second source register to store a second plurality of packed real and imaginary data elements; and execution circuitry to execute the decoded instruction. The execution circuitry includes: multiplier circuitry to select real and imaginary data elements in the first source register and second source register, multiply each selected imaginary data element in the first source register with a selected real data element in the second source register, and multiply each selected real data element in the first source register with a selected imaginary data element in the second source register to generate a plurality of imaginary products; adder circuitry to add a first subset of the plurality of imaginary products and subtract a second subset of the plurality of imaginary products to generate a first temporary result, and to add a third subset of the plurality of imaginary products and subtract a fourth subset of the plurality of imaginary products to generate a second temporary result; and accumulation circuitry to combine the first temporary result with first data from a destination register to generate a first final result, combine the second temporary result with second data from the destination register to generate a second final result, and store the first final result and second final result back in the destination register.

    Floating point to fixed point conversion

    Publication No.: US10763891B2

    Publication date: 2020-09-01

    Application No.: US16291231

    Filing date: 2019-03-04

    Abstract: Embodiments of an instruction, its operation, and executional support for the instruction are described. In some embodiments, a processor comprises decode circuitry to decode an instruction having fields for an opcode, a packed data source operand identifier, and a packed data destination operand identifier; and execution circuitry to execute the decoded instruction to convert a single precision floating point data element of a least significant packed data element position of the identified packed data source operand to a fixed-point representation, store the fixed-point representation as a 32-bit integer and a 32-bit integer exponent in the two least significant packed data element positions of the identified packed data destination operand, and zero all remaining packed data elements of the identified packed data destination operand.
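
One way to picture the integer-plus-exponent result is the Python sketch below. The abstract does not specify the fixed-point format, so the encoding used here (a signed 32-bit integer `mant` and an exponent `exp` such that the value equals `mant * 2**exp`) is purely an assumption, as are the function name and the four-lane destination. Infinities and NaNs are not handled.

```python
import math
import struct

def float_to_fixed(src, lanes=4):
    """Hypothetical model: convert the least significant element of src to
    [mantissa, exponent] and zero the remaining destination lanes."""
    # Round the input to single precision, as a float32 source element would be.
    x = struct.unpack("<f", struct.pack("<f", src[0]))[0]
    m, e = math.frexp(x)        # x == m * 2**e, with 0.5 <= |m| < 1 (or m == 0)
    mant = int(m * (1 << 31))   # assumed scaling: exact, fits a signed 32-bit integer
    exp = e - 31 if mant else 0  # so that x == mant * 2**exp
    return [mant, exp] + [0] * (lanes - 2)
```

For example, `float_to_fixed([1.5, 9.0, 9.0, 9.0])` returns `[1610612736, -30, 0, 0]`, and `1610612736 * 2**-30` recovers 1.5.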

    Apparatus and method for multiplying, summing, and accumulating sets of packed bytes

    Publication No.: US10705839B2

    Publication date: 2020-07-07

    Application No.: US15850499

    Filing date: 2017-12-21

    Abstract: A processor having a decoder to decode an instruction to generate a decoded instruction; a first source register to store a first plurality of packed signed bytes; a second source register to store a second plurality of packed signed bytes; execution circuitry to execute the decoded instruction, the execution circuitry including: multiplier circuitry to multiply each packed signed byte from the first source register with a corresponding packed signed byte from the second source register to generate temporary products; adder circuitry to add a plurality of sets of the temporary products to generate a plurality of temporary sums; negation and extension circuitry to negate and extend each of the temporary sums to doubleword sums; and accumulation circuitry to add each of the doubleword sums to a doubleword from a third source register to generate final doubleword results; and a packed data destination register to store the final doubleword results.
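
The multiply, sum, negate, and accumulate steps can be sketched as follows. The function names, the set size of four products per doubleword, and the wrap-around (rather than saturating) accumulation are assumptions for illustration; the abstract leaves all three open.

```python
def wrap32(x):
    """Wrap an integer to the signed 32-bit (doubleword) range."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x >= (1 << 31) else x

def neg_dot_accumulate(src1, src2, src3, set_size=4):
    """Hypothetical model: src1 and src2 are lists of signed bytes,
    src3 is a list of signed doubleword accumulators."""
    assert len(src1) == len(src2) == set_size * len(src3)
    results = []
    for i, acc in enumerate(src3):
        lo, hi = set_size * i, set_size * (i + 1)
        # Multiply corresponding bytes and add each set of products.
        temp_sum = sum(a * b for a, b in zip(src1[lo:hi], src2[lo:hi]))
        # Negate the sum, then accumulate at doubleword width.
        results.append(wrap32(acc + (-temp_sum)))
    return results
```

With `src1 = [1, 2, 3, 4, -1, -2, -3, -4]`, `src2 = [1, 1, 1, 1, 2, 2, 2, 2]` and `src3 = [100, 0]`, the temporary sums are 10 and -20, so the result is `[90, 20]`.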

    Apparatus and method for vector horizontal add of signed/unsigned words and doublewords

    Publication No.: US20220236991A1

    Publication date: 2022-07-28

    Application No.: US17671356

    Filing date: 2022-02-14

    Abstract: An apparatus and method for performing a packed horizontal addition of words and doublewords. One embodiment of a processor includes a decoder to decode a packed horizontal add instruction which includes an opcode and one or more operands used to identify a plurality of packed words; a source register to store a plurality of packed words; execution circuitry to execute the decoded instruction, and a destination register to store a final result as a packed result word in a designated data element position. The execution circuitry includes operand selection circuitry to identify first and second packed words from the source register in accordance with the operands and opcode; adder circuitry to add the two packed words to generate a temporary sum; a temporary storage of at least 17 bits to store the temporary sum; and saturation circuitry to saturate the temporary sum if necessary to generate the final result.
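
The reason for the 17-bit temporary is that the sum of two 16-bit words can need one extra bit before it is saturated back to 16 bits. A small Python sketch of that step is shown below; the function name, the index-based operand selection, and the `signed` flag are assumptions for illustration.

```python
def hadd_words_saturate(src, i, j, signed=True):
    """Hypothetical model: add the words at positions i and j of src and
    saturate the sum to the 16-bit range."""
    # The raw sum of two 16-bit values needs at most 17 bits.
    temp = src[i] + src[j]
    lo, hi = (-32768, 32767) if signed else (0, 65535)
    # Clamp to the representable word range (saturation).
    return max(lo, min(hi, temp))
```

For example, `hadd_words_saturate([30000, 10000], 0, 1)` saturates to 32767, and `hadd_words_saturate([60000, 10000], 0, 1, signed=False)` saturates to 65535, while `hadd_words_saturate([1, 2], 0, 1)` simply returns 3.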

    Apparatus and method for multiply, add/subtract, and accumulate of packed data elements

    Publication No.: US11074073B2

    Publication date: 2021-07-27

    Application No.: US15721225

    Filing date: 2017-09-29

    Abstract: An apparatus and method for performing dual concurrent multiplications, subtraction/addition, and accumulation of packed data elements. For example, one embodiment of a processor comprises: a decoder to decode an instruction to generate a decoded instruction; a first source register to store first and second packed data elements; a second source register to store third and fourth packed data elements; execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to multiply the first and third packed data elements to generate a first temporary product and to concurrently multiply the second and fourth packed data elements to generate a second temporary product, the first through fourth packed data elements all being a first width; circuitry to negate the first temporary product to generate a negated first product; adder circuitry to add the first negated product to a first accumulated packed data element from a third source register to generate a first result, the first result being a second width which is at least twice as large as the first width; the adder circuitry to concurrently add the second temporary product to a second accumulated packed data element to generate a second result of the second width; the first and second results to be stored in specified first and second data element positions within a destination register.
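
The data flow described above reduces to two multiply-accumulates, one of which negates its product first. The sketch below models it with unbounded Python integers; the function name and the two-element tuples are assumptions, and the overflow behaviour of the wider accumulators (wrap or saturate) is not modelled because the abstract does not state it.

```python
def dual_mul_negadd_accumulate(src1, src2, acc):
    """Hypothetical model: src1 = (first, second), src2 = (third, fourth),
    acc = (first accumulated element, second accumulated element)."""
    first, second = src1
    third, fourth = src2
    prod1 = first * third     # first temporary product
    prod2 = second * fourth   # second temporary product
    # The first product is negated before accumulation; the second is added as-is.
    return [acc[0] + (-prod1), acc[1] + prod2]
```

For instance, `dual_mul_negadd_accumulate((3, 4), (5, 6), (100, 200))` returns `[85, 224]`.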

    Apparatus and method for multiplication and accumulation of complex and real packed data elements

    Publication No.: US10795676B2

    Publication date: 2020-10-06

    Application No.: US15721459

    Filing date: 2017-09-29

    Abstract: An apparatus and method for multiplying packed real and imaginary components of complex numbers. For example, one embodiment of a processor comprises: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed real and imaginary data elements; a second source register to store a second plurality of packed real and imaginary data elements; execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to select real and imaginary data elements in the first source register and second source register to multiply, the multiplier circuitry to multiply each selected imaginary data element in the first source register with a selected real data element in the second source register, and to multiply each selected real data element in the first source register with a selected imaginary data element in the second source register to generate a plurality of imaginary products, adder circuitry to add a first subset of the plurality of imaginary products to generate a first temporary result and to add a second subset of the plurality of imaginary products to generate a second temporary result; accumulation circuitry to combine the first temporary result with first data from a destination register to generate a first final result and to combine the second temporary result with second data from the destination register to generate a second final result and to store the first final result and second final result back in the destination register.
