Patent search ap:("Intel Corporation") AND inv:"Venkateswara Madduri" Page 3

21.

发明授权
Apparatus and method for shifting quadwords and extracting packed words 有权

公开(公告)号：US10318298B2

公开(公告)日：2019-06-11

申请号：US15721382

申请日：2017-09-29

Applicant: Intel Corporation

Inventor： Venkateswara Madduri , Elmoustapha Ould-Ahmed-Vall , Robert Valentine , Mark Charney , Jesus Corbal

IPC: G06F9/30

Abstract: An apparatus and method for performing left-shifting operations on packed quadword data. For example, one embodiment of a processor comprises: a decoder to decode a left-shift instruction to generate a decoded left-shift instruction; a first source register to store a plurality of packed quadwords data elements; execution circuitry to execute the decoded left-shift instruction, the execution circuitry comprising shift circuitry to left-shift at least first and second packed quadword data elements from first and second packed quadword data element locations, respectively, in the first source register by an amount specified in an immediate value or in a control value in a second source register, to generate first and second left-shifted quadwords; the execution circuitry to cause selection of 16 most significant bits of the first and second left-shifted quadwords to be written to 16 least significant bit regions of first and second quadword data element locations, respectively, of a destination register; and the destination register to store the specified set of the 16 most significant bits of the first and second left-shifted quadwords.

22.

发明授权
Floating point to fixed point conversion 有权

公开(公告)号：US10224954B1

公开(公告)日：2019-03-05

申请号：US15721573

申请日：2017-09-29

Applicant: INTEL CORPORATION

Inventor： Venkateswara Madduri , Elmoustapha Ould-Ahmed-Vall , Robert Valentine , Jesus Corbal , Mark Charney

IPC: G06F9/30 , H03M7/24 , H03M7/40 , H03M7/42

Abstract: Embodiments of an instruction, its operation, and executional support for the instruction are described. In some embodiments, a processor comprises decode circuitry to decode an instruction having fields for an opcode, a packed data source operand identifier, and a packed data destination operand identifier; and execution circuitry to execute the decoded instruction to convert a single precision floating point data element of a least significant packed data element position of the identified packed data source operand to a fixed-point representation, store the fixed-point representation as 32-bit integer and a 32-bit integer exponent in the two least significant packed data element positions of the identified packed data destination operand, and zero of all remaining packed data elements of the identified packed data destination operand.

23.

发明授权
Dual sum of quadword 16×16 multiply and accumulate 有权

公开(公告)号：US12204903B2

公开(公告)日：2025-01-21

申请号：US17359522

申请日：2021-06-26

Applicant: Intel Corporation

Inventor： Venkateswara Madduri , Cristina Anderson , Robert Valentine , Mark Charney , Vedvyas Shanbhogue

IPC: G06F9/30

Abstract: Techniques for matrix multiplication are described. In some examples, a single instruction having a format of fields for an opcode, one or more fields to indicate a location of a source/destination operand, one or more fields to indicate a location of a first source operand, and one or more fields to indicate a location of a second source operand is used. Wherein the opcode is to indicate that execution circuitry is to: multiply values from corresponding data elements of the first and second sources, add a first subset of the multiplied values to a first value from the source/destination operand and store in a first data element position of the source/destination operand, and add a second subset of the multiplied values to a second value from the source/destination operand and store in a second data element position of the source/destination operand.

24.

发明授权
Apparatus and method for vector horizontal add of signed/unsigned words and doublewords 有权

公开(公告)号：US11249754B2

公开(公告)日：2022-02-15

申请号：US15850131

申请日：2017-12-21

Applicant: Intel Corporation

Inventor： Venkateswara Madduri , Elmoustapha Ould-Ahmed-Vall , Robert Valentine , Mark Charney

IPC: G06F9/30 , G06F7/505

Abstract: An apparatus and method for performing a packed horizontal addition of words and doublewords. One embodiment of a processor includes a decoder to decode a packed horizontal add instruction which includes an opcode and one or more operands used to identify a plurality of packed words; a source register to store a plurality of packed words; execution circuitry to execute the decoded instruction, and a destination register to store a final result as a packed result word in a designated data element position. The execution circuitry includes operand selection circuitry to identify first and second packed words from the source register in accordance with the operands and opcode; adder circuitry to add the two packed words to generate a temporary sum; a temporary storage of at least 17 bits to store the temporary sum; and saturation circuitry to saturate the temporary sum if necessary to generate the final result.

$Apparatus and method for processing fractional reciprocal operations$

25.

发明授权
Apparatus and method for processing fractional reciprocal operations 有权

公开(公告)号：US10768896B2

公开(公告)日：2020-09-08

申请号：US15850636

申请日：2017-12-21

Applicant: Intel Corporation

Inventor： Cristina Anderson , Elmoustapha Ould-Ahmed-Vall , Marius Cornea-Hasegan , Robert Valentine , Mark Charney , Jesus Corbal , Venkateswara Madduri

IPC: G06F7/552 , G06F7/535 , G06F9/30

Abstract: An apparatus and method for performing a reciprocal. For example one embodiment of a processor comprises: a decoder to decode a reciprocal instruction to generate a decoded reciprocal instruction; a source register to store at least one packed input data element; a destination register to store a result data element; and reciprocal execution circuitry to execute the decoded reciprocal instruction, the reciprocal execution circuitry to use a first portion of the packed input data element as an index to a data structure containing a plurality of sets of coefficients to identify a first set of coefficients from the plurality of sets, the reciprocal execution circuitry to generate a reciprocal of the packed input data element using a combination of the coefficients and a second portion of the packed input data element.

26.

发明授权
Apparatus and method for processing reciprocal square root operations 有权

公开(公告)号：US10664237B2

公开(公告)日：2020-05-26

申请号：US15850673

申请日：2017-12-21

Applicant: Intel Corporation

Inventor： Cristina Anderson , Elmoustapha Ould-Ahmed-Vall , Marius Cornea-Hasegan , Robert Valentine , Mark Charney , Jesus Corbal , Venkateswara Madduri

IPC: G06F7/552 , G06F1/03 , G06F9/30

Abstract: An apparatus and method for performing a reciprocal square root. For example one embodiment of a processor comprises: a decoder to decode a reciprocal square root instruction to generate a decoded reciprocal square root instruction; a source register to store at least one packed input data element; a destination register to store a result data element; and reciprocal square root execution circuitry to execute the decoded reciprocal square root instruction, the reciprocal square root execution circuitry to use a first portion of the packed input data element as an index to a data structure containing a plurality of sets of coefficients to identify a first set of coefficients from the plurality of sets, the reciprocal square root execution circuitry to generate a reciprocal square root of the packed input data element using a combination of the coefficients and a second portion of the packed input data element.

27.

发明授权
Apparatus and method for multiplication and accumulation of complex and real packed data elements 有权

公开(公告)号：US10552154B2

公开(公告)日：2020-02-04

申请号：US15721464

申请日：2017-09-29

Applicant: Intel Corporation

Inventor： Venkateswara Madduri , Elmoustapha Ould-Ahmed-Vall , Jesus Corbal , Mark Charney , Robert Valentine , Binwei Yang

IPC: G06F9/30 , G06F7/48 , G06F7/52

Abstract: An apparatus and method for multiplying packed real and imaginary components of complex numbers. A method comprises: multiplying selected imaginary and real data elements in a first and second source registers to generate a plurality of imaginary products; adding a first subset of the plurality of imaginary products to generate a first temporary result and adding a second subset of the plurality of imaginary products to generate a second temporary result; negating the first temporary result to generate a third temporary result and the second temporary result to generate a fourth temporary result; accumulating the third temporary result with first data to generate a first final result and accumulating the fourth temporary result with second data to generate a second final result; and storing the first final result and second final.

28.

发明授权
Apparatus and method for performing dual signed and unsigned multiplication of packed data elements 有权

公开(公告)号：US10514924B2

公开(公告)日：2019-12-24

申请号：US15721261

申请日：2017-09-29

Applicant: Intel Corporation

Inventor： Venkateswara Madduri , Elmoustapha Ould-Ahmed-Vall , Mark Charney , Robert Valentine , Jesus Corbal , Binwei Yang

IPC: G06F9/30

Abstract: An apparatus and method for performing dual concurrent multiplications of packed data elements. For example one embodiment of a processor comprises: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed doubleword data elements; a second source register to store a second plurality of packed doubleword data elements; and execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to multiply a first doubleword data element from the first source register with a second doubleword data element from the second source register to generate a first quadword product and to concurrently multiply a third doubleword data element from the first source register with a fourth doubleword data element from the second source register to generate a second quadword product; and a destination register to store the first quadword product and the second quadword product as first and second packed quadword data elements.

29.

发明授权
Apparatus and method for shifting quadwords and extracting packed words 有权

公开(公告)号：US10481910B2

公开(公告)日：2019-11-19

申请号：US15721466

申请日：2017-09-29

Applicant: Intel Corporation

Inventor： Venkateswara Madduri , Elmoustapha Ould-Ahmed-Vall , Jesus Corbal , Mark Charney , Robert Valentine , Binwei Yang

IPC: G06F9/30

Abstract: An apparatus and method for performing right-shifting operations on packed quadword data. For example, one embodiment of a processor comprises: a decoder to decode a right-shift instruction to generate a decoded right-shift instruction; a first source register to store a plurality of packed quadwords data elements; execution circuitry to execute the decoded right-shift instruction, the execution circuitry comprising shift circuitry to right-shift at least first and second packed quadword data elements from first and second packed quadword data element locations, respectively, in the first source register by an amount specified in an immediate value or in a control value in a second source register, to generate first and second right-shifted quadwords; the execution circuitry to cause selection of 16 most significant bits of the first and second right-shifted quadwords to be written to 16 least significant bit regions of first and second quadword data element locations, respectively, of a destination register; and the destination register to store the specified set of the 16 most significant bits of the first and second right-shifted quadwords.

30.

发明申请
APPARATUS AND METHOD FOR ADDING PACKED DATA ELEMENTS WITH ROTATION AND HALVING 审中-公开

公开(公告)号：US20190196826A1

公开(公告)日：2019-06-27

申请号：US15850071

申请日：2017-12-21

Applicant: Intel Corporation

Inventor： Venkateswara Madduri , Elmoustapha Ould-Ahmed-Vall , Robert Valentine , Mark Charney , Jesus Corbal , Binwei Yang

IPC: G06F9/30 , G06F7/485

CPC classification number: G06F9/30145 , G06F7/485 , G06F9/30101

Abstract: An apparatus and method for performing addition of signed packed data values using rotation and halving. For example, one embodiment of a processor comprises: a decoder to decode an instruction to generate a decoded instruction, the instruction including an opcode, an immediate, and operands identifying a plurality of packed data source registers and a packed data destination register a first source register to store a first plurality of packed signed words; a second source register to store a second plurality of packed signed words; execution circuitry to execute the decoded instruction, the execution circuitry comprising: adder circuitry to add each packed signed word from the first source register with a selected packed signed word from the second source register to generate a plurality of signed word results, the adder circuitry to select each packed signed word from the second source register in accordance with a rotation value in the immediate of the instruction, the rotation value to indicate an amount of rotation to be applied to the packed signed words in the second source register prior to the adder circuitry performing the adding; and a destination register to store the plurality of signed word results in specified data element locations of the destination register.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification