Patent search ap:("Intel Corporation") AND inv:"Venkateswara Madduri" Page 2

11.

Granted invention patent
Fixed point to floating point conversion [Status: in force]

Publication (announcement) number: US10656942B2

Publication (announcement) date: 2020-05-19

Application number: US16291245

Application date: 2019-03-04

Applicant: Intel Corporation

Inventor: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Mark Charney

IPC: G06F5/00, G06F9/30, G06F7/483, H03M7/24

Abstract: Embodiments of instructions and methods of execution of said instructions and resources to execute said instructions are detailed. For example, in an embodiment, a processor comprising: decode circuitry to decode an instruction having fields for an opcode, a packed data source operand identifier, and a packed data destination operand identifier; and execution circuitry to execute the decoded instruction to convert a data element from a least significant packed data element position of the identified packed data source operand from a fixed-point representation to a floating point representation, store the floating point representation into a 32-bit least significant packed data element position of the identified packed data destination operand, and zero all remaining packed data elements of the identified packed data destination operand is described.
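The behaviour this abstract describes can be sketched in software roughly as follows. This is only an illustration, not the patented implementation: the abstract does not state the fixed-point format, so the Q16.16 layout (`frac_bits=16`), the four-lane destination, and the name `fixed_to_float_scalar` are all assumptions made for the example.

```python
import struct

def fixed_to_float_scalar(src, frac_bits=16, dst_lanes=4):
    """Convert lane 0 of `src` from fixed point to float32; zero the other lanes.

    Assumptions (not from the abstract): Q16.16 input, 4 destination lanes.
    """
    raw = src[0]                       # least significant packed element
    value = raw / (1 << frac_bits)     # scale the integer by 2**-frac_bits
    # Round to single precision by packing and unpacking a 32-bit float.
    f32 = struct.unpack("<f", struct.pack("<f", value))[0]
    # Lane 0 holds the result; every remaining lane is zeroed.
    return [f32] + [0.0] * (dst_lanes - 1)
```

Only the first source element influences the result; the other source lanes are ignored and the other destination lanes are cleared, which mirrors the "zero all remaining packed data elements" wording.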

12.

Invention patent application
APPARATUS AND METHOD FOR VECTOR MULTIPLY AND ACCUMULATE OF SIGNED DOUBLEWORDS [Status: under examination, published]

Publication (announcement) number: US20190196827A1

Publication (announcement) date: 2019-06-27

Application number: US15850180

Application date: 2017-12-21

Applicant: Intel Corporation

Inventor: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark Charney, Jesus Corbal

IPC: G06F9/30, G06F17/16

Abstract: An apparatus and method for performing signed multiplication of packed signed doublewords and accumulation with a signed quadword. For example, one embodiment of a processor comprises: a first source register to store a first plurality of packed signed doubleword data elements; a second source register to store a second plurality of packed signed doubleword data elements; a third source register to store a plurality of packed signed quadword data elements; execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to multiply first and second packed signed doubleword data elements from the first source register with third and fourth packed signed doubleword data elements from the second source register, respectively, to generate first and second temporary signed quadword products, the multiplier circuitry to select the first, second, third, and fourth signed doubleword data elements based on the opcode of the instruction; accumulation circuitry to combine the first temporary signed quadword product with a first packed signed quadword value read from the third source register to generate a first accumulated signed quadword result and to combine the second temporary signed quadword product with a second packed signed quadword value read from the third source register to generate a second accumulated signed quadword result; a destination register or the third source register to store the first accumulated signed quadword result in a first signed quadword data element position and to store the second accumulated signed quadword result in a second signed quadword data element position.
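A minimal software model of the multiply-and-accumulate data flow in this abstract might look like the sketch below. The lane choice `(0, 2)` stands in for "selected based on the opcode", and 64-bit wraparound on overflow is assumed because the abstract does not say whether results saturate; `mac_signed_dwords` and `to_signed` are hypothetical names.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def mac_signed_dwords(src1, src2, acc, lanes=(0, 2)):
    """Multiply selected signed doublewords and add signed quadword accumulators.

    Assumptions (not from the abstract): lanes 0 and 2 are selected, and the
    accumulated result wraps modulo 2**64 instead of saturating.
    """
    out = []
    for pos, lane in enumerate(lanes):
        a = to_signed(src1[lane], 32)
        b = to_signed(src2[lane], 32)
        product = a * b                          # always fits in a signed quadword
        total = product + to_signed(acc[pos], 64)
        out.append(to_signed(total, 64))         # wrap to 64 bits
    return out
```

For example, with `0xFFFFFFFE` read as -2, `mac_signed_dwords([3, 0, 0xFFFFFFFE, 0], [4, 0, 5, 0], [10, -1])` yields `[22, -11]`.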

13.

Invention patent application
FIXED POINT TO FLOATING POINT CONVERSION [Status: under examination, published]

Publication (announcement) number: US20190196818A1

Publication (announcement) date: 2019-06-27

Application number: US16291245

Application date: 2019-03-04

Applicant: Intel Corporation

Inventor: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Mark Charney

IPC: G06F9/30, G06F7/483

CPC classification number: G06F9/30025, G06F7/483, H03M7/24

Abstract: Embodiments of instructions and methods of execution of said instructions and resources to execute said instructions are detailed. For example, in an embodiment, a processor comprising: decode circuitry to decode an instruction having fields for an opcode, a packed data source operand identifier, and a packed data destination operand identifier; and execution circuitry to execute the decoded instruction to convert a data element from a least significant packed data element position of the identified packed data source operand from a fixed-point representation to a floating point representation, store the floating point representation into a 32-bit least significant packed data element position of the identified packed data destination operand, and zero all remaining packed data elements of the identified packed data destination operand is described.

14.

Granted invention patent
Apparatus and method for performing dual signed and unsigned multiplication of packed data elements [Status: in force]

Publication (announcement) number: US11809867B2

Publication (announcement) date: 2023-11-07

Application number: US17027230

Application date: 2020-09-21

Applicant: Intel Corporation

Inventor: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Mark Charney, Robert Valentine, Binwei Yang

IPC: G06F9/30, G06F7/00

CPC classification number: G06F9/3001, G06F7/00, G06F9/30014, G06F9/3016, G06F9/30036

Abstract: An apparatus and method for performing dual concurrent multiplications of packed data elements. For example one embodiment of a processor comprises: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed byte data elements; a second source register to store a second plurality of packed byte data elements; execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to concurrently multiply each of the packed byte data elements of the first plurality with a corresponding packed byte data element of the second plurality to generate a plurality of products; adder circuitry to add specified sets of the products to generate temporary results for each set of products; zero-extension or sign-extension circuitry to zero-extend or sign-extend the temporary result for each set to generate an extended temporary result for each set; accumulation circuitry to combine each of the extended temporary results with a selected packed data value stored in a third source register to generate a plurality of final results; and a destination register to store the plurality of final results as a plurality of packed data elements in specified data element positions.
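The pipeline in this abstract (multiply bytes, add sets of products, extend, accumulate) can be approximated with the sketch below. Several details are assumptions made for illustration only: each "specified set" is taken to be four adjacent products, both sources share one signedness, and the accumulators are 32 bits wide with wraparound. The name `byte_dot_accumulate` is hypothetical.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def byte_dot_accumulate(src1, src2, acc, signed=True, group=4):
    """Multiply bytes pairwise, sum each group of products, add an accumulator.

    Assumptions (not from the abstract): groups of 4 adjacent products,
    32-bit accumulators, wraparound on overflow.
    """
    def lane(b):
        return to_signed(b, 8) if signed else b & 0xFF

    products = [lane(a) * lane(b) for a, b in zip(src1, src2)]
    results = []
    for i, base in enumerate(range(0, len(products), group)):
        temp = sum(products[base:base + group])   # temporary result per set
        # Python integers are unbounded, so sign or zero extension is implicit;
        # only the final narrowing to 32 bits has to be modelled explicitly.
        total = temp + acc[i]
        results.append(to_signed(total, 32) if signed else total & 0xFFFFFFFF)
    return results
```

The `signed` flag stands in for the choice between the sign-extension and zero-extension paths: the byte `0xFF` contributes -1 in the signed case and 255 in the unsigned case.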

15.

Granted invention patent
Apparatus and method for performing dual signed and unsigned multiplication of packed data elements [Status: in force]

Publication (announcement) number: US11573799B2

Publication (announcement) date: 2023-02-07

Application number: US17226986

Application date: 2021-04-09

Applicant: Intel Corporation

Inventor: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Mark Charney, Robert Valentine, Jesus Corbal, Binwei Yang

IPC: G06F9/30

Abstract: An apparatus and method for performing dual concurrent multiplications of packed data elements. For example one embodiment of a processor comprises: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed doubleword data elements; a second source register to store a second plurality of packed doubleword data elements; and execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to multiply a first doubleword data element from the first source register with a second doubleword data element from the second source register to generate a first quadword product and to concurrently multiply a third doubleword data element from the first source register with a fourth doubleword data element from the second source register to generate a second quadword product; and a destination register to store the first quadword product and the second quadword product as first and second packed quadword data elements.
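As a rough illustration of the two concurrent doubleword multiplications described here, the sketch below computes both quadword products from one pair of sources. The abstract does not say which doublewords are paired, so selecting lanes 0 and 2 is an assumption, as is the `signed` flag used to model the signed and unsigned variants named in the title; `dual_multiply_dwords` is a hypothetical name.

```python
def dual_multiply_dwords(src1, src2, signed=True, lanes=(0, 2)):
    """Return two quadword products of selected doubleword lanes.

    Assumptions (not from the abstract): lanes 0 and 2 are the ones multiplied.
    """
    def lane(x):
        x &= 0xFFFFFFFF                 # keep the low 32 bits
        if signed and x & 0x80000000:   # reinterpret as two's complement
            x -= 1 << 32
        return x

    # A 32x32-bit product always fits in 64 bits, so no overflow handling is needed.
    return [lane(src1[i]) * lane(src2[i]) for i in lanes]
```

The same bit pattern gives different products under the two interpretations: `0xFFFFFFFF * 3` is -3 when signed and 12884901885 when unsigned.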

16.

Granted invention patent
Apparatus and method for performing dual signed and unsigned multiplication of packed data elements [Status: in force]

Publication (announcement) number: US10802826B2

Publication (announcement) date: 2020-10-13

Application number: US15721412

Application date: 2017-09-29

Applicant: Intel Corporation

Inventor: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Mark Charney, Robert Valentine, Binwei Yang

IPC: G06F9/30

Abstract: An apparatus and method for performing dual concurrent multiplications of packed data elements. For example one embodiment of a processor comprises: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed byte data elements; a second source register to store a second plurality of packed byte data elements; execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to concurrently multiply each of the packed byte data elements of the first plurality with a corresponding packed byte data element of the second plurality to generate a plurality of products; adder circuitry to add specified sets of the products to generate temporary results for each set of products; zero-extension or sign-extension circuitry to zero-extend or sign-extend the temporary result for each set to generate an extended temporary result for each set; accumulation circuitry to combine each of the extended temporary results with a selected packed data value stored in a third source register to generate a plurality of final results; and a destination register to store the plurality of final results as a plurality of packed data elements in specified data element positions.

17.

Granted invention patent
Apparatus and method for vector multiply and accumulate of unsigned doublewords [Status: in force]

Publication (announcement) number: US10664270B2

Publication (announcement) date: 2020-05-26

Application number: US15850412

Application date: 2017-12-21

Applicant: Intel Corporation

Inventor: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark Charney, Jesus Corbal, Venkateswara Madduri

IPC: G06F9/30, G06F7/544

Abstract: An apparatus and method for performing signed multiplication of packed signed/unsigned doublewords and accumulation with a quadword. For example, one embodiment of a processor comprises: a first source register to store a first plurality of packed doubleword data elements; a second source register to store a second plurality of packed doubleword data elements; a third source register to store a plurality of packed quadword data elements; execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to multiply first and second packed doubleword data elements from the first source register with third and fourth packed doubleword data elements from the second source register, respectively, to generate first and second temporary quadword products, the multiplier circuitry to select the first, second, third, and fourth doubleword data elements based on the opcode of the instruction; accumulation circuitry to combine the first temporary quadword product with a first packed quadword value read from the third source register to generate a first accumulated quadword result and to combine the second temporary quadword product with a second packed quadword value read from the third source register to generate a second accumulated quadword result; a destination register or the third source register to store the first accumulated quadword result in a first quadword data element position and to store the second accumulated quadword result in a second quadword data element position.

18.

Granted invention patent
Apparatus and method for vector multiply and accumulate of signed doublewords [Status: in force]

Publication (announcement) number: US10514923B2

Publication (announcement) date: 2019-12-24

Application number: US15850180

Application date: 2017-12-21

Applicant: Intel Corporation

Inventor: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark Charney, Jesus Corbal

IPC: G06F9/30, G06F17/16, G06F7/00

Abstract: An apparatus and method for performing signed multiplication of packed signed doublewords and accumulation with a signed quadword. For example, one embodiment of a processor comprises: a first source register to store a first plurality of packed signed doubleword data elements; a second source register to store a second plurality of packed signed doubleword data elements; a third source register to store a plurality of packed signed quadword data elements; execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to multiply first and second packed signed doubleword data elements from the first source register with third and fourth packed signed doubleword data elements from the second source register, respectively, to generate first and second temporary signed quadword products, the multiplier circuitry to select the first, second, third, and fourth signed doubleword data elements based on the opcode of the instruction; accumulation circuitry to combine the first temporary signed quadword product with a first packed signed quadword value read from the third source register to generate a first accumulated signed quadword result and to combine the second temporary signed quadword product with a second packed signed quadword value read from the third source register to generate a second accumulated signed quadword result; a destination register or the third source register to store the first accumulated signed quadword result in a first signed quadword data element position and to store the second accumulated signed quadword result in a second signed quadword data element position.

19.

Invention patent application
FLOATING POINT TO FIXED POINT CONVERSION [Status: under examination, published]

Publication (announcement) number: US20190199370A1

Publication (announcement) date: 2019-06-27

Application number: US16291231

Application date: 2019-03-04

Applicant: Intel Corporation

Inventor: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Mark Charney

IPC: H03M7/24, H03M7/40

Abstract: Embodiments of an instruction, its operation, and executional support for the instruction are described. In some embodiments, a processor comprises decode circuitry to decode an instruction having fields for an opcode, a packed data source operand identifier, and a packed data destination operand identifier; and execution circuitry to execute the decoded instruction to convert a single precision floating point data element of a least significant packed data element position of the identified packed data source operand to a fixed-point representation, store the fixed-point representation as 32-bit integer and a 32-bit integer exponent in the two least significant packed data element positions of the identified packed data destination operand, and zero of all remaining packed data elements of the identified packed data destination operand.
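One way to picture the integer-plus-exponent output described in this abstract is the sketch below. The abstract does not define how the value is split between the two 32-bit fields, so the choice here (a 24-bit signed significand and a power-of-two exponent such that `value == integer * 2**exponent`) is purely an assumption, as are the four-lane destination and the name `float_to_fixed_scalar`.

```python
import math
import struct

def float_to_fixed_scalar(src, dst_lanes=4):
    """Split lane 0 of `src` into (integer, exponent); zero the remaining lanes.

    Assumptions (not from the abstract): 24-bit significand scaling, 4 lanes.
    Infinities and NaNs are not handled in this sketch.
    """
    # Round the input to single precision first.
    x = struct.unpack("<f", struct.pack("<f", src[0]))[0]
    mantissa, exponent = math.frexp(x)        # x == mantissa * 2**exponent
    # A float32 significand has 24 bits, so scaling by 2**24 is exact.
    integer = int(math.ldexp(mantissa, 24))
    # The two least significant lanes carry the pair; the rest are zero.
    return [integer, exponent - 24] + [0] * (dst_lanes - 2)
```

For an input of 1.5 this gives the pair `(12582912, -23)`, and `12582912 * 2**-23` recovers 1.5 exactly.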

20.

Invention patent application
APPARATUS AND METHOD FOR VECTOR HORIZONTAL ADD OF SIGNED/UNSIGNED WORDS AND DOUBLEWORDS [Status: under examination, published]

Publication (announcement) number: US20190196823A1

Publication (announcement) date: 2019-06-27

Application number: US15850131

Application date: 2017-12-21

Applicant: Intel Corporation

Inventor: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark Charney

IPC: G06F9/30, G06F7/50

CPC classification number: G06F9/30036, G06F7/50, G06F9/3001, G06F9/30098, G06F9/30145

Abstract: An apparatus and method for performing a packed horizontal addition of words and doublewords. For example, one embodiment of a processor comprises: a decoder to decode a packed horizontal add instruction to generate a decoded packed horizontal add instruction, the packed horizontal add instruction including an opcode and operands identifying a plurality of packed words; a source register to store a first plurality of packed words; execution circuitry to execute the decoded instruction, the execution circuitry comprising: operand selection circuitry to identify first and second packed words from the source register in accordance with the operand and opcode of the packed horizontal add instruction; adder circuitry to add the first and second packed words to generate a temporary sum; a temporary storage of at least 17 bits to store the temporary sum; saturation circuitry to saturate the temporary sum if necessary to generate a final result; a destination register to store the final result as a packed result word in a designated data element position.
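The add-then-saturate step in this abstract can be modelled with the short sketch below. It assumes that the operand selection picks adjacent pairs of 16-bit words, which the abstract leaves to the opcode and operands; `horizontal_add_words` is a hypothetical name. The sum of two 16-bit values needs up to 17 bits, which is why the abstract calls for a temporary storage of at least that width.

```python
def horizontal_add_words(src, signed=True):
    """Add adjacent pairs of 16-bit words and saturate each sum to 16 bits.

    Assumption (not from the abstract): pairs are (0,1), (2,3), and so on.
    """
    def lane(w):
        w &= 0xFFFF
        return w - 0x10000 if signed and w & 0x8000 else w

    lo, hi = (-0x8000, 0x7FFF) if signed else (0, 0xFFFF)
    out = []
    for i in range(0, len(src) - 1, 2):
        temp = lane(src[i]) + lane(src[i + 1])   # temporary sum, up to 17 bits
        out.append(max(lo, min(hi, temp)))       # saturate only if out of range
    return out
```

With signed words, `0x7FFF + 1` overflows the 16-bit range and is clamped to 32767, whereas the same inputs read as unsigned give 32768 without saturation.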
