Patent search ap:("Intel Corporation") AND inv:"Mark Charney" Page 4

31.

Invention application
APPARATUS AND METHOD FOR VECTOR MULTIPLY AND ACCUMULATE OF SIGNED DOUBLEWORDS Pending - published

Publication (announcement) No.: US20190196827A1

Publication (announcement) date: 2019-06-27

Application No.: US15850180

Application date: 2017-12-21

Applicant: Intel Corporation

Inventor: Venkateswara Madduri , Elmoustapha Ould-Ahmed-Vall , Robert Valentine , Mark Charney , Jesus Corbal

IPC: G06F9/30 , G06F17/16

Abstract: An apparatus and method for performing signed multiplication of packed signed doublewords and accumulation with a signed quadword. For example, one embodiment of a processor comprises: a first source register to store a first plurality of packed signed doubleword data elements; a second source register to store a second plurality of packed signed doubleword data elements; a third source register to store a plurality of packed signed quadword data elements; execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to multiply first and second packed signed doubleword data elements from the first source register with third and fourth packed signed doubleword data elements from the second source register, respectively, to generate first and second temporary signed quadword products, the multiplier circuitry to select the first, second, third, and fourth signed doubleword data elements based on the opcode of the instruction; accumulation circuitry to combine the first temporary signed quadword product with a first packed signed quadword value read from the third source register to generate a first accumulated signed quadword result and to combine the second temporary signed quadword product with a second packed signed quadword value read from the third source register to generate a second accumulated signed quadword result; a destination register or the third source register to store the first accumulated signed quadword result in a first signed quadword data element position and to store the second accumulated signed quadword result in a second signed quadword data element position.
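The multiply-and-accumulate data flow in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the `use_high` flag (a hypothetical stand-in for the opcode-based element selection), and the wraparound behavior on 64-bit overflow are all assumptions made for illustration.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mac_signed_dwords(src1, src2, acc, use_high=False):
    """Multiply selected signed doublewords and accumulate into quadwords.

    src1, src2: lists of signed 32-bit integers, two per quadword lane.
    acc:        list of signed 64-bit integers (the third source).
    use_high:   hypothetical selector; picks the odd-indexed doubleword of
                each lane instead of the even-indexed one.
    Overflow wraps modulo 2**64 (an assumption; saturation is also plausible).
    """
    offset = 1 if use_high else 0
    result = []
    for lane, accumulator in enumerate(acc):
        # A 32x32-bit signed product always fits in a signed quadword.
        product = src1[2 * lane + offset] * src2[2 * lane + offset]
        result.append(to_signed(accumulator + product, 64))
    return result


low = mac_signed_dwords([3, -4, 5, 6], [7, 8, -2, 9], [10, 20])
high = mac_signed_dwords([3, -4, 5, 6], [7, 8, -2, 9], [10, 20], use_high=True)
```

Here `low` uses elements 0 and 2 of each source and `high` uses elements 1 and 3, mirroring the idea that the opcode decides which doublewords feed the multipliers.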

32.

Invention application
FIXED POINT TO FLOATING POINT CONVERSION Pending - published

Publication (announcement) No.: US20190196818A1

Publication (announcement) date: 2019-06-27

Application No.: US16291245

Application date: 2019-03-04

Applicant: Intel Corporation

Inventor: Venkateswara Madduri , Elmoustapha Ould-Ahmed-Vall , Robert Valentine , Jesus Corbal , Mark Charney

IPC: G06F9/30 , G06F7/483

CPC classification number: G06F9/30025 , G06F7/483 , H03M7/24

Abstract: Embodiments of instructions and methods of execution of said instructions and resources to execute said instructions are detailed. For example, in an embodiment, a processor comprising: decode circuitry to decode an instruction having fields for an opcode, a packed data source operand identifier, and a packed data destination operand identifier; and execution circuitry to execute the decoded instruction to convert a data element from a least significant packed data element position of the identified packed data source operand from a fixed-point representation to a floating point representation, store the floating point representation into a 32-bit least significant packed data element position of the identified packed data destination operand, and zero all remaining packed data elements of the identified packed data destination operand is described.
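As a rough illustration of the behavior described above, the following Python sketch converts only the lowest element and zeroes the rest. The abstract does not state the fixed-point layout, so the signed raw value with `frac_bits` fractional bits, the `dest_len` parameter, and the function name are hypothetical choices.

```python
import struct


def fixed_to_float_low(src, frac_bits=16, dest_len=4):
    """Convert the least significant fixed-point element to single precision.

    src:       list of signed integers holding raw fixed-point values.
    frac_bits: assumed number of fractional bits (not given in the abstract).
    dest_len:  assumed number of 32-bit elements in the destination.
    Returns dest_len floats: element 0 is the converted value, the rest 0.0.
    """
    value = src[0] / (1 << frac_bits)
    # Round-trip through the 'f' format to round to IEEE 754 single precision.
    single = struct.unpack('<f', struct.pack('<f', value))[0]
    return [single] + [0.0] * (dest_len - 1)


converted = fixed_to_float_low([0x18000, 7, 7, 7])
```

With 16 fractional bits, the raw value 0x18000 represents 1.5, and the other source elements are ignored.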

33.

Invention application
APPARATUS AND METHOD FOR CONVERTING A FLOATING-POINT VALUE FROM HALF PRECISION TO SINGLE PRECISION Pending - published

Publication (announcement) No.: US20190163474A1

Publication (announcement) date: 2019-05-30

Application No.: US15824339

Application date: 2017-11-28

Applicant: Intel Corporation

Inventor: Robert Valentine , Mark Charney , Raanan Sade , Elmoustapha Ould-Ahmed-Vall , Jesus Corbal

IPC: G06F9/30

Abstract: An embodiment of the invention is a processor including execution circuitry to, in response to a decoded instruction, convert a half-precision floating-point value to a single-precision floating-point value and store the single-precision floating-point value in each of the plurality of element locations of a destination register. The processor also includes a decoder and the destination register. The decoder is to decode an instruction to generate the decoded instruction.
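The convert-and-broadcast behavior can be sketched in Python using the standard `struct` module, whose `e` format is IEEE 754 half precision. The function name and the `num_elements` parameter are hypothetical; the abstract does not fix the destination width.

```python
import struct


def half_to_single_broadcast(half_bits, num_elements=8):
    """Widen one half-precision value and copy it to every element.

    half_bits:    16-bit pattern of the half-precision source value.
    num_elements: assumed number of 32-bit elements in the destination.
    Returns a list of 32-bit patterns, all holding the widened value.
    """
    # Reinterpret the 16-bit pattern as a half-precision float.
    half = struct.unpack('<e', struct.pack('<H', half_bits))[0]
    # Every half-precision value is exactly representable in single precision.
    single_bits = struct.unpack('<I', struct.pack('<f', half))[0]
    return [single_bits] * num_elements


lanes = half_to_single_broadcast(0x3C00, num_elements=4)  # 0x3C00 is 1.0
```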

34.

Invention application
SYSTEMS, APPARATUSES, AND METHODS FOR MULTIPLICATION AND ACCUMULATION OF VECTOR PACKED SIGNED VALUES Pending - published

Publication (announcement) No.: US20190102198A1

Publication (announcement) date: 2019-04-04

Application No.: US15721616

Application date: 2017-09-29

Applicant: Intel Corporation

Inventor: Venkateswara R. Madduri , Elmoustapha Ould-Ahmed-Vall , Robert Valentine , Jesus Corbal , Mark Charney

IPC: G06F9/30

Abstract: Embodiments of systems, apparatuses, and methods for multiplication and accumulation of signed data values in a processor are described. For example, execution circuitry executes a decoded instruction to multiply selected signed data values from a plurality of packed data element positions in first and second packed data source operands to generate a plurality of first signed result values, sum the plurality of first signed result values to generate one or more second signed result values, accumulate the one or more signed result values with one or more data values from a destination operand to generate one or more third signed result values, and store the one or more third signed result values in one or more packed data element positions in the destination operand.

35.

Invention application
SYSTEMS, APPARATUSES, AND METHODS FOR MULTIPLICATION, NEGATION, AND ACCUMULATION OF VECTOR PACKED SIGNED VALUES Pending - published

Publication (announcement) No.: US20190102185A1

Publication (announcement) date: 2019-04-04

Application No.: US15721599

Application date: 2017-09-29

Applicant: Intel Corporation

Inventor: Venkateswara R. Madduri , Elmoustapha Ould-Ahmed-Vall , Robert Valentine , Jesus Corbal , Mark Charney

IPC: G06F9/30 , G06F7/48 , G06F7/544 , G06F17/16

Abstract: Embodiments of systems, apparatuses, and methods for multiplication, negation, and accumulation of data values in a processor are described. For example, execution circuitry executes a decoded instruction to multiply selected data values from a plurality of packed data element positions in first and second packed data source operands to generate a plurality of first result values, sum the plurality of first result values to generate one or more second result values, negate the one or more second result values to generate one or more third result values, accumulate the one or more third result values with one or more data values from the destination operand to generate one or more fourth result values, and store the one or more third result values in one or more packed data element positions in the destination operand.

36.

Invention grant
BFLOAT16 fused multiply instructions In force

Publication (announcement) No.: US12229554B2

Publication (announcement) date: 2025-02-18

Application No.: US17463405

Application date: 2021-08-31

Applicant: Intel Corporation

Inventor: Alexander Heinecke , Menachem Adelman , Robert Valentine , Zeev Sperber , Amit Gradstein , Mark Charney , Evangelos Georganas , Dhiraj Kalamkar , Christopher Hughes , Cristina Anderson

IPC: G06F9/30 , G06F7/544

Abstract: Techniques for performing BF16 FMA in response to an instruction are described. In some examples, an instruction has fields for an opcode, an identification of location of a packed data source/destination operand (a first source), an identification of a location of a second packed data source operand, an identification of a location of a third packed data source operand, and an identification of location of a packed data source/destination operand, wherein the opcode is to indicate operand ordering and that execution circuitry is to, per data element position, perform a BF16 value fused multiply-accumulate operation using the first, second, and third source operands and store a result in a corresponding data element position of the source/destination operand.
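A per-element BF16 fused multiply-accumulate can be modeled in Python with exact rational arithmetic followed by a single rounding step. This is only a sketch: the function names are hypothetical, round-to-nearest-even is an assumed rounding mode, only finite inputs are handled, and the operand order (`a * b + c`) is left to the caller rather than derived from an opcode.

```python
import struct
from fractions import Fraction


def bf16_to_float(bits):
    """Expand a BF16 pattern to a float (BF16 is the top half of float32)."""
    return struct.unpack('<f', struct.pack('<I', (bits & 0xFFFF) << 16))[0]


def round_to_bf16(x):
    """Round an exact Fraction to the nearest BF16 pattern, ties to even."""
    if x == 0:
        return 0
    sign = 0x8000 if x < 0 else 0
    mag = abs(x)
    # Find e such that 2**e <= mag < 2**(e + 1).
    e = mag.numerator.bit_length() - mag.denominator.bit_length()
    if Fraction(2) ** e > mag:
        e -= 1
    e = max(e, -126)  # subnormals share the smallest normal exponent
    quantum = Fraction(2) ** (e - 7)  # spacing of BF16 values at this exponent
    value = round(mag / quantum) * quantum  # round() on Fraction is half-even
    if value >= Fraction(2) ** 128:
        return sign | 0x7F80  # overflow to infinity
    bits32 = struct.unpack('<I', struct.pack('<f', float(value)))[0]
    return sign | (bits32 >> 16)


def bf16_fma(a_bits, b_bits, c_bits):
    """Compute a * b + c exactly, then round once to BF16 (finite inputs)."""
    a, b, c = (Fraction(bf16_to_float(v)) for v in (a_bits, b_bits, c_bits))
    return round_to_bf16(a * b + c)


def packed_bf16_fma(src1, src2, src3):
    """Apply bf16_fma at each data element position of three packed sources."""
    return [bf16_fma(a, b, c) for a, b, c in zip(src1, src2, src3)]


# 2.0 * 3.0 + 1.0 and 1.5 * 1.5 + 0.0, as BF16 bit patterns
packed = packed_bf16_fma([0x4000, 0x3FC0], [0x4040, 0x3FC0], [0x3F80, 0x0000])
```

Keeping the product and sum exact until the final rounding is what distinguishes a fused operation from a separate multiply and add, each of which would round on its own.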

37.

Invention grant
Dual sum of quadword 16×16 multiply and accumulate In force

Publication (announcement) No.: US12204903B2

Publication (announcement) date: 2025-01-21

Application No.: US17359522

Application date: 2021-06-26

Applicant: Intel Corporation

Inventor: Venkateswara Madduri , Cristina Anderson , Robert Valentine , Mark Charney , Vedvyas Shanbhogue

IPC: G06F9/30

Abstract: Techniques for matrix multiplication are described. In some examples, a single instruction having a format of fields for an opcode, one or more fields to indicate a location of a source/destination operand, one or more fields to indicate a location of a first source operand, and one or more fields to indicate a location of a second source operand is used. Wherein the opcode is to indicate that execution circuitry is to: multiply values from corresponding data elements of the first and second sources, add a first subset of the multiplied values to a first value from the source/destination operand and store in a first data element position of the source/destination operand, and add a second subset of the multiplied values to a second value from the source/destination operand and store in a second data element position of the source/destination operand.

38.

Invention publication
INSTRUCTIONS TO CONVERT FROM FP16 TO FP8 Pending - published

Publication (announcement) No.: US20240045684A1

Publication (announcement) date: 2024-02-08

Application No.: US17958380

Application date: 2022-10-01

Applicant: Intel Corporation

Inventor: Alexander Heinecke , Menachem Adelman , Mark Charney , Evangelos Georganas , Amit Gradstein , Christopher Hughes , Naveen Mellempudi , Simon Rubanovich , Uri Sherman , Zeev Sperber , Robert Valentine

IPC: G06F9/30

CPC classification number: G06F9/30145 , G06F9/30036 , G06F9/30018

Abstract: Techniques for converting FP16 to BF8 using bias are described. An example embodiment utilizes decoder circuitry to decode a single instruction, the single instruction to include one or more fields to identify a first source operand, one or more fields to identify a second source operand, one or more fields to identify a source/destination operand, and one or more fields for an opcode, wherein the opcode is to indicate that execution circuitry is to convert packed half-precision data from the identified first and second sources to packed FP8 data using bias terms from the identified source/destination operand and store the packed FP8 data into corresponding data element positions of the identified source/destination operand; and execution circuitry to execute the decoded instruction according to the opcode to convert packed half-precision data from the identified first and second sources to packed FP8 data using bias terms from the identified source/destination operand and store the packed FP8 data into corresponding data element positions of the identified source/destination operand.

39.

Invention grant
Apparatus and method for vector multiply and accumulate of packed bytes In force

Publication (announcement) No.: US11768681B2

Publication (announcement) date: 2023-09-26

Application No.: US15879419

Application date: 2018-01-24

Applicant: Intel Corporation

Inventor: Alexander Heinecke , Dipankar Das , Robert Valentine , Mark Charney

IPC: G06F9/30 , G06F9/38

CPC classification number: G06F9/3001 , G06F9/3013 , G06F9/30014 , G06F9/3016 , G06F9/30018 , G06F9/30036 , G06F9/3893

Abstract: An apparatus and method for performing multiply-accumulate operations. For example, one embodiment of a processor comprises: a decoder to decode instructions; a first source register to store a first plurality of packed bytes; a second source register to store a second plurality of packed bytes; a third source register to store a plurality of packed doublewords; execution circuitry to execute a first instruction, the execution circuitry comprising: extension circuitry to sign-extend or zero-extend the first and second plurality of packed bytes to generate a first and second plurality of words corresponding to the first and second plurality of packed bytes; multiplier circuitry to multiply each of the first plurality of words with a corresponding one of the second plurality of words to generate a plurality of temporary products; adder circuitry to add at least a first set of the temporary products to generate a first temporary sum; accumulation circuitry to combine the first temporary sum with a first packed doubleword value from a first doubleword location in the third source register to generate a first accumulated doubleword result; a destination register to store the first accumulated doubleword result in the first doubleword location.
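The extend, multiply, add, and accumulate stages listed in this abstract map onto a short Python sketch. Several details are assumptions rather than facts from the patent: four byte products per doubleword lane, the `signed1`/`signed2` flags choosing sign- or zero-extension, wraparound on 32-bit overflow, and all function names.

```python
def extend_byte(byte, signed):
    """Sign-extend or zero-extend an 8-bit value to a wider integer."""
    byte &= 0xFF
    return byte - 0x100 if signed and byte >= 0x80 else byte


def wrap_int32(value):
    """Wrap an integer to the signed 32-bit range (assumed overflow rule)."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


def mac_packed_bytes(src1, src2, acc, signed1=False, signed2=True):
    """Multiply packed bytes pairwise and accumulate sums into doublewords.

    src1, src2: lists of byte values, four per doubleword lane (assumed).
    acc:        list of signed 32-bit accumulators (the third source).
    """
    result = []
    for lane, accumulator in enumerate(acc):
        # Sum the four extended-byte products that belong to this lane.
        total = sum(
            extend_byte(src1[i], signed1) * extend_byte(src2[i], signed2)
            for i in range(4 * lane, 4 * lane + 4)
        )
        result.append(wrap_int32(accumulator + total))
    return result


mixed = mac_packed_bytes([1, 2, 3, 4, 255, 0, 0, 0],
                         [1, 1, 1, 1, 0xFF, 0, 0, 0], [10, 0])
```

In `mixed`, the byte 0xFF is read as 255 in the first source and as -1 in the second, which shows why the choice of extension matters for the accumulated result.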

40.

Invention grant
Vector friendly instruction format and execution thereof In force

Publication (announcement) No.: US11740904B2

Publication (announcement) date: 2023-08-29

Application No.: US17524624

Application date: 2021-11-11

Applicant: Intel Corporation

Inventor: Robert C. Valentine , Jesus Corbal San Adrian , Roger Espasa Sans , Robert D. Cavin , Bret L. Toll , Santiago Galan Duran , Jeffrey G. Wiedemeier , Sridhar Samudrala , Milind Baburao Girkar , Edward Thomas Grochowski , Jonathan Cannon Hall , Dennis R. Bradford , Elmoustapha Ould-Ahmed-Vall , James C Abel , Mark Charney , Seth Abraham , Suleyman Sair , Andrew Thomas Forsyth , Lisa Wu , Charles Yount

IPC: G06F9/30 , G06F9/34 , H01L29/78 , H01L29/66 , H01L29/786 , H01L29/775

CPC classification number: G06F9/30145 , G06F9/3001 , G06F9/30014 , G06F9/30025 , G06F9/30032 , G06F9/30036 , G06F9/30047 , G06F9/30149 , G06F9/30181 , G06F9/30185 , G06F9/30192 , G06F9/34 , H01L29/66553 , H01L29/775 , H01L29/7831 , H01L29/78696 , G06F9/30018 , H01L29/66

Abstract: A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.
