Patent search ap:("INTEL Corporation") AND inv:"Valentine Page Robert"

1.

Invention publication
SYSTEMS AND METHODS TO TRANSPOSE VECTORS ON-THE-FLY WHILE LOADING FROM MEMORY

Status: Under examination (substantive examination)

Publication (announcement) number: EP4375835A3

Publication (announcement) date: 2024-08-14

Application number: EP24169357.1

Application date: 2019-10-15

Applicant: Intel Corporation

Inventor: Heinecke, Alexander F.; Georganas, Evangelos; Hughes, Christopher J.; Sade, Raanan; Valentine, Robert

IPC: G06F9/30

CPC classification number: G06F9/30032 , G06F9/30036 , G06F9/30109 , G06F9/30038

Abstract: Disclosed embodiments relate to transposing vectors while loading from memory. In one example, a processor comprises: a register file comprising one or more vector registers; a memory interface to read a plurality of data elements from a memory; fetch circuitry to fetch an instruction; decode circuitry to decode the instruction, and execution circuitry to execute the instruction. The instruction includes a plurality of fields to indicate an opcode, a subset of the plurality of data elements to be broadcast, and locations of the plurality of data elements, the plurality of data elements arranged in a corresponding plurality of relative positions, wherein the plurality of data elements include a first group of data elements and a second group of data elements. The execution circuitry performs a permute operation and a broadcast operation in accordance with the instruction, wherein the broadcast operation is to cause the subset of the plurality of data elements to be broadcast to a plurality of the relative positions associated with a corresponding plurality of other subsets of the plurality of data elements, the subset of the plurality of data elements to replace the other corresponding subsets at the plurality of relative positions.

2.

Invention publication
SYSTEMS, METHODS, AND APPARATUSES FOR MATRIX ADD, SUBTRACT, AND MULTIPLY

Status: Under examination (substantive examination)

Publication (announcement) number: EP4336369A3

Publication (announcement) date: 2024-06-19

Application number: EP24153968.3

Application date: 2017-07-01

Applicant: Intel Corporation

Inventor: Valentine, Robert; Baum, Dan; Sperber, Zeev; Corbal, Jesus; Ould-Ahmed-Vall, ElMoustapha; Toll, Bret L.; Charney, Mark J.; Ziv, Barukh; Heinecke, Alexander; Girkar, Milind; Rubanovich, Simon

IPC: G06F9/30 , G06F7/00 , G06F9/345 , G06F9/38

CPC classification number: G06F9/30036 , G06F2212/455 (2013.01) , G06F12/0207 , G06F2212/454 (2013.01) , G06F9/3001 , G06F7/5443 , G06F9/3861 , G06F9/30014 , G06F9/3016

Abstract: Embodiments detailed herein relate to matrix operations. For example, in some embodiments, a processor comprises decode circuitry to decode an instruction having fields for an opcode, for identifying a first plurality of source vectors, for identifying a second plurality of source vectors, and for identifying a plurality of destination vectors; and execution circuitry to execute the decoded instruction to, for each data element position of each of the identified first plurality of source vectors: add a first data value at that data element position to a second data value at a corresponding data element position of a corresponding one of the identified second plurality of source vectors, and store a result of the addition into a corresponding data element position of a corresponding one of the identified plurality of destination vectors.
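The element-wise addition described in this abstract can be illustrated in software. The following is a minimal sketch, assuming each matrix is held as a list of row vectors; the function name `matrix_add` is hypothetical and the code models only the arithmetic, not the instruction encoding or the hardware.

```python
def matrix_add(src1, src2):
    """Add two equally shaped matrices, each given as a list of row vectors."""
    same_rows = len(src1) == len(src2)
    if not same_rows or any(len(a) != len(b) for a, b in zip(src1, src2)):
        raise ValueError("source matrices must have the same shape")
    # For each data element position, add the two source values and
    # place the sum at the same position of the destination.
    return [[a + b for a, b in zip(row1, row2)]
            for row1, row2 in zip(src1, src2)]


dst = matrix_add([[1, 2], [3, 4]], [[10, 20], [30, 40]])
print(dst)  # [[11, 22], [33, 44]]
```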

3.

Invention publication
SYSTEMS AND METHODS FOR PERFORMING 16-BIT FLOATING-POINT MATRIX DOT PRODUCT INSTRUCTIONS

Status: Under examination (published)

Publication (announcement) number: EP4276609A3

Publication (announcement) date: 2024-02-14

Application number: EP23200278.2

Application date: 2019-10-08

Applicant: Intel Corporation

Inventor: Heinecke, Alexander F.; Valentine, Robert; Charney, Mark J.; Sade, Raanan; Adelman, Menachem; Sperber, Zeev; Gradstein, Amit; Rubanovich, Simon

IPC: G06F9/30

Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processing unit comprises fetch circuitry to fetch an instruction, decode circuitry to decode the instruction, the instruction having a first field to specify a first storage location of a plurality of data elements corresponding to a first matrix having M rows by N columns of 32-bit single precision floating-point data elements, a second field to specify a second storage location of a plurality of data elements corresponding to a second matrix having M rows by K columns of pairs of 16-bit floating-point data elements having a bfloat16 format, and a third field to specify a third storage location of a plurality of data elements corresponding to a third matrix having K rows by N columns of pairs of 16-bit floating-point data elements having the bfloat16 format, and execution circuitry coupled with the decode circuitry, the execution circuitry to perform operations corresponding to the instruction.
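The matrix shapes named in this abstract (an M x N matrix of 32-bit floats, an M x K matrix of bfloat16 pairs, and a K x N matrix of bfloat16 pairs) suggest an accumulating dot product. The sketch below is one plausible reading, not the patented method: it assumes that both lanes of each pair are multiplied and summed into the M x N accumulator. All function names are hypothetical, and Python floats are double precision, so the intermediate rounding of real hardware is not modelled.

```python
import struct


def bf16_to_float(bits):
    """Widen a bfloat16 bit pattern (0..0xFFFF) to a Python float."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


def float_to_bf16(value):
    """Truncate a float to a bfloat16 bit pattern (no rounding, demo only)."""
    return struct.unpack("<I", struct.pack("<f", value))[0] >> 16


def tile_dot_bf16(acc, a, b):
    """acc: M x N floats; a: M x K bf16 pairs; b: K x N bf16 pairs."""
    for m in range(len(a)):
        for n in range(len(b[0])):
            total = acc[m][n]
            for k in range(len(b)):
                for lane in range(2):  # both halves of each pair (assumed)
                    x = bf16_to_float(a[m][k][lane])
                    y = bf16_to_float(b[k][n][lane])
                    total += x * y
            acc[m][n] = total
    return acc


a = [[(float_to_bf16(1.0), float_to_bf16(2.0))]]  # M=1, K=1
b = [[(float_to_bf16(3.0), float_to_bf16(4.0))]]  # K=1, N=1
print(tile_dot_bf16([[0.5]], a, b))  # [[11.5]]
```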

4.

Invention publication
INSTRUCTIONS TO CONVERT FROM FP16 TO FP8

Status: Under examination (published)

Publication (announcement) number: EP4318229A1

Publication (announcement) date: 2024-02-07

Application number: EP23189559.0

Application date: 2023-08-03

Applicant: Intel Corporation

Inventor: Heinecke, Alexander; Adelman, Menachem; Charney, Mark; Georganas, Evangelos; Gradstein, Amit; Hughes, Christopher; Mellempudi, Naveen; Rubanovich, Simon; Sherman, Uri; Sperber, Zeev; Valentine, Robert

IPC: G06F9/30

Abstract: Techniques for converting FP16 to BF8 using bias are described. An example embodiment utilizes decoder circuitry to decode a single instruction, the single instruction to include one or more fields to identify a first source operand, one or more fields to identify a second source operand, one or more fields to identify a source/destination operand, and one or more fields for an opcode, wherein the opcode is to indicate that execution circuitry is to convert packed half-precision data from the identified first and second sources to packed FP8 data using bias terms from the identified source/destination operand and store the packed FP8 data into corresponding data element positions of the identified source/destination operand; and execution circuitry to execute the decoded instruction according to the opcode to convert packed half-precision data from the identified first and second sources to packed FP8 data using bias terms from the identified source/destination operand and store the packed FP8 data into corresponding data element positions of the identified source/destination operand.

5.

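Invention publication
CONVERSION INSTRUCTIONS

Status: Under examination (published)

Publication (announcement) number: EP4202659A1

Publication (announcement) date: 2023-06-28

Application number: EP22210978.7

Application date: 2022-12-02

Applicant: Intel Corporation

Inventor: Valentine, Robert; Wong, Wing Shek; Combs, Jonathan; Charney, Mark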

IPC: G06F9/30

Abstract: Techniques for data type conversion are described. An example uses an instruction that is to include fields for an opcode, an identification of source operand location, and an identification of destination operand location, wherein the opcode is to indicate instruction processing circuitry is to convert a 16-bit floating-point value from the identified source operand location into a 32-bit floating point value and store that 32-bit floating point value in one or more data element positions of the identified destination operand.
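The abstract does not say which 16-bit floating-point format is the source. As a rough illustration only, the sketch below widens both common candidates, IEEE half precision (FP16) and bfloat16, to a wider float using Python's standard `struct` module. The function names are hypothetical.

```python
import struct


def fp16_bits_to_float(bits):
    """Widen an IEEE half-precision bit pattern to a Python float."""
    return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]


def bf16_bits_to_float(bits):
    """Widen a bfloat16 bit pattern: it is the upper half of a 32-bit float."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


print(fp16_bits_to_float(0x3C00))  # 1.0
print(fp16_bits_to_float(0xC000))  # -2.0
print(bf16_bits_to_float(0x3F80))  # 1.0
print(bf16_bits_to_float(0x4049))  # 3.140625
```

Both conversions are exact, because every FP16 and bfloat16 value is representable as a 32-bit float.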

6.

Invention publication
BFLOAT16 SQUARE ROOT AND/OR RECIPROCAL SQUARE ROOT INSTRUCTIONS

Status: Under examination (published)

Publication (announcement) number: EP4141657A1

Publication (announcement) date: 2023-03-01

Application number: EP22185990.3

Application date: 2022-07-20

Applicant: INTEL Corporation

Inventor: Adelman, Menachem; Heinecke, Alexander; Valentine, Robert; Sperber, Zeev; Gradstein, Amit; Charney, Mark; Georganas, Evangelos; Kalamkar, Dhiraj; Hughes, Christopher; Anderson, Cristina

IPC: G06F9/30

Abstract: Techniques for performing square root or reciprocal square root calculations on BF16 data elements in response to an instruction are described. An example of an instruction is one that includes fields for an opcode, an identification of a location of a packed data source operand, and an identification of a packed data destination operand, wherein the opcode is to indicate that execution circuitry is to perform, for each data element position of the packed data source operand, a calculation of a square root value of a BF16 data element in that position and store a result of each square root into a corresponding data element position of the packed data destination operand.
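The per-element square root described in this abstract can be modelled in software as follows. This is a sketch under stated assumptions: results are rounded to nearest-even when narrowed back to bfloat16, and negative or NaN inputs produce the quiet-NaN pattern `0x7FC0`. Neither choice comes from the abstract, and the function names are hypothetical.

```python
import math
import struct


def bf16_to_float(bits):
    """Widen a bfloat16 bit pattern to a Python float."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


def float_to_bf16(value):
    """Narrow a non-NaN float to bfloat16, round-to-nearest-even (assumed)."""
    bits32 = struct.unpack("<I", struct.pack("<f", value))[0]
    rounding = 0x7FFF + ((bits32 >> 16) & 1)
    return ((bits32 + rounding) >> 16) & 0xFFFF


def sqrt_bf16_packed(src):
    """Square root of each bfloat16 element, written to the same position."""
    dst = []
    for bits in src:
        x = bf16_to_float(bits)
        if math.isnan(x) or x < 0.0:
            dst.append(0x7FC0)  # quiet NaN (assumed behaviour)
        else:
            dst.append(float_to_bf16(math.sqrt(x)))
    return dst


# 4.0, 9.0 and 2.0 as bfloat16 bit patterns
print([hex(v) for v in sqrt_bf16_packed([0x4080, 0x4110, 0x4000])])
# ['0x4000', '0x4040', '0x3fb5']
```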

7.

Invention publication
INSTRUCTIONS TO CONVERT FROM FP16 TO BF8

Status: Under examination (published)

Publication (announcement) number: EP4020178A1

Publication (announcement) date: 2022-06-29

Application number: EP21198429.9

Application date: 2021-09-23

Applicant: INTEL Corporation

Inventor: Heinecke, Alexander; Mellempudi, Naveen; Valentine, Robert; Charney, Mark; Hughes, Christopher; Georganas, Evangelos; Sperber, Zeev; Gradstein, Amit; Rubanovich, Simon

IPC: G06F9/30

Abstract: Techniques for converting FP16 to BF8 using bias are described. An exemplary embodiment utilizes decoder circuitry to decode a single instruction, the single instruction to include one or more fields to identify a first source operand, one or more fields to identify a second source operand, one or more fields to identify a source/destination operand, and one or more fields for an opcode, wherein the opcode is to indicate that execution circuitry is to convert packed half-precision data from the identified first and second sources to packed bfloat8 data using bias terms from the identified source/destination operand and store the packed bfloat8 data into corresponding data element positions of the identified source/destination operand; and execution circuitry to execute the decoded instruction according to the opcode to convert packed half-precision data from the identified first and second sources to packed bfloat8 data using bias terms from the identified source/destination operand and store the packed bfloat8 data into corresponding data element positions of the identified source/destination operand.
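The abstract does not spell out how the bias terms are applied. One plausible reading, sketched below purely as an assumption, treats BF8 as a 1-5-2 format (sign, exponent, mantissa) that shares FP16's exponent width, adds an 8-bit bias to the low bits of the FP16 magnitude, and keeps the upper byte. The function names, the NaN handling, and the ordering of the two sources in the result are all hypothetical.

```python
def fp16_to_bf8_bias(fp16_bits, bias):
    """Convert one FP16 bit pattern to an 8-bit 1-5-2 pattern using a bias."""
    sign = (fp16_bits & 0x8000) >> 8
    mag = fp16_bits & 0x7FFF
    if mag >= 0x7C00:
        # Infinity or NaN: keep the exponent and force a mantissa bit for
        # NaN so that it does not collapse to infinity (assumed).
        top = (mag >> 8) | (0x02 if mag > 0x7C00 else 0)
    else:
        # Assumed rule: add the bias to the discarded bits, then truncate.
        top = (mag + (bias & 0xFF)) >> 8
    return sign | top


def convert_packed(src1, src2, bias_terms):
    """Convert both sources element by element; src1 first (assumed order)."""
    elements = list(src1) + list(src2)
    return [fp16_to_bf8_bias(x, b) for x, b in zip(elements, bias_terms)]


out = convert_packed([0x3C00, 0x3C80], [0x3C80, 0xBC00],
                     [0x00, 0x80, 0x7F, 0x00])
print([hex(v) for v in out])  # ['0x3c', '0x3d', '0x3c', '0xbc']
```

With a bias of `0x80` this rule rounds half up, while a random bias would give stochastic rounding; the abstract does not say which use is intended.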

8.

Invention publication
SYSTEMS, APPARATUSES, AND METHODS FOR FUSED MULTIPLY ADD

Status: Under examination (published)

Publication (announcement) number: EP3989062A1

Publication (announcement) date: 2022-04-27

Application number: EP21207387.8

Application date: 2016-10-20

Applicant: INTEL Corporation

Inventor: Valentine, Robert; Ryvchin, Galina; Majcher, Piotr; Charney, Mark J.; Ould-Ahmed-Vall, ElMoustapha; Corbal, Jesus; Girkar, Milind B.; Sperber, Zeev; Rubanovich, Simon; Gradstein, Amit

IPC: G06F9/30 , G06F15/76

Abstract: In some embodiments, a single instruction is provided that has an opcode, a first field to represent a packed data source/destination operand, a second field to represent a first packed data source operand, and a third field to represent a second packed data source operand. Packed data elements of the first and second packed data source operands are of a first size and packed data elements of the packed data source/destination operand are of a second size greater than the first size. In response to the single instruction, execution circuitry of an apparatus, according to the opcode of the single instruction, for each packed data element position of the packed data source/destination operand is configured to: sign extend a plurality of packed signed data words from a corresponding packed data element position of the first packed data source operand; sign extend a plurality of packed signed data words from a corresponding packed data element position of the second packed data source operand; multiply each of the plurality of sign extended packed signed data words from a corresponding packed data element position of the first packed data source operand with a corresponding one of the plurality of sign extended packed signed data words from a corresponding packed data element position of the second packed data source operand to result in a plurality of results; add the plurality of results with a packed data element of the second size of a corresponding packed data element position of the packed data source/destination operand to result in an addition result, and saturate the addition result to result in a saturated addition result if a width of the addition result exceeds a width of the second size; and store the addition result or the saturated addition result in the corresponding packed data element position of the packed data source/destination operand.
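The sequence in this abstract (sign extend, multiply, add to the accumulator, saturate) can be sketched for a single destination element position. The sketch assumes 16-bit source words and a 32-bit source/destination element; the abstract only requires that the second size be greater than the first. The function names are hypothetical.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def fma_words(acc, a_words, b_words):
    """Multiply signed word pairs, add to acc, and saturate to 32 bits."""
    total = sign_extend(acc, 32)
    for a, b in zip(a_words, b_words):
        total += sign_extend(a, 16) * sign_extend(b, 16)
    # Saturate when the sum no longer fits the destination element width.
    return max(INT32_MIN, min(INT32_MAX, total))


print(fma_words(10, [0xFFFF, 2], [3, 4]))  # 15, since 10 + (-1*3) + (2*4)
print(fma_words(INT32_MAX, [0x7FFF, 0x7FFF], [0x7FFF, 0x7FFF]))  # 2147483647
```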

9.

Invention publication
SYSTEMS, APPARATUSES, AND METHODS FOR FUSED MULTIPLY ADD

Status: Under examination (published)

Publication (announcement) number: EP3971709A1

Publication (announcement) date: 2022-03-23

Application number: EP21207379.5

Application date: 2016-10-20

Applicant: INTEL Corporation

Inventor: Valentine, Robert; Ryvchin, Galina; Majcher, Piotr; Charney, Mark J.; Ould-Ahmed-Vall, ElMoustapha; Corbal, Jesus; Girkar, Milind B.; Sperber, Zeev; Rubanovich, Simon; Gradstein, Amit

IPC: G06F9/30 , G06F15/76

Abstract: In some embodiments, a single instruction is provided that has an opcode, a first field to represent a packed data source/destination operand, a second field to represent a first packed data source operand, and a third field to represent a second packed data source operand. Packed data elements of the first and second packed data source operands are of a first size and packed data elements of the packed data source/destination operand are of a second size greater than the first size. In response to the single instruction, execution circuitry of an apparatus, according to the opcode of the single instruction, for each packed data element position of the packed data source/destination operand is configured to: sign extend a plurality of packed signed data bytes from a corresponding packed data element position of the first packed data source operand; zero extend a plurality of packed unsigned data bytes from a corresponding packed data element position of the second packed data source operand; multiply each of the sign extended plurality of packed signed data bytes from the first packed data source operand with a corresponding one of the zero extended plurality of packed unsigned data bytes from the second packed data source operand to result in a plurality of results; add the plurality of results with a packed data element of the second size of a corresponding packed data element position of the packed data source/destination operand to result in an addition result, and saturate the addition result to result in a saturated addition result if a width of the addition result exceeds a width of the second size; and store the addition result or the saturated addition result in the corresponding packed data element position of the packed data source/destination operand.
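This variant mixes signedness: bytes from the first source are sign extended and bytes from the second source are zero extended before multiplication. The sketch below models one destination lane, assuming four bytes per 32-bit lane with lanes passed as raw 32-bit integers. The lane width, byte count, and function names are illustrative assumptions.

```python
def unpack_bytes(lane32):
    """Split a 32-bit lane into four bytes, least significant first."""
    return [(lane32 >> (8 * i)) & 0xFF for i in range(4)]


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def fma_bytes_mixed(acc32, signed_lane, unsigned_lane):
    """acc + sum(signed byte * unsigned byte), saturated to signed 32 bits."""
    total = to_signed(acc32, 32)
    pairs = zip(unpack_bytes(signed_lane), unpack_bytes(unsigned_lane))
    for s, u in pairs:
        total += to_signed(s, 8) * u  # u is already zero extended
    total = max(-(1 << 31), min((1 << 31) - 1, total))
    return total & 0xFFFFFFFF  # return the raw 32-bit lane pattern


print(fma_bytes_mixed(5, 0x01020304, 0x01010101))       # 15
print(hex(fma_bytes_mixed(0, 0x000000FF, 0x000000FF)))  # 0xffffff01
```

In the second call the byte `0xFF` is read as -1 from the first source and as 255 from the second, which gives -255.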

10.

Invention publication
LOADING AND STORING MATRIX DATA WITH DATATYPE CONVERSION

Status: Under examination (published)

Publication (announcement) number: EP3929734A1

Publication (announcement) date: 2021-12-29

Application number: EP20209949.5

Application date: 2020-11-26

Applicant: INTEL Corporation

Inventor: Adelman, Menachem; Valentine, Robert; Stupp, Gideon; Pollak, Yaroslav; Gradstein, Amit; Rubanovich, Simon; Sperber, Zeev; Charney, Mark J.; Hughes, Christopher J.; Heinecke, Alexander F.; Georganas, Evangelos

IPC: G06F9/30

Abstract: Embodiments for loading and storing matrix data with datatype conversion are disclosed. In an embodiment, a processor includes a decoder and execution circuitry. The decoder is to decode an instruction having a format including an opcode field to specify an opcode, a first destination operand field to specify a first destination matrix location, and a first source operand field to specify a first source matrix location. The execution circuitry is to, in response to the decoded instruction, convert data elements from a plurality of source element locations of a first source matrix specified by the first source matrix location from a first datatype to a second datatype to generate a plurality of converted data elements and to store each of the plurality of converted data elements in one of a plurality of destination element locations in a first destination matrix specified by the first destination matrix location.
