Patent search ap:("Intel Corporation") AND inv:"Menachem Adelman" Page 1

1.

发明授权
BFLOAT16 fused multiply instructions 有权

公开(公告)号：US12229554B2

公开(公告)日：2025-02-18

申请号：US17463405

申请日：2021-08-31

Applicant: Intel Corporation

Inventor： Alexander Heinecke , Menachem Adelman , Robert Valentine , Zeev Sperber , Amit Gradstein , Mark Charney , Evangelos Georganas , Dhiraj Kalamkar , Christopher Hughes , Cristina Anderson

IPC: G06F9/30 , G06F7/544

Abstract: Techniques for performing BF16 FMA in response to an instruction are described. In some examples, an instruction has fields for an opcode, an identification of location of a packed data source/destination operand (a first source), an identification of a location of a second packed data source operand, an identification of a location of a third packed data source operand, and an identification of location of a packed data source/destination operand, wherein the opcode is to indicate operand ordering and that execution circuitry is to, per data element position, perform a BF16 value fused multiply-accumulate operation using the first, second, and third source operands and store a result in a corresponding data element position of the source/destination operand.

2.

发明公开
MATRIX TRANSPOSE AND MULTIPLY 审中-公开

公开(公告)号：US20240329938A1

公开(公告)日：2024-10-03

申请号：US18607024

申请日：2024-03-15

Applicant: Intel Corporation

Inventor： Menachem Adelman , Robert Valentine , Barukh Ziv , Amit Gradstein , Simon Rubanovich , Zeev Sperber , Mark J. Charney , Christopher J. Hughes , Alexander F. Heinecke , Evangelos Georganas , Binh Pham

IPC: G06F7/78 , G06F9/30 , G06F17/16

CPC classification number: G06F7/78 , G06F9/3001 , G06F9/3016 , G06F17/16

Abstract: Embodiments for a matrix transpose and multiply operation are disclosed. In an embodiment, a processor includes a decoder and execution circuitry. The decoder is to decode an instruction having a format including an opcode field to specify an opcode, a first destination operand field to specify a destination matrix location, a first source operand field to specify a first source matrix location, and a second source operand field to specify a second source matrix location. The execution circuitry is to, in response to the decoded instruction, transpose the first source matrix to generate a transposed first source matrix, perform a matrix multiplication using the transposed first source matrix and the second source matrix to generate a result, and store the result in a destination matrix location.

3.

发明公开
SYSTEMS AND METHODS FOR PERFORMING 8-BIT FLOATING-POINT VECTOR DOT PRODUCT INSTRUCTIONS 审中-公开

公开(公告)号：US20240045689A1

公开(公告)日：2024-02-08

申请号：US17958377

申请日：2022-10-01

Applicant: Intel Corporation

Inventor： Alexander Heinecke , Menachem Adelman , Evangelos Georganas , Amit Gradstein , Christopher Hughes , Naveen Mellempudi , Simon Rubanovich , Uri Sherman , Zeev Sperber

IPC: G06F9/30 , G06F7/487 , G06F17/16 , G06F9/38

CPC classification number: G06F9/3016 , G06F7/4876 , G06F17/16 , G06F9/3802 , G06F9/3013 , G06F9/3001

Abstract: Disclosed embodiments relate to systems and methods for performing 8-bit floating-point vector dot product instructions. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first source, second source, and destination vectors, the opcode to indicate execution circuitry is to multiply pairs of 8-bit floating-point formatted elements of the specified first and second sources, and accumulate the resulting products with previous contents of a corresponding single-precision element of the specified destination, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.

4.

发明公开
INSTRUCTIONS TO CONVERT FROM FP16 TO FP8 审中-公开

公开(公告)号：US20240045684A1

公开(公告)日：2024-02-08

申请号：US17958380

申请日：2022-10-01

Applicant: Intel Corporation

Inventor： Alexander Heinecke , Menachem Adelman , Mark Charney , Evangelos Georganas , Amit Gradstein , Christopher Hughes , Naveen Mellempudi , Simon Rubanovich , Uri Sherman , Zeev Sperber , Robert Valentine

IPC: G06F9/30

CPC classification number: G06F9/30145 , G06F9/30036 , G06F9/30018

Abstract: Techniques for converting FP16 to BF8 using bias are described. An example embodiment utilizes decoder circuitry to decode a single instruction, the single instruction to include one or more fields to identify a first source operand, one or more fields to identify a second source operand, one or more fields to identify a source/destination operand, and one or more fields for an opcode, wherein the opcode is to indicate that execution circuitry is to convert packed half-precision data from the identified first and second sources to packed FP8 data using bias terms from the identified source/destination operand and store the packed FP8 data into corresponding data element positions of the identified source/destination operand; and execution circuitry to execute the decoded instruction according to the opcode to convert packed half-precision data from the identified first and second sources to packed FP8 data using bias terms from the identified source/destination operand and store the packed FP8 data into corresponding data element positions of the identified source/destination operand.

5.

发明授权
Systems and methods for performing 16-bit floating-point matrix dot product instructions 有权

公开(公告)号：US11614936B2

公开(公告)日：2023-03-28

申请号：US17216566

申请日：2021-03-29

Applicant: Intel Corporation

Inventor： Alexander F. Heinecke , Robert Valentine , Mark J. Charney , Raanan Sade , Menachem Adelman , Zeev Sperber , Amit Gradstein , Simon Rubanovich

IPC: G06F9/30 , G06F9/38

Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (m, n) of the specified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the specified first source matrix by a corresponding nibble of a doubleword element (K,N) of the specified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element.

6.

发明申请
APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS FOR MOVING DATA BETWEEN TILES OF A MATRIX OPERATIONS ACCELERATOR AND VECTOR REGISTERS 有权

公开(公告)号：US20210406018A1

公开(公告)日：2021-12-30

申请号：US16914347

申请日：2020-06-27

Applicant: INTEL CORPORATION

Inventor： Menachem Adelman , Robert Valentine , Barukh Ziv , Yaroslav Pollak , Gideon Stupp , Amit Gradstein , Simon Rubanovich , Zeev Sperber , Mark Charney , Christopher Hughes , Alexander Heinecke

IPC: G06F9/30 , G06F17/16

Abstract: Systems, methods, and apparatuses relating to one or more instructions that utilize direct paths for loading data into a tile from a vector register and/or storing data from a tile into a vector register are described. In one embodiment, a system includes a matrix operations accelerator circuit comprising a two-dimensional grid of processing elements, a plurality of registers that represents a two-dimensional matrix coupled to the two-dimensional grid of processing elements, and a coupling to a cache; and a hardware processor core comprising: a vector register, a decoder to decode a single instruction into a decoded single instruction, the single instruction including a first field that identifies the two-dimensional matrix, a second field that identifies a set of elements of the two-dimensional matrix, and a third field that identifies the vector register, and an execution circuit to execute the decoded single instruction to cause a store of the set of elements from the plurality of registers that represents the two-dimensional matrix into the vector register by a coupling of the hardware processor core to the matrix operations accelerator circuit that is separate from the coupling to the cache.

7.

发明申请
LOADING AND STORING MATRIX DATA WITH DATATYPE CONVERSION 有权

公开(公告)号：US20210406012A1

公开(公告)日：2021-12-30

申请号：US16914317

申请日：2020-06-27

Applicant: Intel Corporation

Inventor： Menachem Adelman , Robert Valentine , Gideon Stupp , Yaroslav Pollak , Amit Gradstein , Simon Rubanovich , Zeev Sperber , Mark J. Charney , Christopher J. Hughes , Alexander F. Heinecke , Evangelos Georganas

IPC: G06F9/30 , G06F17/16

Abstract: Embodiments for loading and storing matrix data with datatype conversion are disclosed. In an embodiment, a processor includes a decoder and execution circuitry. The decoder is to decode an instruction having a format including an opcode field to specify an opcode, a first destination operand field to specify a first destination matrix location, and a first source operand field to specify a first source matrix location. The execution circuitry is to, in response to the decoded instruction, convert data elements from a plurality of source element locations of a first source matrix specified by the first source matrix location from a first datatype to a second datatype to generate a plurality of converted data elements and to store each of the plurality of converted data elements in one of a plurality of destination element locations in a first destination matrix specified by the first destination matrix location.

8.

发明授权
Systems and methods to load a tile register pair 有权

公开(公告)号：US11093247B2

公开(公告)日：2021-08-17

申请号：US15858932

申请日：2017-12-29

Applicant: Intel Corporation

Inventor： Raanan Sade , Simon Rubanovich , Amit Gradstein , Zeev Sperber , Alexander Heinecke , Robert Valentine , Mark J. Charney , Bret Toll , Jesus Corbal , Elmoustapha Ould-Ahmed-Vall , Menachem Adelman

IPC: G06F15/00 , G06F9/30

Abstract: Embodiments detailed herein relate to systems and methods to load a tile register pair. In one example, a processor includes: decode circuitry to decode a load matrix pair instruction having fields for an opcode and source and destination identifiers to identify source and destination matrices, respectively, each matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded load matrix pair instruction to load every element of left and right tiles of the identified destination matrix from corresponding element positions of left and right tiles of the identified source matrix, respectively, wherein the executing operates on one row of the identified destination matrix at a time, starting with the first row.

9.

发明授权
Systems and methods for performing instructions to convert to 16-bit floating-point format 有权

公开(公告)号：US11068262B2

公开(公告)日：2021-07-20

申请号：US17133078

申请日：2020-12-23

Applicant: Intel Corporation

Inventor： Alexander F. Heinecke , Robert Valentine , Mark J. Charney , Raanan Sade , Menachem Adelman , Zeev Sperber , Amit Gradstein , Simon Rubanovich

IPC: G06F9/30 , G06F9/38

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to convert to 16-bit floating-point format. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of a first source vector comprising N single-precision elements, and a destination vector comprising at least N 16-bit floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the specified source vector to 16-bit floating-point, the conversion to include truncation and rounding, as necessary, and to store each converted element into a corresponding location of the specified destination vector, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.

10.

发明申请
SYSTEMS AND METHODS TO STORE A TILE REGISTER PAIR TO MEMORY 审中-公开

公开(公告)号：US20190042255A1

公开(公告)日：2019-02-07

申请号：US15858937

申请日：2017-12-29

Applicant: Intel Corporation

Inventor： Raanan Sade , Simon Rubanovich , Amit Gradstein , Zeev Sperber , Alexander Heinecke , Robert Valentine , Mark J. Charney , Bret Toll , Jesus Corbal , Elmoustapha Ould-Ahmed-Vall , Menachem Adelman

IPC: G06F9/30

Abstract: Embodiments detailed herein relate to systems and methods to store a tile register pair to memory. In one example, a processor includes: decode circuitry to decode a store matrix pair instruction having fields for an opcode and source and destination identifiers to identify source and destination matrices, respectively, each matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded store matrix pair instruction to store every element of left and right tiles of the identified source matrix to corresponding element positions of left and right tiles of the identified destination matrix, respectively, wherein the executing stores a chunk of C elements of one row of the identified source matrix at a time.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification