Patent search ap:("Intel Corporation") AND inv:"Valentine Page Robert"

51.

Published invention application
SYSTEMS FOR PERFORMING INSTRUCTIONS TO QUICKLY CONVERT AND USE TILES AS 1D VECTORS (status: under examination, substantive examination)

Publication (announcement) number: EP4361802A2

Publication (announcement) date: 2024-05-01

Application number: EP24157718.8

Application date: 2019-06-26

Applicant: Intel Corporation

Inventor: Toll, Bret , Hughes, Christopher J. , Baum, Dan , Ould-Ahmed-Vall, ElMoustapha , Sade, Raanan , Valentine, Robert , Charney, Mark J. , Heinecke, Alexander F.

IPC: G06F9/30

CPC classification number: G06F9/30036 , G06F9/30032

Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor comprises decode circuitry to decode an instruction, the instruction to specify a two-dimensional, 2D, tile storage of the processor, either of multiple rows or multiple columns of the 2D tile storage, multiple vector registers of the processor, and a size of elements of the multiple rows or multiple columns of the 2D tile storage as any one of 8-bits, 16-bits, 32-bits, and 64-bits; and execution circuitry, coupled with the decode circuitry, to perform operations corresponding to the instruction, the operations to include storing elements from each row of the multiple rows or each column of the multiple columns of the 2D tile storage to a corresponding one of the multiple vector registers as a corresponding one-dimensional, 1D, vector.
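
The data movement described in this abstract can be pictured with a short software sketch. This is only an illustration under assumptions: the function name `tile_to_vectors`, the use of nested Python lists for the tile, and the `by_column` flag are all hypothetical and do not come from the patent, which describes hardware decode and execution circuitry.

```python
from typing import List, Sequence


def tile_to_vectors(
    tile: Sequence[Sequence[int]],
    indices: Sequence[int],
    by_column: bool = False,
) -> List[List[int]]:
    """Copy the selected rows (or columns) of a 2D tile into separate 1D vectors.

    Hypothetical model: each returned list stands in for one vector register.
    """
    if by_column:
        # One output vector per selected column, read top to bottom.
        return [[row[c] for row in tile] for c in indices]
    # One output vector per selected row, read left to right.
    return [list(tile[r]) for r in indices]


tile = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
rows = tile_to_vectors(tile, [0, 2])
cols = tile_to_vectors(tile, [1], by_column=True)
```

Element size (8, 16, 32, or 64 bits) is not modeled here; Python integers stand in for elements of any width.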

52.

Published invention application
SYSTEMS, METHODS, AND APPARATUSES FOR DOT PRODUCTION OPERATIONS (status: under examination, published)

Publication (announcement) number: EP4303724A1

Publication (announcement) date: 2024-01-10

Application number: EP23194771.4

Application date: 2017-07-01

Applicant: Intel Corporation

Inventor: Valentine, Robert , Baum, Dan , Sperber, Zeev , Corbal, Jesus , Ould-Ahmed-Vall, ElMoustapha , Toll, Bret L. , Charney, Mark J. , Adelman, Menachem , Ziv, Barukh , Heinecke, Alexander , Rubanovich, Simon

IPC: G06F9/30

Abstract: Embodiments detailed herein relate to matrix operations. For example, a processor comprises decode circuitry to decode a single matrix instruction and execution circuitry to execute the single matrix instruction. The single matrix instruction has fields for an opcode, a plurality of identifiers corresponding to a first plurality of 4-bit sized data elements of a first source matrix, a second plurality of 4-bit sized data elements of a second source matrix, a plurality of doubleword-sized source data elements of a third source matrix, and a plurality of doubleword-sized result data elements of a result matrix, and bits indicating whether one or both of the first and second plurality of 4-bit sized data elements are signed or unsigned. The execution circuitry includes a multiply accumulate circuit, comprising: a multiplier to multiply each 4-bit sized data element of a first subset of the first plurality of 4-bit sized data elements with a corresponding 4-bit sized data element of a first subset of the second plurality of 4-bit sized data elements to generate a plurality of products; and an accumulator to add the plurality of products to a corresponding doubleword-sized source data element of the plurality of doubleword-sized source data elements to generate a corresponding doubleword-sized result data element of the plurality of doubleword-sized result data elements.
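
As a rough illustration of the multiply-accumulate step in this abstract, the sketch below multiplies pairs of 4-bit elements and adds the products to a 32-bit (doubleword) accumulator. The function names, the per-operand `signed` flags, and the wrap-around on 32-bit overflow are assumptions made for the example; the abstract does not state how overflow is handled.

```python
from typing import Sequence


def to_int4(nibble: int, signed: bool) -> int:
    """Interpret the low 4 bits of `nibble` as a signed or unsigned value."""
    nibble &= 0xF
    if signed and nibble >= 8:
        return nibble - 16  # two's-complement range -8..7
    return nibble  # unsigned range 0..15


def nibble_dot_accumulate(
    a: Sequence[int],
    b: Sequence[int],
    acc: int,
    a_signed: bool = True,
    b_signed: bool = True,
) -> int:
    """Add the sum of pairwise 4-bit products to a doubleword accumulator."""
    total = acc
    for x, y in zip(a, b):
        total += to_int4(x, a_signed) * to_int4(y, b_signed)
    # Assumption: wrap to a signed 32-bit result (saturation is not modeled).
    total &= 0xFFFFFFFF
    return total - (1 << 32) if total >= (1 << 31) else total


signed_result = nibble_dot_accumulate([0xF, 0x1], [0x2, 0x3], 10)
unsigned_a_result = nibble_dot_accumulate([0xF, 0x1], [0x2, 0x3], 10, a_signed=False)
```

With both operands signed, `0xF` is read as -1, so the two calls differ only in how the first source is interpreted.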

53.

Published invention application
SYSTEMS, METHODS, AND APPARATUSES FOR TILE MATRIX MULTIPLICATION AND ACCUMULATION (status: under examination, published)

Publication (announcement) number: EP4216057A1

Publication (announcement) date: 2023-07-26

Application number: EP23161367.0

Application date: 2017-07-01

Applicant: Intel Corporation

Inventor: Valentine, Robert , Sperber, Zeev , Charney, Mark J. , Toll, Bret L. , Rappoport, Rinat , Shwartsman, Stanislav , Baum, Dan , Yanover, Igor , Ould-Ahmed-Vall, ElMoustapha , Adelman, Menachem , Corbal, Jesus , Gebil, Yuri , Rubanovich, Simon

IPC: G06F9/30

Abstract: Embodiments detailed herein relate to matrix operations. For example, in some embodiments, an apparatus comprises an instruction decoder to decode a single instruction, the single instruction having fields to indicate an opcode, a first register to store a first source matrix, a second register to store a second source matrix, and a third register to store a 2 by 2 third source matrix, wherein the opcode is to indicate a matrix multiply-accumulate operation; and execution circuitry to perform the matrix multiply-accumulate operation. The matrix multiply-accumulate operation includes: multiplying a value corresponding to a first row and a first column of the first source matrix and a value corresponding to a first row and a first column of the second source matrix to generate a first product, multiplying a value corresponding to the first row and a second column of the first source matrix and a value corresponding to a second row and the first column of the second source matrix to generate a second product, summing the first product, the second product, and an initial value corresponding to an element position in a first row and a first column of the 2 by 2 third source matrix to generate a resulting value corresponding to the element position in a destination matrix, and storing the destination matrix in the third register.
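
The multiply-accumulate described here amounts to computing C + A·B element by element. The following sketch is a plain software model with a hypothetical function name; it returns a new matrix, whereas the abstract describes storing the destination matrix back into the third register.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[int]]


def tile_matmul_accumulate(a: Matrix, b: Matrix, c: Matrix) -> List[List[int]]:
    """Return c + a @ b, where c supplies the initial accumulator values."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("columns of a must match rows of b")
    cols = len(b[0])
    return [
        [c[i][j] + sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(a))
    ]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1], [1, 1]]
result = tile_matmul_accumulate(a, b, c)
```

For the element in the first row and first column, this computes 1*5 + 2*7 + 1, which matches the two products and the initial value summed in the abstract.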

54.

Published invention application
BFLOAT16 SCALE AND/OR REDUCE INSTRUCTIONS (status: under examination, published)

Publication (announcement) number: EP4141656A1

Publication (announcement) date: 2023-03-01

Application number: EP22185939.0

Application date: 2022-07-20

Applicant: Intel Corporation

Inventor: Adelman, Menachem , Heinecke, Alexander , Valentine, Robert , Sperber, Zeev , Gradstein, Amit , Charney, Mark , Georganas, Evangelos , Kalamkar, Dhiraj , Hughes, Christopher , Anderson, Cristina

IPC: G06F9/30

Abstract: Techniques for scale and reduction of BF16 data elements are described. An exemplary instruction includes fields for an opcode, an identification of a location of a first packed data source operand, an identification of a location of a second packed data source operand, and an identification of a packed data destination operand, wherein the opcode is to indicate that execution circuitry is to perform, for each data element position of the packed data source operands, a floating point scale operation of a BF16 data element of the first packed data source by multiplying the data element by a power of 2 value, wherein a value of the exponent of the power of 2 value is a floor value of a BF16 data element of the second packed data source, and store a result of the floating point scale operation into a corresponding data element position of the packed data destination operand.
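
The scale operation in this abstract is, per element, x * 2^floor(y). A minimal numeric sketch follows; it uses ordinary Python floats, so BF16 rounding of the result and the treatment of infinities and NaNs in the second source are not modeled, and the function name is hypothetical.

```python
import math
from typing import List, Sequence


def scale_elements(src1: Sequence[float], src2: Sequence[float]) -> List[float]:
    """For each position, multiply src1 by 2 raised to floor(src2).

    Assumes finite inputs; math.floor raises on infinities and NaNs.
    """
    return [math.ldexp(x, math.floor(y)) for x, y in zip(src1, src2)]


scaled = scale_elements([1.5, 3.0, -2.0], [2.7, -1.2, 0.0])
```

Note that floor rounds toward negative infinity, so a second-source value of -1.2 yields an exponent of -2 rather than -1.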

55.

Published invention application
SYSTEMS, METHODS, AND APPARATUSES FOR MATRIX ADD, SUBTRACT, AND MULTIPLY (status: under examination, published)

Publication (announcement) number: EP4137941A1

Publication (announcement) date: 2023-02-22

Application number: EP22196776.3

Application date: 2017-07-01

Applicant: Intel Corporation

Inventor: Valentine, Robert , Baum, Dan , Sperber, Zeev , Corbal, Jesus , Ould-Ahmed-Vall, Elmoustapha , Toll, Bret L. , Charney, Mark J. , Ziv, Barukh , Heinecke, Alexander , Girkar, Milind , Rubanovich, Simon

IPC: G06F9/30 , G06F7/544 , G06F9/345 , G06F9/38 , G06F12/02

Abstract: Embodiments detailed herein relate to matrix operations. For example, in some embodiments, a processor comprises decode circuitry to decode an instruction having fields for an opcode, a first source matrix operand identifier, a second source matrix operand identifier, and a destination matrix operand identifier, wherein each of the first source matrix operand, the second source matrix operand, and the destination matrix operand corresponds to a two-dimensional matrix of values, and execution circuitry to execute the decoded instruction to, for each data element position of the identified first source matrix operand: multiply a first data value at that data element position by a second data value at a corresponding data element position of the identified second source matrix operand, and store a result of the multiplication into a corresponding data element position of the identified destination matrix operand.
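
Unlike a matrix product, the multiply in this abstract is performed position by position. A small sketch, with a hypothetical function name and nested lists standing in for matrix operands:

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[int]]


def tile_elementwise_multiply(a: Matrix, b: Matrix) -> List[List[int]]:
    """Multiply values at matching positions of two equally shaped matrices."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("operands must have the same shape")
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


product = tile_elementwise_multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```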

56.

Published invention application
APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS TO CONVERT 16-BIT FLOATING-POINT FORMATS (status: under examination, published)

Publication (announcement) number: EP3974967A1

Publication (announcement) date: 2022-03-30

Application number: EP21192634.0

Application date: 2021-08-23

Applicant: Intel Corporation

Inventor: Heinecke, Alexander F , Valentine, Robert , Charney, Mark J , Adelman, Menachem , Hughes, Christopher J , Georganas, Evangelos , Sperber, Zeev , Gradstein, Amit , Rubanovich, Simon

IPC: G06F9/30

Abstract: Systems, methods, and apparatuses relating to instructions to convert 16-bit floating-point formats are described. In one embodiment, a processor includes fetch circuitry to fetch a single instruction having fields to specify an opcode and locations of a source vector comprising N plurality of 16-bit half-precision floating-point elements, and a destination vector to store N plurality of 16-bit bfloat floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the source vector from 16-bit half-precision floating-point format to 16-bit bfloat floating-point format and store each converted element into a corresponding location of the destination vector, decode circuitry to decode the fetched single instruction into a decoded single instruction, and the execution circuitry to respond to the decoded single instruction as specified by the opcode.
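
To make the half-precision to bfloat16 conversion concrete, the sketch below works on raw 16-bit patterns. It is an illustration under assumptions: round-to-nearest-even is assumed because the abstract names no rounding mode, NaN inputs are mapped to a quiet NaN, and the function name is hypothetical.

```python
import math
import struct
from typing import List, Sequence


def fp16_bits_to_bf16_bits(h: int) -> int:
    """Convert one IEEE half-precision bit pattern to a bfloat16 bit pattern."""
    # Decode the 16-bit pattern as half precision, then re-encode as float32.
    value = struct.unpack("<e", struct.pack("<H", h & 0xFFFF))[0]
    f32 = struct.unpack("<I", struct.pack("<f", value))[0]
    if math.isnan(value):
        # Assumption: keep the sign and return a quiet NaN.
        return ((f32 >> 16) & 0x8000) | 0x7FC0
    # bfloat16 is the top 16 bits of float32; round to nearest, ties to even.
    lsb = (f32 >> 16) & 1
    return ((f32 + 0x7FFF + lsb) >> 16) & 0xFFFF


def convert_vector(src: Sequence[int]) -> List[int]:
    """Convert each element of a source vector into the matching destination slot."""
    return [fp16_bits_to_bf16_bits(h) for h in src]


converted = convert_vector([0x3C00, 0xC000, 0x7C00, 0x3C0C])
```

Half precision carries 10 explicit mantissa bits and bfloat16 only 7, so the conversion can lose precision even though every half-precision magnitude fits in the bfloat16 exponent range.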

57.

Published invention application
APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS FOR MOVING DATA BETWEEN TILES OF A MATRIX OPERATIONS ACCELERATOR AND VECTOR REGISTERS (status: under examination, published)

Publication (announcement) number: EP3929736A1

Publication (announcement) date: 2021-12-29

Application number: EP20214433.3

Application date: 2020-12-16

Applicant: Intel Corporation

Inventor: Adelman, Menachem , Valentine, Robert , Ziv, Barukh , Pollak, Yaroslav , Stupp, Gideon , Gradstein, Amit , Rubanovich, Simon , Sperber, Zeev , Charney, Mark , Hughes, Christopher , Heinecke, Alexander

IPC: G06F9/30 , G06F17/16 , G06F9/38

Abstract: Systems, methods, and apparatuses relating to one or more instructions that utilize direct paths for loading data into a tile from a vector register and/or storing data from a tile into a vector register are described. In one embodiment, a system includes a matrix operations accelerator circuit comprising a two-dimensional grid of processing elements, a plurality of registers that represents a two-dimensional matrix coupled to the two-dimensional grid of processing elements, and a coupling to a cache; and a hardware processor core comprising: a vector register, a decoder to decode a single instruction into a decoded single instruction, the single instruction including a first field that identifies the two-dimensional matrix, a second field that identifies a set of elements of the two-dimensional matrix, and a third field that identifies the vector register, and an execution circuit to execute the decoded single instruction to cause a store of the set of elements from the plurality of registers that represents the two-dimensional matrix into the vector register by a coupling of the hardware processor core to the matrix operations accelerator circuit that is separate from the coupling to the cache.

58.

Published invention application
MATRIX TRANSPOSE AND MULTIPLY (status: under examination, published)

Publication (announcement) number: EP3929733A1

Publication (announcement) date: 2021-12-29

Application number: EP20209948.7

Application date: 2020-11-26

Applicant: Intel Corporation

Inventor: Adelman, Menachem , Valentine, Robert , Ziv, Barukh , Gradstein, Amit , Rubanovich, Simon , Sperber, Zeev , Charney, Mark J. , Hughes, Christopher J. , Heinecke, Alexander F. , Georganas, Evangelos , Pham, Binh

IPC: G06F9/30

Abstract: Embodiments for a matrix transpose and multiply operation are disclosed. In an embodiment, a processor includes a decoder and execution circuitry. The decoder is to decode an instruction having a format including an opcode field to specify an opcode, a first destination operand field to specify a destination matrix location, a first source operand field to specify a first source matrix location, and a second source operand field to specify a second source matrix location. The execution circuitry is to, in response to the decoded instruction, transpose the first source matrix to generate a transposed first source matrix, perform a matrix multiplication using the transposed first source matrix and the second source matrix to generate a result, and store the result in a destination matrix location.
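
In matrix terms, the operation in this abstract computes the product of the transposed first source and the second source. The sketch below is a plain software model with hypothetical function names; it returns the result rather than writing it to a destination matrix location.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[int]]


def transpose(m: Matrix) -> List[List[int]]:
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def transpose_and_multiply(a: Matrix, b: Matrix) -> List[List[int]]:
    """Return transpose(a) @ b."""
    at = transpose(a)
    if len(at[0]) != len(b):
        raise ValueError("rows of a must match rows of b")
    return [
        [sum(at[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(at))
    ]


a = [[1, 2], [3, 4], [5, 6]]  # 3 x 2, so its transpose is 2 x 3
b = [[1, 0], [0, 1], [1, 1]]  # 3 x 2
result = transpose_and_multiply(a, b)
```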

59.

Published invention application
SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS TO TRANSFORM MATRICES INTO ROW-INTERLEAVED FORMAT (status: under examination, published)

Publication (announcement) number: EP3916543A3

Publication (announcement) date: 2021-12-22

Application number: EP21187080.3

Application date: 2019-06-27

Applicant: Intel Corporation

Inventor: Sade, Raanan , Valentine, Robert , Toll, Bret , Hughes, Christopher J. , Heinecke, Alexander F. , Ould-Ahmed-Vall, ElMoustapha , Charney, Mark J.

IPC: G06F9/30

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to transform matrices into a row-interleaved format. In one example, a processor comprises decode circuitry to decode a single instruction into a decoded single instruction and execution circuitry to execute the decoded single instruction according to an opcode. The single instruction has a first field to specify a source matrix, a second field to specify a destination matrix, and the opcode to indicate the execution circuitry is to cause a store of: a first element and a second element from a first column of the source matrix respectively into a first element and a second element in a first row of the destination matrix, a first element and a second element from a second column of the source matrix respectively into a third element and a fourth element in the first row of the destination matrix, a third element and a fourth element from the first column of the source matrix respectively into a first element and a second element in a second row of the destination matrix, and a third element and a fourth element from the second column of the source matrix respectively into a third element and a fourth element in the second row of the destination matrix.
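
The element mapping spelled out in this abstract can be written as dst[i][2*j + k] = src[2*i + k][j] for k in {0, 1}. The sketch below implements that mapping; the function name and the `group` parameter (fixed at 2 to match the abstract) are assumptions for illustration.

```python
from typing import List, Sequence


def to_row_interleaved(src: Sequence[Sequence[int]], group: int = 2) -> List[List[int]]:
    """Pack `group` consecutive elements of each source column side by side.

    Each destination row takes elements from `group` consecutive source rows,
    visiting the source columns in order.
    """
    rows, cols = len(src), len(src[0])
    if rows % group:
        raise ValueError("row count must be a multiple of group")
    dst = []
    for i in range(rows // group):
        out_row = []
        for j in range(cols):
            for k in range(group):
                out_row.append(src[group * i + k][j])
        dst.append(out_row)
    return dst


src = [[1, 2], [3, 4], [5, 6], [7, 8]]  # columns are [1, 3, 5, 7] and [2, 4, 6, 8]
interleaved = to_row_interleaved(src)
```

For this 4 x 2 source, the first destination row holds the first two elements of column one followed by the first two elements of column two, as the abstract describes.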

60.

Published invention application
SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS TO CONVERT TO 16-BIT FLOATING-POINT FORMAT (status: under examination, published)

Publication (announcement) number: EP3822774A1

Publication (announcement) date: 2021-05-19

Application number: EP20216494.3

Application date: 2019-10-08

Applicant: Intel Corporation

Inventor: Heinecke, Alexander F. , Valentine, Robert , Charney, Mark J. , Sade, Raanan , Adelman, Menachem , Sperber, Zeev , Gradstein, Amit , Rubanovich, Simon

IPC: G06F9/30

Abstract: Disclosed embodiments relate to a processor, a system on a chip and a system for executing a format conversion instruction. In one example, a processor having a plurality of cores, including a core that, in response to a format conversion instruction having a first source operand including a first 32-bit single-precision floating point data element, and a second source operand including a second 32-bit single-precision floating point data element, is to: convert the first 32-bit single-precision floating point data element to a first 16-bit floating point data element, wherein, when the first 32-bit single-precision floating point data element is a normal data element, conversion is to be performed according to a rounding mode specified by the format conversion instruction, and the first 16-bit floating point data element is to have a sign bit, an 8-bit exponent, seven explicit mantissa bits, and one implicit mantissa bit, and wherein, when the first 32-bit single-precision floating point data element is a not-a-number, NaN, data element, the first 16-bit floating point data element is to have a mantissa with a most significant bit set to one; convert the second 32-bit single-precision floating point data element to a second 16-bit floating point data element, wherein, when the second 32-bit single-precision floating point data element is a normal data element, conversion is to be performed according to the rounding mode, and the second 16-bit floating point data element is to have a sign bit, an 8-bit exponent, seven explicit mantissa bits, and one implicit mantissa bit, and wherein when the second 32-bit single-precision floating point data element is a NaN data element, the second 16-bit floating point data element is to have a mantissa with a most significant bit set to one; and store the first 16-bit floating point data element in a lower order half of a destination register and the second 16-bit floating point data element in a higher order half of the destination register.
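
The 16-bit layout in this abstract (sign bit, 8-bit exponent, seven explicit mantissa bits) corresponds to the upper half of a 32-bit single-precision pattern. The sketch below illustrates the conversion and packing under assumptions: round-to-nearest-even stands in for the rounding mode the instruction would specify, a 32-bit integer stands in for the destination register, the inputs are assumed to be representable in single precision, and the function names are hypothetical.

```python
import math
import struct


def fp32_to_bf16_bits(value: float) -> int:
    """Convert a single-precision value to a 16-bit pattern with an 8-bit exponent."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    if math.isnan(value):
        # NaN case from the abstract: set the most significant mantissa bit.
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Assumed rounding mode: round to nearest, ties to even.
    lsb = (bits >> 16) & 1
    return ((bits + 0x7FFF + lsb) >> 16) & 0xFFFF


def convert_pair(first: float, second: float) -> int:
    """Pack the first result in the lower half and the second in the higher half."""
    return (fp32_to_bf16_bits(second) << 16) | fp32_to_bf16_bits(first)


packed = convert_pair(1.0, -2.0)
```

Setting the top mantissa bit for NaN inputs keeps the result a NaN even when the discarded low 16 bits held the only nonzero mantissa bits.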
