Patent search: ap:("INTEL Corporation") AND inv:"HEINECKE, Alexander F." (page 1)

1.

Published invention application
OPTIMIZED COMPUTE HARDWARE FOR MACHINE LEARNING OPERATIONS [Under examination, published]

Publication No.: EP3783479A1

Publication date: 2021-02-24

Application No.: EP20200955.1

Filing date: 2018-04-30

Applicant: INTEL Corporation

Inventors: DAS, Dipankar; GRAMUNT, Roger; SMELYANSKIY, Mikhail; CORBAL, Jesus; MUDIGERE, Dheevatsa; MELLEMPUDI, Naveen K.; HEINECKE, Alexander F.

IPC classification: G06F9/30

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a fetch unit to fetch a single instruction having multiple input operands, wherein the multiple input operands have an unequal bit-length, a first input operand having a first bit-length and a second input operand having a second bit-length; a decode unit to decode the single instruction into a decoded instruction; an operand length unit to determine a smaller bit-length of the first bit-length and the second bit-length; and a compute unit to perform a matrix operation on the multiple input operands to generate an output value having a bit-length of the smaller bit-length.

2.

Published invention application
SYSTEMS AND METHODS FOR PERFORMING HORIZONTAL TILE OPERATIONS [Under examination, published]

Publication No.: EP3623940A2

Publication date: 2020-03-18

Application No.: EP19183497.7

Filing date: 2019-06-28

Applicant: Intel Corporation

Inventors: HUGHES, Christopher J.; TOLL, Bret; BAUM, Dan; OULD-AHMED-VALL, Elmoustapha; SADE, Raanan; VALENTINE, Robert; CHARNEY, Mark J.; HEINECKE, Alexander F.

IPC classification: G06F9/30, G06F9/38

Abstract: Disclosed embodiments relate to systems and methods for performing instructions specifying horizontal tile operations. In one example, a processor includes fetch circuitry to fetch an instruction specifying a horizontal tile operation, a location of an M by N source matrix comprising K groups of elements, and locations of K destinations, wherein each of the K groups of elements comprises the same number of elements, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction by generating K results, each result being generated by performing the specified horizontal tile operation across every element of a corresponding group of the K groups, and writing each generated result to a corresponding location of the K specified destination locations.
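Illustrative sketch (not taken from the application): the horizontal tile operation described in the abstract can be read as a per-group reduction. The sketch below assumes the K groups are either the rows or the columns of the M by N tile; the function name `horizontal_tile_op` and the `group` parameter are hypothetical and exist only for this example.

```python
from functools import reduce
import operator


def horizontal_tile_op(tile, op, group="row"):
    """Reduce each group (row or column) of an M x N tile to one result.

    Returns a list of K results, one per group. Treating groups as rows
    or columns is an assumption made for this sketch.
    """
    if not tile or any(len(row) != len(tile[0]) for row in tile):
        raise ValueError("tile must be a non-empty rectangular M x N matrix")
    if group == "row":
        groups = tile
    elif group == "column":
        groups = [list(col) for col in zip(*tile)]
    else:
        raise ValueError("group must be 'row' or 'column'")
    # Apply the binary operation across every element of each group.
    return [reduce(op, g) for g in groups]


tile = [[1, 2, 3], [4, 5, 6]]
row_sums = horizontal_tile_op(tile, operator.add)        # K = M = 2 results
col_max = horizontal_tile_op(tile, max, group="column")  # K = N = 3 results
```

Each entry of the returned list corresponds to one of the K destination locations named in the abstract.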

3.

Published invention application
SYSTEMS FOR PERFORMING INSTRUCTIONS TO QUICKLY CONVERT AND USE TILES AS 1D VECTORS [Under examination, published]

Publication No.: EP4141661A1

Publication date: 2023-03-01

Application No.: EP22200756.9

Filing date: 2019-06-26

Applicant: Intel Corporation

Inventors: TOLL, Bret; HUGHES, Christopher J.; BAUM, Dan; OULD-AHMED-VALL, Elmoustapha; SADE, Raanan; VALENTINE, Robert; CHARNEY, Mark J.; HEINECKE, Alexander F.

IPC classification: G06F9/30

Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, an apparatus comprises a configuration storage to store configuration information for a two-dimensional (2D) matrix storage, the configuration information to include a first value indicative of a number of rows of the 2D matrix storage and a second value indicative of a number of columns of the 2D matrix storage, fetch circuitry to fetch an instruction, the instruction to specify the 2D matrix storage, a row of the 2D matrix storage, and a 512-bit vector register, decode circuitry, coupled with the fetch circuitry, to decode the instruction, and execution circuitry, coupled with the decode circuitry, to perform operations corresponding to the instruction, including to store the row of the 2D matrix storage to the 512-bit vector register.

4.

Published invention application
DEEP LEARNING IMPLEMENTATIONS USING SYSTOLIC ARRAYS AND FUSED OPERATIONS [Under examination, published]

Publication No.: EP3798928A1

Publication date: 2021-03-31

Application No.: EP20179527.5

Filing date: 2020-06-11

Applicant: INTEL Corporation

Inventors: RASH, William; MAIYURAN, Subramaniam; GEORGE, Varghese; TOLL, Bret; SANKARAN, Rajesh; CHAPPELL, Robert; PAL, Supratim; HEINECKE, Alexander F.; OULD-AHMED-VALL, Elmoustapha; CHEN, Gang

IPC classification: G06N3/063, G06N3/04, G06F9/30, G06F17/16

Abstract: Disclosed embodiments relate to deep learning implementations using systolic arrays and fused operations. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction having fields to specify an opcode and locations of a destination and N source matrices, the opcode indicating the processor is to load the N source matrices from memory, perform N convolutions on the N source matrices to generate N feature maps, and store results of the N convolutions in registers to be passed to an activation layer, wherein the processor is to perform the N convolutions and the activation layer with at most one memory load of each of the N source matrices. The processor further includes scheduling circuitry to schedule execution of the instruction and execution circuitry to execute the instruction as per the opcode.

5.

Published invention application
SYSTEMS FOR PERFORMING INSTRUCTIONS TO QUICKLY CONVERT AND USE TILES AS 1D VECTORS [Under examination, published]

Publication No.: EP3629154A3

Publication date: 2020-05-06

Application No.: EP19182737.7

Filing date: 2019-06-26

Applicant: INTEL Corporation

Inventors: TOLL, Bret; HUGHES, Christopher J.; BAUM, Dan; OULD-AHMED-VALL, Elmoustapha; SADE, Raanan; VALENTINE, Robert; CHARNEY, Mark J.; HEINECKE, Alexander F.

IPC classification: G06F9/30

Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.
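Illustrative sketch (not taken from the application): the move between a 2D tile and a 1D vector can be modelled in software as copying one group of elements in either direction. Only two of the group kinds listed in the abstract (a full row and a full column) are covered here; the names `move_to_1d` and `move_from_1d` are hypothetical.

```python
def move_to_1d(tile, kind, index):
    """Copy one row or column of a 2D tile into a new 1D vector."""
    if kind == "row":
        return list(tile[index])
    if kind == "column":
        return [row[index] for row in tile]
    raise ValueError("kind must be 'row' or 'column'")


def move_from_1d(tile, vector, kind, index):
    """Overwrite one row or column of a 2D tile, in place, from a 1D vector."""
    if kind == "row":
        if len(vector) != len(tile[index]):
            raise ValueError("vector length must match the number of columns")
        tile[index][:] = vector
    elif kind == "column":
        if len(vector) != len(tile):
            raise ValueError("vector length must match the number of rows")
        for row, value in zip(tile, vector):
            row[index] = value
    else:
        raise ValueError("kind must be 'row' or 'column'")


tile = [[0, 0, 0], [0, 0, 0]]
move_from_1d(tile, [1, 2, 3], "row", 1)   # move from 1D into row 1
move_from_1d(tile, [9, 8], "column", 0)   # move from 1D into column 0
row1 = move_to_1d(tile, "row", 1)         # move row 1 back out to 1D
```

The column write happens after the row write, so the element shared by row 1 and column 0 ends up holding the value from the column vector.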

6.

Published invention application
SYSTEMS AND METHODS FOR IMPLEMENTING CHAINED TILE OPERATIONS [Under examination, published]

Publication No.: EP3547120A1

Publication date: 2019-10-02

Application No.: EP19157043.1

Filing date: 2019-02-13

Applicant: INTEL Corporation

Inventors: HUGHES, Christopher J.; HEINECKE, Alexander F.; VALENTINE, Robert; TOLL, Bret; CORBAL, Jesus; OULD-AHMED-VALL, Elmoustapha

IPC classification: G06F9/38, G06F15/78, G06F9/30

Abstract: Disclosed embodiments relate to systems and methods for implementing chained tile operations. In one example, a processor includes fetch circuitry to fetch one or more instructions until a plurality of instructions has been fetched, each instruction to specify source and destination tile operands, decode circuitry to decode the fetched instructions, and execution circuitry, responsive to the decoded instructions, to: identify first and second decoded instructions belonging to a chain of instructions, dynamically select and configure a SIMD path comprising first and second processing engines (PE) to execute the first and second decoded instructions, and set aside the specified destination of the first decoded instruction, and instead route a result of the first decoded instruction from the first PE to be used by the second PE to perform the second decoded instruction.

7.

Published invention application
OPTIMIZED COMPUTE HARDWARE FOR MACHINE LEARNING OPERATIONS [Under examination, published]

Publication No.: EP3407183A2

Publication date: 2018-11-28

Application No.: EP18170154.1

Filing date: 2018-04-30

Applicant: INTEL Corporation

Inventors: DAS, Dipankar; GRAMUNT, Roger; SMELYANSKIY, Mikhail; CORBAL, Jesus; MUDIGERE, Dheevatsa; MELLEMPUDI, Naveen K.; HEINECKE, Alexander F.

IPC classification: G06F9/30

CPC classification: G06F9/3887, G06F9/30014, G06F9/30036, G06F9/3016, G06F9/30181, G06F9/30192, G06F9/3851, G06N3/00, G06T1/20

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a fetch unit to fetch a single instruction having multiple input operands, wherein the multiple input operands have an unequal bit-length, a first input operand having a first bit-length and a second input operand having a second bit-length; a decode unit to decode the single instruction into a decoded instruction; an operand length unit to determine a smaller bit-length of the first bit-length and the second bit-length; and a compute unit to perform a matrix operation on the multiple input operands to generate an output value having a bit-length of the smaller bit-length.

8.

Published invention application
HARDWARE APPARATUSES AND METHODS TO PREFETCH A MULTIDIMENSIONAL BLOCK OF ELEMENTS FROM A MULTIDIMENSIONAL ARRAY [Under examination, published]

Publication No.: EP3238072A1

Publication date: 2017-11-01

Application No.: EP15874043.1

Filing date: 2015-11-25

Applicant: Intel Corporation

Inventors: LEE, Victor W.; SMELYANSKIY, Mikhail; HEINECKE, Alexander F.

IPC classification: G06F12/08, G06F9/30

Abstract: Methods and apparatuses relating to a prefetch instruction to prefetch a multidimensional block of elements from a multidimensional array into a cache. In one embodiment, a hardware processor includes a decoder to decode a prefetch instruction to prefetch a multidimensional block of elements from a multidimensional array into a cache, wherein at least one operand of the prefetch instruction is to indicate a system memory address of an element of the multidimensional block of elements, a stride of the multidimensional block of elements, and boundaries of the multidimensional block of elements, and an execution unit to execute the prefetch instruction to generate system memory addresses of the other elements of the multidimensional block of elements, and load the multidimensional block of elements into the cache from the system memory addresses.
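Illustrative sketch (not taken from the application): the address generation described in the abstract can be modelled for the two-dimensional case as follows. The row-major layout, the byte-valued `row_stride`, the 64-byte cache line, and the function names `block_addresses` and `cache_lines` are all assumptions made for this example.

```python
def block_addresses(base, row_stride, rows, cols, elem_size):
    """Generate the byte address of every element in a rows x cols block.

    base is the address of the block's first element, row_stride is the
    distance in bytes between the starts of consecutive rows, and
    rows/cols are the block boundaries.
    """
    return [
        base + r * row_stride + c * elem_size
        for r in range(rows)
        for c in range(cols)
    ]


def cache_lines(addresses, elem_size, line_size=64):
    """Return the sorted, de-duplicated cache-line base addresses touched."""
    lines = set()
    for addr in addresses:
        first = addr // line_size
        last = (addr + elem_size - 1) // line_size
        lines.update(range(first, last + 1))
    return sorted(line * line_size for line in lines)


# A 2 x 4 block of 4-byte elements inside an array whose rows are 256 bytes apart.
addrs = block_addresses(base=0x1000, row_stride=256, rows=2, cols=4, elem_size=4)
lines = cache_lines(addrs, elem_size=4)
```

With these parameters each block row fits inside a single cache line, so eight element addresses collapse to two line fetches.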

9.

Published invention application
SYSTEMS FOR PERFORMING INSTRUCTIONS TO QUICKLY CONVERT AND USE TILES AS 1D VECTORS [Under examination, published]

Publication No.: EP4177738A1

Publication date: 2023-05-10

Application No.: EP22217001.1

Filing date: 2019-06-26

Applicant: INTEL Corporation

Inventors: TOLL, Bret; HUGHES, Christopher J.; BAUM, Dan; OULD-AHMED-VALL, Elmoustapha; SADE, Raanan; VALENTINE, Robert; CHARNEY, Mark J.; HEINECKE, Alexander F.

IPC classification: G06F9/30

Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor comprises fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional, 2D, matrix and a one-dimensional, 1D, vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, or a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector; decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.

10.

Published invention application
SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS SPECIFYING TERNARY TILE LOGIC OPERATIONS [Under examination, published]

Publication No.: EP3623941A3

Publication date: 2020-05-06

Application No.: EP19183501.6

Filing date: 2019-06-28

Applicant: INTEL Corporation

Inventors: OULD-AHMED-VALL, Elmoustapha; HUGHES, Christopher J.; TOLL, Bret; BAUM, Dan; SADE, Raanan; VALENTINE, Robert; CHARNEY, Mark J.; HEINECKE, Alexander F.

IPC classification: G06F9/30, G06F9/38, G06F17/16

Abstract: Disclosed embodiments relate to systems and methods for performing instructions specifying ternary tile operations. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction specifying a ternary tile operation, and locations of destination and first, second, and third source matrices, each of the matrices having M rows by N columns; and execution circuitry to respond to the decoded instruction by, for each equal-sized group of K elements of the specified first, second, and third source matrices, generate K results by performing the ternary tile operation in parallel on K corresponding elements of the specified first, second, and third source matrices, and store each of the K results to a corresponding element of the specified destination matrix, wherein corresponding elements of the specified source and destination matrices occupy a same relative position within their associated matrix.
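Illustrative sketch (not taken from the application): an elementwise ternary logic operation over three M by N source tiles, written sequentially rather than in parallel groups of K. The abstract does not say how the logic function is encoded; selecting it with an 8-bit truth table `imm8`, indexed by one bit from each source, is an assumption made for this example, as are the names `ternary_logic` and `ternary_tile_op`.

```python
def ternary_logic(a, b, c, imm8, width=8):
    """Apply a three-input bitwise function, given as an 8-entry truth table.

    For each bit position, the bits of a, b and c form a 3-bit index
    (a is the most significant); the result bit is imm8's bit at that index.
    """
    result = 0
    for bit in range(width):
        idx = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1)
        result |= ((imm8 >> idx) & 1) << bit
    return result


def ternary_tile_op(src1, src2, src3, imm8, width=8):
    """Combine corresponding elements of three equally shaped tiles."""
    shapes = [[len(row) for row in t] for t in (src1, src2, src3)]
    if not (shapes[0] == shapes[1] == shapes[2]):
        raise ValueError("all three source tiles must have the same shape")
    return [
        [ternary_logic(a, b, c, imm8, width) for a, b, c in zip(r1, r2, r3)]
        for r1, r2, r3 in zip(src1, src2, src3)
    ]


XOR3 = 0x96      # truth table for a ^ b ^ c
MAJORITY = 0xE8  # truth table for "at least two of the three bits are set"

s1 = [[0b1100, 0xFF]]
s2 = [[0b1010, 0x0F]]
s3 = [[0b0110, 0x33]]
xor_tile = ternary_tile_op(s1, s2, s3, XOR3)
maj_tile = ternary_tile_op(s1, s2, s3, MAJORITY)
```

Each result lands at the same row and column as its three inputs, matching the "same relative position" condition in the abstract.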
