-
公开(公告)号:US12277234B2
公开(公告)日:2025-04-15
申请号:US17791000
申请日:2020-12-26
Applicant: Intel Corporation
Inventor: David M. Durham , Michael D. LeMay , Salmin Sultana , Karanvir S. Grewal , Michael E. Kounavis , Sergej Deutsch , Andrew James Weiler , Abhishek Basak , Dan Baum , Santosh Ghosh
Abstract: A processor, a system, a machine readable medium, and a method. The processor comprises first circuitry to: encrypt a first code image using a first code key; load the encrypted first code image into a memory area allocated in memory for the first code image by an operating system miming on the processor; and send to the operating system a substitute key that corresponds to the first code key, wherein the first code key is concealed from the operating system; and an instruction cache including control circuitry; and second circuitry coupled to the instruction cache, the second circuitry to: receive the substitute key from the operating system; in response to a first request from the operating system to execute the first code image to instantiate a first process, perform a first cryptographic function using a hardware key to generate the first code key from the substitute key; and program the control circuitry of the instruction cache with the first code key to enable the first code image to be decrypted using the first code key.
-
公开(公告)号:US20230409326A1
公开(公告)日:2023-12-21
申请号:US17841558
申请日:2022-06-15
Applicant: Intel Corporation
Inventor: Menachem Adelman , Amit Gradstein , Simon Rubanovich , Barukh Ziv , Uri Sherman , Dana Rip , Shahar Mizrahi , Dan Baum , Rinat Rappoport , Nilesh Jain , Zeev Sperber , Gideon Stupp , Alexander Heinecke , Christopher Hughes , Evangelos Georganas
CPC classification number: G06F9/30145 , G06F9/30178 , G06F9/30047 , G06F9/3887 , G06N3/04
Abstract: Techniques and mechanisms for processor circuitry to execute a load and expand instruction of an instruction set to generate decompressed matrix data. In an embodiment, the instruction comprises a source operand which indicates a location from which compressed matrix data, and corresponding metadata, are to be accessed. A destination operand of the instruction indicates a location which is to receive decompressed metadata, which is generated, during execution of the instruction, based on the compressed matrix data and the corresponding metadata. The metadata comprises compression mask information which identifies which elements of the matrix have been masked from the compressed matrix data. In another embodiment, the instruction further comprises a count operand which identifies a total number of the unmasked matrix elements which are represented in the compressed matrix data.
-
公开(公告)号:US11847452B2
公开(公告)日:2023-12-19
申请号:US17360562
申请日:2021-06-28
Applicant: Intel Corporation
Inventor: Menachem Adelman , Robert Valentine , Zeev Sperber , Mark J. Charney , Bret L. Toll , Rinat Rappoport , Jesus Corbal , Dan Baum , Alexander F. Heinecke , Elmoustapha Ould-Ahmed-Vall , Yuri Gebil , Raanan Sade
CPC classification number: G06F9/30036 , G06F7/485 , G06F7/4876 , G06F7/762 , G06F9/3001 , G06F9/3016 , G06F9/30032 , G06F9/30043 , G06F9/30109 , G06F9/30112 , G06F9/30134 , G06F9/30145 , G06F9/30149 , G06F9/30185 , G06F9/30196 , G06F9/3818 , G06F9/3836 , G06F17/16 , G06F2212/454
Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile a set of 2-dimensional registers are discussed.
-
公开(公告)号:US11847185B2
公开(公告)日:2023-12-19
申请号:US17485055
申请日:2021-09-24
Applicant: Intel Corporation
Inventor: Dan Baum , Chen Koren , Elmoustapha Ould-Ahmed-Vall , Michael Espig , Christopher J. Hughes , Raanan Sade , Robert Valentine , Mark J. Charney , Alexander F. Heinecke
CPC classification number: G06F17/16 , G06F9/3001 , G06F9/3016 , G06F9/30101 , G06F9/3802
Abstract: Disclosed embodiments relate to accelerating multiplication of sparse matrices. In one example, a processor is to fetch and decode an instruction having fields to specify locations of first, second, and third matrices, and an opcode indicating the processor is to multiply and accumulate matching non-zero (NZ) elements of the first and second matrices with corresponding elements of the third matrix, and executing the decoded instruction as per the opcode to generate NZ bitmasks for the first and second matrices, broadcast up to two NZ elements at a time from each row of the first matrix and each column of the second matrix to a processing engine (PE) grid, each PE to multiply and accumulate matching NZ elements of the first and second matrices with corresponding elements of the third matrix. Each PE further to store an NZ element for use in a subsequent multiplications.
-
公开(公告)号:US20230027329A1
公开(公告)日:2023-01-26
申请号:US17791000
申请日:2020-12-26
Applicant: Intel Corporation
Inventor: David M. Durham , Michael D. LeMay , Salmin Sultana , Karanvir S. Grewal , Michael E. Kounavis , Sergej Deutsch , Andrew James Weiler , Abhishek Basak , Dan Baum , Santosh Ghosh
Abstract: A processor, a system, a machine readable medium, and a method. The processor comprises first circuitry to: encrypt a first code image using a first code key; load the encrypted first code image into a memory area allocated in memory for the first code image by an operating system miming on the processor; and send to the operating system a substitute key that corresponds to the first code key, wherein the first code key is concealed from the operating system; and an instruction cache including control circuitry; and second circuitry coupled to the instruction cache, the second circuitry to: receive the substitute key from the operating system; in response to a first request from the operating system to execute the first code image to instantiate a first process, perform a first cryptographic function using a hardware key to generate the first code key from the substitute key; and program the control circuitry of the instruction cache with the first code key to enable the first code image to be decrypted using the first code key.
-
公开(公告)号:US11288069B2
公开(公告)日:2022-03-29
申请号:US16487755
申请日:2017-07-01
Applicant: Intel Corporation
Inventor: Robert Valentine , Menachem Adelman , Elmoustapha Ould-Ahmed-Vall , Bret L. Toll , Milind B. Girkar , Zeev Sperber , Mark J. Charney , Rinat Rappoport , Jesus Corbal , Stanislav Shwartsman , Igor Yanover , Alexander F. Heinecke , Barukh Ziv , Dan Baum , Yuri Gebil
Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in at least a form of decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and destination memory information, and execution circuitry to execute the decoded instruction to store each data element of configured rows of the identified source matrix operand to memory based on the destination memory information
-
公开(公告)号:US11288068B2
公开(公告)日:2022-03-29
申请号:US16487747
申请日:2017-07-01
Applicant: Intel Corporation
Inventor: Robert Valentine , Zeev Sperber , Mark J. Charney , Bret L. Toll , Jesus Corbal , Dan Baum , Alexander Heinecke , Elmoustapha Ould-Ahmed-Vall
Abstract: Detailed herein are embodiment systems, processors, and methods for matrix move. For example, a processor comprising decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to move each data element of the identified source matrix operand to corresponding data element position of the identified destination matrix operand is described.
-
公开(公告)号:US11200055B2
公开(公告)日:2021-12-14
申请号:US16474507
申请日:2017-07-01
Applicant: Intel Corporation
Inventor: Robert Valentine , Dan Baum , Zeev Sperber , Jesus Corbal , Elmoustapha Ould-Ahmed-Vall , Bret L. Toll , Mark J. Charney , Barukh Ziv , Alexander Heinecke , Milind Girkar , Simon Rubanovich
Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for matrix (tile) addition, subtraction, and multiplication is described. For example, circuitry to support instructions for element-by-element matrix (tile) addition, subtraction, and multiplication are detailed. In some embodiments, for matrix (tile) addition, decode circuitry is to decode an instruction having fields for an opcode, a first source matrix operand identifier, a second source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry is to execute the decoded instruction to, for each data element position of the identified first source matrix operand: add a first data value at that data element position to a second data value at a corresponding data element position of the identified second source matrix operand, and store a result of the addition into a corresponding data element position of the identified destination matrix operand.
-
公开(公告)号:US11086623B2
公开(公告)日:2021-08-10
申请号:US16487787
申请日:2017-07-01
Applicant: Intel Corporation
Inventor: Robert Valentine , Zeev Sperber , Mark J. Charney , Bret L. Toll , Rinat Rappoport , Stanislav Shwartsman , Dan Baum , Igor Yanover , Elmoustapha Ould-Ahmed-Vall , Menachem Adelman , Jesus Corbal , Yuri Gebil , Simon Rubanovich
Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, in some embodiments decode circuitry to decode an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier of a second source matrix operand, and an identifier for a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add a result of the multiplication to the identified source/destination matrix operand, and store a result of the addition in the identified source/destination matrix operand and zero unconfigured columns of identified source/destination matrix operand are detailed.
-
公开(公告)号:US12287843B2
公开(公告)日:2025-04-29
申请号:US18502291
申请日:2023-11-06
Applicant: Intel Corporation
Inventor: Dan Baum , Chen Koren , Elmoustapha Ould-Ahmed-Vall , Michael Espig , Christopher J. Hughes , Raanan Sade , Robert Valentine , Mark J. Charney , Alexander F. Heinecke
Abstract: Disclosed embodiments relate to accelerating multiplication of sparse matrices. In one example, a processor is to fetch and decode an instruction having fields to specify locations of first, second, and third matrices, and an opcode indicating the processor is to multiply and accumulate matching non-zero (NZ) elements of the first and second matrices with corresponding elements of the third matrix, and executing the decoded instruction as per the opcode to generate NZ bitmasks for the first and second matrices, broadcast up to two NZ elements at a time from each row of the first matrix and each column of the second matrix to a processing engine (PE) grid, each PE to multiply and accumulate matching NZ elements of the first and second matrices with corresponding elements of the third matrix. Each PE further to store an NZ element for use in a subsequent multiplications.
-
-
-
-
-
-
-
-
-