-
公开(公告)号:US11915000B2
公开(公告)日:2024-02-27
申请号:US18160600
申请日:2023-01-27
Applicant: Intel Corporation
Inventor: Ahmad Yasin , Raanan Sade , Liron Zur , Igor Yanover , Joseph Nuzman
CPC classification number: G06F9/30145 , G06F9/30098 , G06F9/544 , G06F9/546 , G06F11/3037 , G06F11/348
Abstract: Systems, methods, and apparatuses relating to circuitry to precisely monitor memory store accesses are described. In one embodiment, a system includes a memory, a hardware processor core comprising a decoder to decode an instruction into a decoded instruction, an execution circuit to execute the decoded instruction to produce a resultant, a store buffer, and a retirement circuit to retire the instruction when a store request for the resultant from the execution circuit is queued into the store buffer for storage into the memory, and a performance monitoring circuit to mark the retired instruction for monitoring of post-retirement performance information between being queued in the store buffer and being stored in the memory, enable a store fence after the retired instruction to be inserted that causes previous store requests to complete within the memory, and on detection of completion of the store request for the instruction in the memory, store the post-retirement performance information in storage of the performance monitoring circuit.
-
62.
公开(公告)号:US11893389B2
公开(公告)日:2024-02-06
申请号:US18190761
申请日:2023-03-27
Applicant: Intel Corporation
Inventor: Alexander F. Heinecke , Robert Valentine , Mark J. Charney , Raanan Sade , Menachem Adelman , Zeev Sperber , Amit Gradstein , Simon Rubanovich
CPC classification number: G06F9/30036 , G06F9/3001 , G06F9/3016 , G06F9/3802
Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (m, n) of the specified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the specified first source matrix by a corresponding nibble of a doubleword element (K,N) of the specified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element.
-
公开(公告)号:US11847452B2
公开(公告)日:2023-12-19
申请号:US17360562
申请日:2021-06-28
Applicant: Intel Corporation
Inventor: Menachem Adelman , Robert Valentine , Zeev Sperber , Mark J. Charney , Bret L. Toll , Rinat Rappoport , Jesus Corbal , Dan Baum , Alexander F. Heinecke , Elmoustapha Ould-Ahmed-Vall , Yuri Gebil , Raanan Sade
CPC classification number: G06F9/30036 , G06F7/485 , G06F7/4876 , G06F7/762 , G06F9/3001 , G06F9/3016 , G06F9/30032 , G06F9/30043 , G06F9/30109 , G06F9/30112 , G06F9/30134 , G06F9/30145 , G06F9/30149 , G06F9/30185 , G06F9/30196 , G06F9/3818 , G06F9/3836 , G06F17/16 , G06F2212/454
Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile a set of 2-dimensional registers are discussed.
-
公开(公告)号:US11847185B2
公开(公告)日:2023-12-19
申请号:US17485055
申请日:2021-09-24
Applicant: Intel Corporation
Inventor: Dan Baum , Chen Koren , Elmoustapha Ould-Ahmed-Vall , Michael Espig , Christopher J. Hughes , Raanan Sade , Robert Valentine , Mark J. Charney , Alexander F. Heinecke
CPC classification number: G06F17/16 , G06F9/3001 , G06F9/3016 , G06F9/30101 , G06F9/3802
Abstract: Disclosed embodiments relate to accelerating multiplication of sparse matrices. In one example, a processor is to fetch and decode an instruction having fields to specify locations of first, second, and third matrices, and an opcode indicating the processor is to multiply and accumulate matching non-zero (NZ) elements of the first and second matrices with corresponding elements of the third matrix, and executing the decoded instruction as per the opcode to generate NZ bitmasks for the first and second matrices, broadcast up to two NZ elements at a time from each row of the first matrix and each column of the second matrix to a processing engine (PE) grid, each PE to multiply and accumulate matching NZ elements of the first and second matrices with corresponding elements of the third matrix. Each PE further to store an NZ element for use in a subsequent multiplications.
-
公开(公告)号:US11816483B2
公开(公告)日:2023-11-14
申请号:US15859268
申请日:2017-12-29
Applicant: Intel Corporation
Inventor: Raanan Sade , Simon Rubanovich , Amit Gradstein , Zeev Sperber , Alexander Heinecke , Robert Valentine , Mark J. Charney , Bret Toll , Jesus Corbal , Elmoustapha Ould-Ahmed-Vall , Menachem Adelman
CPC classification number: G06F9/30036 , G06F9/30101 , G06F17/16
Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address, and execution circuitry to execute the decoded instruction to store configuration information about usage of storage for two-dimensional data structures at the memory address.
-
公开(公告)号:US11809869B2
公开(公告)日:2023-11-07
申请号:US15858937
申请日:2017-12-29
Applicant: Intel Corporation
Inventor: Raanan Sade , Simon Rubanovich , Amit Gradstein , Zeev Sperber , Alexander Heinecke , Robert Valentine , Mark J. Charney , Bret Toll , Jesus Corbal , Elmoustapha Ould-Ahmed-Vall , Menachem Adelman
IPC: G06F9/30
CPC classification number: G06F9/30145 , G06F9/30036 , G06F9/30043
Abstract: Embodiments detailed herein relate to systems and methods to store a tile register pair to memory. In one example, a processor includes: decode circuitry to decode a store matrix pair instruction having fields for an opcode and source and destination identifiers to identify source and destination matrices, respectively, each matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded store matrix pair instruction to store every element of left and right tiles of the identified source matrix to corresponding element positions of left and right tiles of the identified destination matrix, respectively, wherein the executing stores a chunk of C elements of one row of the identified source matrix at a time.
-
公开(公告)号:US11789729B2
公开(公告)日:2023-10-17
申请号:US15858916
申请日:2017-12-29
Applicant: Intel Corporation
Inventor: Raanan Sade , Simon Rubanovich , Amit Gradstein , Zeev Sperber , Alexander Heinecke , Robert Valentine , Mark J. Charney , Bret Toll , Jesus Corbal , Elmoustapha Ould-Ahmed-Vall
CPC classification number: G06F9/3001 , G06F9/3005 , G06F9/3016 , G06F9/30036 , G06F9/30043 , G06F9/30076 , G06F9/30109 , G06F9/30123 , G06F9/30145 , G06F9/383 , G06F9/3824
Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (M,N) of the identified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the identified first source matrix by a corresponding nibble of a doubleword element (K,N) of the identified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element (M,N).
-
公开(公告)号:US11693785B2
公开(公告)日:2023-07-04
申请号:US16728527
申请日:2019-12-27
Applicant: Intel Corporation
Inventor: Ron Gabor , Enrico Perla , Raanan Sade , Igor Yanover , Tomer Stark , Joseph Nuzman
IPC: G06F12/00 , G06F12/0895 , G06F12/1081 , G06F12/1009 , G06F12/0811 , G06F12/14 , G06F9/30 , G06F11/30
CPC classification number: G06F12/0895 , G06F9/30043 , G06F9/30101 , G06F11/3037 , G06F12/0811 , G06F12/1009 , G06F12/1081 , G06F12/1441 , G06F12/1466 , G06F2212/7207
Abstract: An apparatus and method for tagged memory management. For example, one embodiment of a processor comprises: execution circuitry to execute instructions and process data, at least one instruction to generate a system memory access request having a first address pointer; and address translation circuitry to determine whether to translate the first address pointer with or without metadata processing, wherein if the first address pointer is to be translated with metadata processing, the address translation circuitry to: perform a lookup in a memory metadata table to identify a memory metadata value, determine a pointer metadata value associated with the first address pointer, and compare the memory metadata value with the pointer metadata value, the comparison to generate a validation of the memory access request or a fault condition, wherein if the comparison results in a validation of the memory access request, then accessing a set of one or more address translation tables to translate the first address pointer to a first physical address and to return the first physical address responsive to the memory access request.
-
69.
公开(公告)号:US11675590B2
公开(公告)日:2023-06-13
申请号:US17865849
申请日:2022-07-15
Applicant: Intel Corporation
Inventor: Raanan Sade , Robert Valentine , Bret Toll , Christopher J. Hughes , Alexander F. Heinecke , Elmoustapha Ould-Ahmed-Vall , Mark J. Charney
IPC: G06F12/128 , G06T1/00 , G06F9/30
CPC classification number: G06F9/30167 , G06F9/30101 , G06F9/30149
Abstract: Disclosed embodiments relate to systems and methods for performing instructions to transform matrices into a row-interleaved format. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction having fields to specify an opcode and locations of source and destination matrices, wherein the opcode indicates that the processor is to transform the specified source matrix into the specified destination matrix having the row-interleaved format; and execution circuitry to respond to the decoded instruction by transforming the specified source matrix into the specified RowInt-formatted destination matrix by interleaving J elements of each J-element sub-column of the specified source matrix in either row-major or column-major order into a K-wide submatrix of the specified destination matrix, the K-wide submatrix having K columns and enough rows to hold the J elements.
-
公开(公告)号:US11645135B2
公开(公告)日:2023-05-09
申请号:US17020663
申请日:2020-09-14
Applicant: Intel Corporation
Inventor: Tomer Stark , Ron Gabor , Joseph Nuzman , Raanan Sade , Bryant E. Bigbee
CPC classification number: G06F11/0751 , G06F9/38 , G06F11/073 , G06F12/00 , G06F12/109 , G06F21/60 , G06F12/145 , G06F2212/1032 , G06F2212/1052 , G06F2212/656
Abstract: Methods and apparatuses relating to memory corruption detection are described. In one embodiment, a hardware processor includes an execution unit to execute an instruction to request access to a block of a memory through a pointer to the block of the memory, and a memory management unit to allow access to the block of the memory when a memory corruption detection value in the pointer is validated with a memory corruption detection value in the memory for the block, wherein a position of the memory corruption detection value in the pointer is selectable between a first location and a second, different location.
-
-
-
-
-
-
-
-
-