-
公开(公告)号:US12236242B2
公开(公告)日:2025-02-25
申请号:US18186710
申请日:2023-03-20
Applicant: Intel Corporation
Inventor: Raanan Sade , Simon Rubanovich , Amit Gradstein , Zeev Sperber , Alexander Heinecke , Robert Valentine , Mark J. Charney , Bret Toll , Jesus Corbal , Elmoustapha Ould-Ahmed-Vall , Menachem Adelman
Abstract: Embodiments detailed herein relate to systems and methods to load a tile register pair. In one example, a processor includes: decode circuitry to decode a load matrix pair instruction having fields for an opcode and source and destination identifiers to identify source and destination matrices, respectively, each matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded load matrix pair instruction to load every element of left and right tiles of the identified destination matrix from corresponding element positions of left and right tiles of the identified source matrix, respectively, wherein the executing operates on one row of the identified destination matrix at a time, starting with the first row.
-
公开(公告)号:US12198186B2
公开(公告)日:2025-01-14
申请号:US17401575
申请日:2021-08-13
Applicant: Intel Corporation
Inventor: Andrew Herdrich , Edwin Verplanke , Ravishankar Iyer , Christopher Gianos , Jeffrey D. Chamberlain , Ronak Singhal , Julius Mandelblat , Bret Toll
IPC: G06Q40/03 , G06F12/0875 , G06F12/0897
Abstract: Systems, methods, and apparatuses for resource bandwidth monitoring and control are described. For example, in some embodiments, an apparatus comprising a requestor device to send a credit based request, a receiver device to receive and consume the credit based request, and a delay element in a return path between the requestor and receiver devices, the delay element to delay a credit based response from the receiver to the requestor are detailed.
-
公开(公告)号:US12182568B2
公开(公告)日:2024-12-31
申请号:US18449651
申请日:2023-08-14
Applicant: Intel Corporation
Inventor: Raanan Sade , Simon Rubanovich , Amit Gradstein , Zeev Sperber , Alexander Heinecke , Robert Valentine , Mark J. Charney , Bret Toll , Jesus Corbal , Elmoustapha Ould-Ahmed-Vall
Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (M,N) of the identified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the identified first source matrix by a corresponding nibble of a doubleword element (K,N) of the identified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element (M,N).
-
公开(公告)号:US12175246B2
公开(公告)日:2024-12-24
申请号:US18460497
申请日:2023-09-01
Applicant: Intel Corporation
Inventor: Dan Baum , Michael Espig , James Guilford , Wajdi K. Feghali , Raanan Sade , Christopher J. Hughes , Robert Valentine , Bret Toll , Elmoustapha Ould-Ahmed-Vall , Mark J. Charney , Vinodh Gopal , Ronen Zohar , Alexander F. Heinecke
Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instructions, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.
-
公开(公告)号:US11954489B2
公开(公告)日:2024-04-09
申请号:US17549363
申请日:2021-12-13
Applicant: Intel Corporation
Inventor: Bret Toll , Christopher J. Hughes , Dan Baum , Elmoustapha Ould-Ahmed-Vall , Raanan Sade , Robert Valentine , Mark J. Charney , Alexander F. Heinecke
IPC: G06F9/30
CPC classification number: G06F9/30145 , G06F9/30032 , G06F9/30036 , G06F9/30109
Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.
-
公开(公告)号:US20230385059A1
公开(公告)日:2023-11-30
申请号:US18449651
申请日:2023-08-14
Applicant: Intel Corporation
Inventor: Raanan Sade , Simon Rubanovich , Amit Gradstein , Zeev Sperber , Alexander Heinecke , Robert Valentine , Mark J. Charney , Bret Toll , Jesus Corbal , Elmoustapha Ould-Ahmed-Vall
CPC classification number: G06F9/3001 , G06F9/30145 , G06F9/3005 , G06F9/30036 , G06F9/383 , G06F9/3016 , G06F9/30109 , G06F9/30123 , G06F9/30076 , G06F9/3824 , G06F9/30043
Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (M,N) of the identified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the identified first source matrix by a corresponding nibble of a doubleword element (K,N) of the identified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element (M,N).
-
公开(公告)号:US11714648B2
公开(公告)日:2023-08-01
申请号:US17549221
申请日:2021-12-13
Applicant: Intel Corporation
Inventor: Bret Toll , Christopher J. Hughes , Dan Baum , Elmoustapha Ould-Ahmed-Vall , Raanan Sade , Robert Valentine , Mark J. Charney , Alexander F. Heinecke
IPC: G06F9/30
CPC classification number: G06F9/30145 , G06F9/30032 , G06F9/30036 , G06F9/30109
Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.
-
公开(公告)号:US20220019438A1
公开(公告)日:2022-01-20
申请号:US17335377
申请日:2021-06-01
Applicant: Intel Corporation
Inventor: Raanan Sade , Simon Rubanovich , Amit Gradstein , Zeev Sperber , Alexander Heinecke , Robert Valentine , Mark J. Charney , Bret Toll , Jesus Corbal , Elmoustapha Ould-Ahmed-Vall , Menachem Adelman , Eyal Hadas
IPC: G06F9/30
Abstract: Embodiments detailed herein relate to systems and methods to zero a tile register pair. In one example, a processor includes decode circuitry to decode a matrix pair zeroing instruction having fields for an opcode and an identifier to identify a destination matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded matrix pair zeroing instruction to zero every element of a left matrix and a right matrix of the identified destination matrix.
-
公开(公告)号:US11036499B2
公开(公告)日:2021-06-15
申请号:US16613537
申请日:2017-06-30
Applicant: Intel Corporation
Inventor: Venkateswara R. Madduri , Elmoustapha Ould-Ahmed-Vall , Robert Valentine , Jesus Corbal , Mark J. Charney , Carl Murray , Milind Girkar , Bret Toll
Abstract: Embodiments of systems, apparatuses, and methods for performing controllable sine and/or cosine operations in a processor are described. For example, execution circuitry executes a decoded instruction to compute at least a real output value and an imaginary output value based on at least a cosine calculation and a sine calculation, the cosine and sine calculations each based on an index value from a packed data source operand, add the index value with an index increment value from the packed data source operand to create an updated index value, and store the real output value, the imaginary output value, and the updated index value to a packed data destination operand.
-
公开(公告)号:US11023235B2
公开(公告)日:2021-06-01
申请号:US15858947
申请日:2017-12-29
Applicant: Intel Corporation
Inventor: Raanan Sade , Simon Rubanovich , Amit Gradstein , Zeev Sperber , Alexander Heinecke , Robert Valentine , Mark J. Charney , Bret Toll , Jesus Corbal , Elmoustapha Ould-Ahmed-Vall , Menachem Adelman , Eyal Hadas
IPC: G06F9/30
Abstract: Embodiments detailed herein relate to systems and methods to zero a tile register pair. In one example, a processor includes decode circuitry to decode a matrix pair zeroing instruction having fields for an opcode and an identifier to identify a destination matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded matrix pair zeroing instruction to zero every element of a left matrix and a right matrix of the identified destination matrix.
-
-
-
-
-
-
-
-
-