-
公开(公告)号:US20200249949A1
公开(公告)日:2020-08-06
申请号:US16487766
申请日:2017-07-01
Applicant: Intel Corporation
Inventor: Robert VALENTINE , Menachem ADELMAN , Milind B. GIRKAR , Zeev SPERBER , Mark J. CHARNEY , Bret L. TOLL , Rinat RAPPOPORT , Jesus Corbal , Stanislav SHWARTSMAN , Dan BAUM , Igor YANOVER , Alexander F. HEINECKE , Barukh ZIV , Elmoustapha OULD-AHMED-VALL , Yuri GEBIL
Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in the form of decode circuitry to decode an instruction having fields for an opcode, a destination matrix operand identifier, and source memory information, and execution circuitry to execute the decoded instruction to load groups of strided data elements from memory into configured rows of the identified destination matrix operand to memory.
-
公开(公告)号:US20200249947A1
公开(公告)日:2020-08-06
申请号:US16487774
申请日:2017-07-01
Applicant: Intel Corporation
Inventor: Robert VALENTINE , Zeev SPERBER , Mark J. CHARNEY , Bret L. TOLL , Jesus CORBAL , Alexander HEINECKE , Barukh ZIV , Dan BAUM , Elmoustapha OULD-AHMED-VALL , Stanislav SHWARTSMAN
Abstract: Embodiments detailed herein relate to matrix operations. In particular, embodiment of broadcasting elements are described. For example, some embodiments describe broadcasting a scalar to all configured data element positons of a destination matrix (tile). For example, some embodiments describe broadcasting a row to all configured data element positons of a destination matrix (tile). For example, some embodiments describe broadcasting a column to all configured data element positons of a destination matrix (tile).
-
公开(公告)号:US20200241877A1
公开(公告)日:2020-07-30
申请号:US16487777
申请日:2017-07-01
Applicant: Intel Corporation
Inventor: Menachem ADELMAN , Robert VALENTINE , Zeev SPERBER , Mark J. CHARNEY , Bret L. TOLL , Rinat RAPPOPORT , Jesus CORBAL , Dan BAUM , Alexander F. HEINECKE , Elmoustapha OULD-AHMED-VALL , Yuri GEBIL , Raanan SADE
IPC: G06F9/30
Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile a set of 2-dimensional registers are discussed.
-
24.
公开(公告)号:US20190303142A1
公开(公告)日:2019-10-03
申请号:US15941531
申请日:2018-03-30
Applicant: Intel Corporation
Inventor: Raanan SADE , Thierry PONS , Amit GRADSTEIN , Zeev SPERBER , Mark J. CHARNEY , Robert VALENTINE , Eyal Oz-Sinay
Abstract: Disclosed embodiments relate to efficient complex vector multiplication. In one example, an apparatus includes execution circuitry, responsive to an instruction having fields to specify multiplier, multiplicand, and summand complex vectors, to perform two operations: first, to generate a double-even multiplicand by duplicating even elements of the specified multiplicand, and to generate a temporary vector using a fused multiply-add (FMA) circuit having A, B, and C inputs set to the specified multiplier, the double-even multiplicand, and the specified summand, respectively, and second, to generate a double-odd multiplicand by duplicating odd elements of the specified multiplicand, to generate a swapped multiplier by swapping even and odd elements of the specified multiplier, and to generate a result using a second FMA circuit having its even product negated, and having A, B, and C inputs set to the swapped multiplier, the double-odd multiplicand, and the temporary vector, respectively.
-
公开(公告)号:US20190286444A1
公开(公告)日:2019-09-19
申请号:US16184994
申请日:2018-11-08
Applicant: INTEL CORPORATION
Inventor: Rajiv KAPOOR , Ronen ZOHAR , Mark J. BUXTON , Zeev SPERBER , Koby GOTTLIEB
Abstract: A method and apparatus for including in processor instructions for performing logical-comparison and branch support operations on packed or unpacked data. In one embodiment, instruction decode logic decodes instructions for an execution unit to operate on packed data elements including logical comparisons. A register file including 128-bit packed data registers stores packed single-precision floating point (SPFP) and packed integer data elements. The logical comparisons may include comparison of SPFP data elements and comparison of integer data elements and setting at least one bit to indicate the results. Based on these comparisons, branch support actions are taken. Such branch support actions may include setting the at least one bit, which in turn may be utilized by a branching unit in response to a branch instruction. Alternatively, the branch support actions may include branching to an indicated target code location.
-
公开(公告)号:US20190286441A1
公开(公告)日:2019-09-19
申请号:US16271675
申请日:2019-02-08
Applicant: INTEL CORPORATION
Inventor: Seth ABRAHAM , Elmoustapha OULD-AHMED-VALL , Robert VALENTINE , Zeev SPERBER , Amit GRADSTEIN
Abstract: A method of an aspect includes receiving an instruction. The instruction indicates an integer stride, indicates an integer offset, and indicates a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four integers in numerical order with a smallest one of the at least four integers differing from zero by the integer offset and with all integers of the sequence in consecutive positions differing by the integer stride. Other methods, apparatus, systems, and instructions are disclosed.
-
公开(公告)号:US20240111533A1
公开(公告)日:2024-04-04
申请号:US18534012
申请日:2023-12-08
Applicant: Intel Corporation
Inventor: Menachem ADELMAN , Robert VALENTINE , Zeev SPERBER , Mark J. CHARNEY , Bret L. TOLL , Rinat RAPPOPORT , Jesus CORBAL , Dan BAUM , Alexander F. HEINECKE , Elmoustaha OULD-AHMED-VALL , Yuri GEBIL , Raanan SADE
CPC classification number: G06F9/30036 , G06F7/485 , G06F7/4876 , G06F7/762 , G06F9/3001 , G06F9/30032 , G06F9/30043 , G06F9/30109 , G06F9/30112 , G06F9/30134 , G06F9/30145 , G06F9/30149 , G06F9/3016 , G06F9/30185 , G06F9/30196 , G06F9/3818 , G06F9/3836 , G06F17/16 , G06F2212/454
Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile a set of 2-dimensional registers are discussed.
-
公开(公告)号:US20230069000A1
公开(公告)日:2023-03-02
申请号:US17463398
申请日:2021-08-31
Applicant: Intel Corporation
Inventor: Alexander HEINECKE , Menachem ADELMAN , Robert VALENTINE , Zeev SPERBER , Amit GRADSTEIN , Mark CHARNEY , Evangelos GEORGANAS , Dhiraj KALAMKAR , Christopher HUGHES , Cristina ANDERSON
IPC: G06F9/30
Abstract: Techniques for performing arithmetic operations on BF16 values are described. An exemplary instruction includes fields for an opcode, an identification of a location of a first packed data source operand, an identification of a location of a second packed data source operand, and an identification of location of a packed data destination operand, wherein the opcode is to indicate an arithmetic operation execution circuitry is to perform, for each data element position of the identified packed data source operands, the arithmetic operation on BF16 data elements in that data element position in BF16 format and store a result of each arithmetic operation into a corresponding data element position of the identified packed data destination operand.
-
公开(公告)号:US20230060146A1
公开(公告)日:2023-03-02
申请号:US17463390
申请日:2021-08-31
Applicant: Intel Corporation
Inventor: Menachem ADELMAN , Alexander HEINECKE , Robert VALENTINE , Zeev SPERBER , Amit GRADSTEIN , Mark CHARNEY , Evangelos GEORGANAS , Dhiraj KALAMKAR , Christopher HUGHES , Cristina ANDERSON
IPC: G06F9/30
Abstract: Techniques for BF16 classification or manipulation using single instructions are described. An exemplary instruction includes fields for an opcode, an identification of a location of a packed data source operand, an indication of one or more classification checks to perform, and an identification of a packed data destination operand, wherein the opcode is to indicate that execution circuitry is to perform, for each data element position of the packed data source operand, a classification according to the indicated one or more classification checks and store a result of the classification in a corresponding data element position of the destination operand.
-
30.
公开(公告)号:US20220326949A1
公开(公告)日:2022-10-13
申请号:US17845103
申请日:2022-06-21
Applicant: Intel Corporation
Inventor: Alexander F. HEINECKE , Robert VALENTINE , Mark J. CHARNEY , Raanan SADE , Menachem ADELMAN , Zeev SPERBER , Amit GRADSTEIN , Simon RUBANOVICH
Abstract: Disclosed embodiments relate to systems and methods for performing 16-bit floating-point vector dot product instructions. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first source, second source, and destination vectors, the opcode to indicate execution circuitry is to multiply N pairs of 16-bit floating-point formatted elements of the specified first and second sources, and accumulate the resulting products with previous contents of a corresponding single-precision element of the specified destination, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.
-
-
-
-
-
-
-
-
-