-
公开(公告)号:US20220100513A1
公开(公告)日:2022-03-31
申请号:US17134085
申请日:2020-12-24
Applicant: Intel Corporation
Inventor: CHRISTOPHER J. HUGHES , ALEXANDER HEINECKE , ROBERT VALENTINE , MENACHEM ADELMAN , EVANGELOS GEORGANAS , MARK CHARNEY
Abstract: Systems, methods, and apparatuses relating to one or more instructions that load data into a tile register and pad a row (or column) with a pad value from a padding circuit are described. In one embodiment, a system includes a matrix operations accelerator circuit comprising a two-dimensional grid of processing elements, a tile register that represents a two-dimensional matrix coupled to the matrix operations accelerator circuit, and a coupling to a memory, a padding circuit coupled to the tile register, and a hardware processor core including a decoder, of the hardware processor core coupled to the matrix operations accelerator circuit, to decode a single instruction into a decoded single instruction, the single instruction comprising a first field that identifies the tile register, a second field that identifies data elements in the memory, and an opcode, the opcode to indicate an execution circuit of the hardware processor core is to cause a load of the data elements from the memory into the tile register and the padding circuit to pad a proper subset of elements of the tile register with a same value, and the execution circuit of the hardware processor core to execute the decoded single instruction according to the opcode.
-
公开(公告)号:US20200097291A1
公开(公告)日:2020-03-26
申请号:US16140196
申请日:2018-09-24
Applicant: Intel Corporation
Inventor: CHRISTOPHER J. HUGHES , BRET TOLL , ALEXANDER HEINECKE , DAN BAUM , ELMOUSTAPHA OULD-AHMED-VALL , RAANAN SADE , ROBERT VALENTINE , MARK CHARNEY
Abstract: An apparatus and method for tile-based gather and scatter operations. For example, one embodiment of a processor comprises: a destination tile register to store a 2-D arrangement of data elements; a first source tile register to store indices associated with the data elements; instruction fetch circuitry to fetch a tile gather instruction comprising operands identifying the first source tile register and the destination tile register; a decoder to decode the tile gather instruction; and execution circuitry to determine a plurality of system memory addresses based on the indices from the first source tile register and to load the data elements from the system memory addresses to the destination tile register.
-
公开(公告)号:US20190227797A1
公开(公告)日:2019-07-25
申请号:US15879420
申请日:2018-01-24
Applicant: Intel Corporation
Inventor: ALEXANDER HEINECKE , DIPANKAR DAS , ROBERT VALENTINE , MARK CHARNEY
Abstract: An apparatus and method for performing multiply-accumulate operations. For example, one embodiment of a processor comprises: a decoder to decode instructions; a first source register to store a first plurality of packed words; a second source register to store a second plurality of packed words; a third source register to store a plurality of packed quadwords; execution circuitry to execute a first instruction, the execution circuitry comprising: extension circuitry to sign-extend or zero-extend the first and second plurality of packed words to generate a first and second plurality of doublewords corresponding to the first and second plurality of packed words; multiplier circuitry to multiply each of the first plurality of doublewords with a corresponding one of the second plurality of doublewords to generate a plurality of temporary products; adder circuitry to add at least a first set of the temporary products to generate a first temporary sum; accumulation circuitry to combine the first temporary sum with a first packed quadword value from a first quadword location in the third source register to generate a first accumulated quadword result; a destination register to store the first accumulated quadword result in the first quadword location.
-
-