-
公开(公告)号:US20210406018A1
公开(公告)日:2021-12-30
申请号:US16914347
申请日:2020-06-27
Applicant: INTEL CORPORATION
Inventor: Menachem Adelman , Robert Valentine , Barukh Ziv , Yaroslav Pollak , Gideon Stupp , Amit Gradstein , Simon Rubanovich , Zeev Sperber , Mark Charney , Christopher Hughes , Alexander Heinecke
Abstract: Systems, methods, and apparatuses relating to one or more instructions that utilize direct paths for loading data into a tile from a vector register and/or storing data from a tile into a vector register are described. In one embodiment, a system includes a matrix operations accelerator circuit comprising a two-dimensional grid of processing elements, a plurality of registers that represents a two-dimensional matrix coupled to the two-dimensional grid of processing elements, and a coupling to a cache; and a hardware processor core comprising: a vector register, a decoder to decode a single instruction into a decoded single instruction, the single instruction including a first field that identifies the two-dimensional matrix, a second field that identifies a set of elements of the two-dimensional matrix, and a third field that identifies the vector register, and an execution circuit to execute the decoded single instruction to cause a store of the set of elements from the plurality of registers that represents the two-dimensional matrix into the vector register by a coupling of the hardware processor core to the matrix operations accelerator circuit that is separate from the coupling to the cache.
-
公开(公告)号:US20210406012A1
公开(公告)日:2021-12-30
申请号:US16914317
申请日:2020-06-27
Applicant: Intel Corporation
Inventor: Menachem Adelman , Robert Valentine , Gideon Stupp , Yaroslav Pollak , Amit Gradstein , Simon Rubanovich , Zeev Sperber , Mark J. Charney , Christopher J. Hughes , Alexander F. Heinecke , Evangelos Georganas
Abstract: Embodiments for loading and storing matrix data with datatype conversion are disclosed. In an embodiment, a processor includes a decoder and execution circuitry. The decoder is to decode an instruction having a format including an opcode field to specify an opcode, a first destination operand field to specify a first destination matrix location, and a first source operand field to specify a first source matrix location. The execution circuitry is to, in response to the decoded instruction, convert data elements from a plurality of source element locations of a first source matrix specified by the first source matrix location from a first datatype to a second datatype to generate a plurality of converted data elements and to store each of the plurality of converted data elements in one of a plurality of destination element locations in a first destination matrix specified by the first destination matrix location.
-