-
公开(公告)号:US20200097291A1
公开(公告)日:2020-03-26
申请号:US16140196
申请日:2018-09-24
Applicant: Intel Corporation
Inventor: CHRISTOPHER J. HUGHES , BRET TOLL , ALEXANDER HEINECKE , DAN BAUM , ELMOUSTAPHA OULD-AHMED-VALL , RAANAN SADE , ROBERT VALENTINE , MARK CHARNEY
Abstract: An apparatus and method for tile-based gather and scatter operations. For example, one embodiment of a processor comprises: a destination tile register to store a 2-D arrangement of data elements; a first source tile register to store indices associated with the data elements; instruction fetch circuitry to fetch a tile gather instruction comprising operands identifying the first source tile register and the destination tile register; a decoder to decode the tile gather instruction; and execution circuitry to determine a plurality of system memory addresses based on the indices from the first source tile register and to load the data elements from the system memory addresses to the destination tile register.
-
公开(公告)号:US20190196828A1
公开(公告)日:2019-06-27
申请号:US15850248
申请日:2017-12-21
Applicant: Intel Corporation
Inventor: VENKATESWARA MADDURI , CARL MURRAY , ELMOUSTAPHA OULD-AHMED-VALL , MARK CHARNEY , ROBERT VALENTINE , JESUS CORBAL , MILIND GIRKAR , BRET TOLL
CPC classification number: G06F9/30145 , G06F9/30101 , G06F17/16
Abstract: An apparatus and method for performing signed fractional multiplication of packed data elements. For example one embodiment of a processor comprises: a decoder to decode an instruction; a first source register to store a first plurality of packed signed word data elements; a second source register to store a second plurality of packed signed word data elements; a control register to store a rounding control value to indicate a rounding mode; execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to concurrently multiply each of the packed signed word data elements of the first plurality with a corresponding packed signed word data element of the second plurality to generate a plurality of signed doubleword products; conversion circuitry to convert the plurality of signed doubleword products to a plurality of fractional signed words, the conversion circuitry including rounding circuitry to round the signed doubleword products in accordance with the rounding mode indicated by the rounding control value to generate the plurality of fractional signed words; and a destination register to store the plurality of fractional signed words as packed signed word fractional data elements in specified data element positions within the destination register.
-