-
Publication number: US20170242699A1
Publication date: 2017-08-24
Application number: US15452479
Application date: 2017-03-07
Applicant: Intel Corporation
Inventor: PAUL CAPRIOLI, ABHAY S. KANHERE, JEFFREY J. COOK, MUAWYA M. AL-OTOOM
CPC classification number: G06F9/30036, G06F9/30014, G06F9/30032, G06F9/3012, G06F9/3887, G06F9/3893
Abstract: A vector reduction instruction is executed by a processor to provide efficient reduction operations on an array of data elements. The processor includes vector registers. Each vector register is divided into a plurality of lanes, and each lane stores the same number of data elements. The processor also includes execution circuitry that receives the vector reduction instruction to reduce the array of data elements stored in a source operand into a result in a destination operand using a reduction operator. Each of the source operand and the destination operand is one of the vector registers. Responsive to the vector reduction instruction, the execution circuitry applies the reduction operator to two of the data elements in each lane, and shifts one or more remaining data elements when there is at least one of the data elements remaining in each lane.
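Illustrative sketch (not taken from the patent text): the per-lane behavior summarized in the abstract can be modeled in software roughly as follows. This is a minimal Python simulation under several assumptions. Lanes are modeled as equal-length lists, the two elements combined are assumed to be the two lowest-indexed ones, the remaining elements are assumed to shift down by one position, and vacated slots are padded with a placeholder value. The names reduce_step, reduce_lanes, active, and fill are hypothetical and do not appear in the source.

```python
from operator import add


def reduce_step(lanes, op, active, fill=0):
    """Model one execution of the (hypothetical) reduction step.

    In each lane, combine the two lowest elements with `op`, then shift
    the remaining valid elements down by one position. `active` is the
    number of valid elements per lane before the step; `fill` pads the
    vacated slots and is an assumption of this sketch.
    """
    if active < 2:
        raise ValueError("need at least two valid elements per lane")
    out = []
    for lane in lanes:
        combined = op(lane[0], lane[1])
        remaining = lane[2:active]          # elements still to be reduced
        new_lane = [combined] + remaining   # remaining elements shifted down
        new_lane += [fill] * (len(lane) - len(new_lane))
        out.append(new_lane)
    return out


def reduce_lanes(lanes, op):
    """Repeat the step until each lane holds a single reduced value."""
    width = len(lanes[0])
    if any(len(lane) != width for lane in lanes):
        raise ValueError("every lane must hold the same number of elements")
    for active in range(width, 1, -1):
        lanes = reduce_step(lanes, op, active)
    return [lane[0] for lane in lanes]


if __name__ == "__main__":
    # Two lanes of four elements each, reduced with addition.
    print(reduce_lanes([[1, 2, 3, 4], [5, 6, 7, 8]], add))
```

In this model each step shortens the valid portion of every lane by one element, so a lane of n elements is fully reduced after n - 1 steps. Any associative binary operator (for example max) can be passed in place of add.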
-
Publication number: US20180060049A1
Publication date: 2018-03-01
Application number: US15615798
Application date: 2017-06-06
Applicant: Intel Corporation
Inventor: DAVID J. SAGER, RUCHIRA SASANKA, RON GABOR, SHLOMO RAIKIN, JOSEPH NUZMAN, LEEOR PELED, JASON A. DOMER, HO-SEOP KIM, YOUFENG WU, KOICHI YAMADA, TIN-FOOK NGAI, HOWARD H. CHEN, JAYARAM BOBBA, JEFFREY J. COOK, OMAR M. SHAIKH, SURESH SRINIVAS
Abstract: Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program into multiple parallel threads are described. In some embodiments, the systems and apparatuses execute a method of original code decomposition and/or generated thread execution.
-