-
Publication No.: EP3702940A1
Publication date: 2020-09-02
Application No.: EP20152535.9
Filing date: 2020-01-17
Applicant: Intel Corporation
IPC classification: G06F17/16
Abstract: The present disclosure is directed to systems and methods for decomposing systolic array circuitry to provide a plurality of N x N systolic sub-array circuits, apportioning a first tensor or array into a plurality of N x M first input arrays, and apportioning a second tensor or array into a plurality of M x N second input arrays. Systolic array control circuitry transfers corresponding ones of the first input arrays and second input arrays to a respective one of the plurality of N x N systolic sub-array circuits. As the elements included in the first input array and the elements included in the second input array are transferred to the systolic sub-array, the systolic sub-array performs one or more mathematical operations using the first and the second input arrays. The systems and methods beneficially improve the usage of the systolic array circuitry thereby advantageously reducing the number of clock cycles needed to perform a given number of calculations.
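The apportioning described in this abstract can be illustrated in software. The following is a minimal sketch, not the claimed circuitry: it assumes the mathematical operation is matrix multiplication, that the matrix dimensions divide evenly by N and M, and it models each N x N systolic sub-array as an ordinary function. The names split_tiles, sub_array_multiply, and tiled_matmul are hypothetical and do not appear in the publication.

```python
def split_tiles(matrix, tile_rows, tile_cols):
    """Apportion a matrix into tile_rows x tile_cols blocks keyed by (block_row, block_col).

    Assumes the matrix dimensions are exact multiples of the tile dimensions.
    """
    rows, cols = len(matrix), len(matrix[0])
    assert rows % tile_rows == 0 and cols % tile_cols == 0
    tiles = {}
    for i in range(0, rows, tile_rows):
        for j in range(0, cols, tile_cols):
            tiles[(i // tile_rows, j // tile_cols)] = [
                row[j:j + tile_cols] for row in matrix[i:i + tile_rows]
            ]
    return tiles


def sub_array_multiply(a_tile, b_tile):
    """Software stand-in for one N x N sub-array: (N x M) times (M x N) gives N x N."""
    n, m = len(a_tile), len(b_tile)
    return [
        [sum(a_tile[r][k] * b_tile[k][c] for k in range(m)) for c in range(n)]
        for r in range(n)
    ]


def tiled_matmul(a, b, n, m):
    """Multiply a by b by sending matching N x M and M x N tiles to a sub-array."""
    a_tiles = split_tiles(a, n, m)   # first input arrays, each N x M
    b_tiles = split_tiles(b, m, n)   # second input arrays, each M x N
    row_blocks = len(a) // n
    inner_blocks = len(a[0]) // m
    col_blocks = len(b[0]) // n
    result = [[0] * len(b[0]) for _ in range(len(a))]
    for i in range(row_blocks):
        for j in range(col_blocks):
            for k in range(inner_blocks):
                # Each call models one sub-array consuming a pair of input arrays.
                partial = sub_array_multiply(a_tiles[(i, k)], b_tiles[(k, j)])
                for r in range(n):
                    for c in range(n):
                        result[i * n + r][j * n + c] += partial[r][c]
    return result
```

In this sketch the tile pairs are processed one after another; in hardware, independent pairs could be dispatched to different sub-arrays at the same time, which is where the utilization gain described in the abstract would come from.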
-
Publication No.: EP2965194A1
Publication date: 2016-01-13
Application No.: EP13877084.7
Filing date: 2013-03-05
Applicant: Intel Corporation
Inventors: SASANKA, Ruchira; COOK, Jeffrey J.; DAS, Abhinav; BOBBA, Jayaram; GREENFIELD, Michael R.; SRINIVAS, Suresh
Abstract: Embodiments of computer-implemented methods, systems, computing devices, and computer-readable media (transitory and non-transitory) are described herein for analyzing execution of a plurality of executable instructions and, based on the analysis, providing an indication of a benefit to be obtained by vectorization of at least a subset of the plurality of executable instructions. In various embodiments, the analysis may include identification of the subset of the plurality of executable instructions suitable for conversion to one or more single-instruction multiple-data ("SIMD") instructions.
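The abstract does not say how the benefit indication is computed. As a purely illustrative sketch, one simple model is to compare the number of scalar operations executed against the number of SIMD operations that would replace them. The function name estimate_vectorization_benefit, the trip-count input, and the lane-count model are assumptions made here, not details from the publication.

```python
import math


def estimate_vectorization_benefit(loop_trip_counts, simd_lanes):
    """Return an idealized speedup ratio for a set of vectorizable loops.

    loop_trip_counts: iterations observed for each candidate loop (hypothetical input).
    simd_lanes: number of elements one SIMD instruction processes (assumed uniform).
    """
    scalar_ops = sum(loop_trip_counts)
    # Each loop needs ceil(trips / lanes) vector operations, including a partial last one.
    vector_ops = sum(math.ceil(trips / simd_lanes) for trips in loop_trip_counts)
    if vector_ops == 0:
        return 1.0  # nothing to vectorize, so no benefit
    return scalar_ops / vector_ops
```

Under this model, two loops with 8 and 16 iterations on 4 lanes give a ratio of 24 / 6 = 4.0. The model ignores memory bandwidth, alignment, and instruction latency, so it is at best an upper bound.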
-
Publication No.: EP3506108A1
Publication date: 2019-07-03
Application No.: EP18209352.6
Filing date: 2018-11-29
Applicant: Intel Corporation
Inventors: RAY, Joydeep; ASHBAUGH, Ben; SURTI, Prasoonkumar; RAMANI, Pradeep; HARIHARA, Rama; JUSTIN, Jerin C.; HUANG, Jing; CUI, Xiaoming; COSTA, Timothy B.; GONG, Ting; OULD-AHMED-VALL, Elmoustapha; BALASUBRAMANIAN, Kumar; THOMAS, Anil; ELIBOL, Oguz H.; BOBBA, Jayaram; ZHUANG, Guozhong; SUBRAMANIAN, Bhavani; KESKIN, Gokce; SAKTHIVEL, Chandrasekaran; POORNACHANDRAN, Rajesh
IPC classification: G06F12/02
Abstract: Embodiments are generally directed to compression in machine learning and deep learning processing. An embodiment of an apparatus for compression of untyped data includes a graphical processing unit (GPU) including a data compression pipeline, the data compression pipeline including a data port coupled with one or more shader cores, wherein the data port is to allow transfer of untyped data without format conversion, and a 3D compression/decompression unit to provide for compression of untyped data to be stored to a memory subsystem and decompression of untyped data from the memory subsystem.
-