Smart performance of spill fill data transfers in computing environments

    Publication No.: US10956359B2

    Publication date: 2021-03-23

    Application No.: US15678592

    Filing date: 2017-08-16

    Abstract: A mechanism is described for facilitating smart spill/fill data transfers in computing environments. A method of embodiments, as described herein, includes facilitating dividing a kernel into regions including low pressure regions and high pressure regions, where the low pressure regions are associated with low use of registers hosted by a processor of a computing device, while the high pressure regions are associated with high use of the registers. The method may further include transferring of data between memory and the registers based on at least one of the low pressure regions and the high pressure regions.
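
The region split described in this abstract can be illustrated with a minimal sketch. It assumes the compiler already has a per-instruction count of live registers and a fixed register budget; the function names, the threshold rule, and the spill/fill policy are hypothetical and are not taken from the patent.

```python
from itertools import groupby

def split_pressure_regions(live_counts, threshold):
    """Group consecutive instructions into (label, start, end) regions.

    live_counts[i] is the assumed number of live registers at instruction i;
    an instruction is "high" pressure when that count exceeds the threshold.
    """
    regions = []
    index = 0
    for is_high, run in groupby(live_counts, key=lambda n: n > threshold):
        length = len(list(run))
        regions.append(("high" if is_high else "low", index, index + length - 1))
        index += length
    return regions

def plan_transfers(regions):
    """Hypothetical policy: spill on entering a high region, fill on leaving it."""
    plan = []
    for label, start, end in regions:
        if label == "high":
            plan.append(("spill_before", start))
            plan.append(("fill_after", end))
    return plan

regions = split_pressure_regions([2, 3, 9, 10, 8, 3, 1], threshold=6)
# regions == [("low", 0, 1), ("high", 2, 4), ("low", 5, 6)]
plan = plan_transfers(regions)
# plan == [("spill_before", 2), ("fill_after", 4)]
```

Placing transfers at region boundaries rather than around each instruction is one plausible reading of "based on at least one of the low pressure regions and the high pressure regions"; the abstract itself does not state the policy.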

    Apparatus and method for widened SIMD execution within a constrained register file

    Publication No.: US11029960B2

    Publication date: 2021-06-08

    Application No.: US16214012

    Filing date: 2018-12-07

    Abstract: Apparatus and method for widened SIMD execution on a limited register file. For example, one embodiment of an apparatus comprises: instruction dispatch circuitry to dispatch instructions of a thread for execution, including a first instruction to indicate a start of a double execution instruction sequence and a second instruction to indicate an end of a double execution instruction sequence; and execution circuitry including single instruction multiple data (SIMD) circuitry, the execution circuitry to execute the double execution instruction sequence in a first pass using a first set of lanes of the SIMD circuitry and to execute the double execution instruction sequence in a second pass following the first pass using a second set of lanes of the SIMD circuitry.
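
The two-pass execution described in this abstract can be modelled in software as follows. This is only a sketch under stated assumptions: the hardware is taken to have `hw_width` physical lanes, the logical width is twice that, and the start and end marker instructions are represented simply by the boundaries of the `sequence` list. All names are hypothetical.

```python
def run_double_execution(sequence, data, hw_width):
    """Run every op in `sequence` over 2 * hw_width logical lanes in two passes.

    Pass 1 covers lanes [0, hw_width); pass 2 covers [hw_width, 2 * hw_width).
    The whole sequence completes on the first set of lanes before the
    second pass begins.
    """
    if len(data) != 2 * hw_width:
        raise ValueError("data must hold exactly 2 * hw_width lanes")
    result = list(data)
    for first_lane in (0, hw_width):              # pass 1, then pass 2
        lanes = range(first_lane, first_lane + hw_width)
        for op in sequence:                       # full sequence per pass
            for lane in lanes:
                result[lane] = op(result[lane])
    return result

# Two-instruction sequence: add 1, then double.
sequence = [lambda x: x + 1, lambda x: x * 2]
out = run_double_execution(sequence, list(range(8)), hw_width=4)
# out == [2, 4, 6, 8, 10, 12, 14, 16]
```

The point of the model is the loop order: the pass loop sits outside the instruction loop, so only one half of the lanes needs its operands resident while a pass runs.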

    Apparatus and method for efficiently accessing memory when performing a horizontal data reduction

    Publication No.: US10409571B1

    Publication date: 2019-09-10

    Application No.: US15922833

    Filing date: 2018-03-15

    Inventor: Marek Targowski

    Abstract: Apparatus and method for optimizing shader execution. For example, one embodiment of a graphics processing apparatus comprises: a plurality of execution units to execute shader programs; optimization detection circuitry and/or logic to identify one or more portions of shader program code to be optimized including one or more reduction operations which require read/write memory operations and associated barrier operations; and optimization circuitry and/or logic to optimize the shader program code by converting a plurality of the read/write memory operations to read/write register operations and removing one or more barrier operations to generate optimized shader program code; the execution units to execute the optimized shader program code.
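
The effect of the optimization described in this abstract can be sketched by comparing two forms of the same reduction. The first form goes through a shared array and needs a barrier after each step; the second keeps values in per-lane registers and exchanges them directly. The shuffle-style exchange, the function names, and the power-of-two input length are assumptions made for illustration, not details from the patent.

```python
def reduce_via_shared_memory(values):
    """Baseline: tree reduction through a shared array, one barrier per step."""
    shared = list(values)
    barriers = 0
    stride = len(shared) // 2
    while stride > 0:
        for lane in range(stride):
            shared[lane] += shared[lane + stride]   # memory read and write
        barriers += 1                               # lanes wait before the next step
        stride //= 2
    return shared[0], barriers

def reduce_in_registers(values):
    """Sketch of the optimized form: lanes read each other's registers directly."""
    regs = list(values)
    offset = len(regs) // 2
    while offset > 0:
        # Each lane adds the register of the lane `offset` positions above it;
        # lanes without such a neighbour add 0. No shared memory, no barrier.
        regs = [
            regs[lane] + (regs[lane + offset] if lane + offset < len(regs) else 0)
            for lane in range(len(regs))
        ]
        offset //= 2
    return regs[0], 0

values = [1, 2, 3, 4, 5, 6, 7, 8]
# reduce_via_shared_memory(values) == (36, 3)
# reduce_in_registers(values) == (36, 0)
```

Both functions return the same sum; the second element of each result counts the barriers the form would need, which is where the saving described in the abstract comes from.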
