Method and apparatus for subdividing shader workloads in a graphics processor for efficient machine configuration

    Publication No.: US10360717B1

    Publication date: 2019-07-23

    Application No.: US15858396

    Filing date: 2017-12-29

    Applicant: Intel Corporation

    IPC classification: G06F9/38 G06T1/20 G06T15/00

    Abstract: An apparatus and method for splitting shaders. For example, one embodiment of a method comprises: receiving a request for compilation of a shader in a graphics processing environment; determining whether there is sufficient work associated with the shader to justify splitting the shader into two or more blocks of program code; evaluating the program code of the shader to identify dependencies between the blocks of program code if there is sufficient work; subdividing the shader into the two or more blocks in accordance with the identified dependencies; and individually executing the two or more blocks of code on a graphics processor. In addition, one embodiment includes the operations of determining whether any of the regions that can be subdivided are likely to run faster with different machine configurations than if the shader is executed without being subdivided, and subdividing the shader only for those regions that are likely to run faster with different machine configurations.
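The decision flow in this abstract (check for sufficient work, identify dependencies, split only where a different machine configuration is likely to help) can be illustrated with a minimal Python sketch. This is not the patented implementation: the `Block` type, the `MIN_WORK_TO_SPLIT` threshold, the `preferred_config` label, and the read/write-set dependency test are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical threshold: below this total cost, splitting is not worthwhile.
MIN_WORK_TO_SPLIT = 100


@dataclass
class Block:
    """A candidate region of shader code (hypothetical model)."""
    name: str
    work: int                                   # assumed cost estimate
    reads: set = field(default_factory=set)     # values the block consumes
    writes: set = field(default_factory=set)    # values the block produces
    preferred_config: str = "default"           # assumed best machine configuration


def depends_on(later, earlier):
    """True if `later` must run after `earlier` (RAW, WAR, or WAW hazard)."""
    return bool(
        later.reads & earlier.writes
        or later.writes & earlier.reads
        or later.writes & earlier.writes
    )


def plan_split(blocks):
    """Return (groups, deps): groups of blocks to run as separate units, and
    for each group index the set of earlier group indices it depends on."""
    # Step 1: not enough work, so keep the shader as a single unit.
    if sum(b.work for b in blocks) < MIN_WORK_TO_SPLIT:
        return [list(blocks)], {0: set()}

    # Step 2: split only where the preferred machine configuration changes.
    groups = [[blocks[0]]]
    for b in blocks[1:]:
        if b.preferred_config != groups[-1][-1].preferred_config:
            groups.append([b])
        else:
            groups[-1].append(b)

    # Step 3: record dependencies between the resulting groups.
    deps = {}
    for j, later in enumerate(groups):
        deps[j] = {
            i for i in range(j)
            if any(depends_on(lb, eb) for lb in later for eb in groups[i])
        }
    return groups, deps
```

A scheduler could then run each group under its preferred configuration, starting a group only after every group listed in its `deps` entry has finished.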

    Techniques to manage execution of divergent shaders

    Publication No.: US11776195B2

    Publication date: 2023-10-03

    Application No.: US17463320

    Filing date: 2021-08-31

    Applicant: Intel Corporation

    IPC classification: G06T15/00 G06F9/48

    CPC classification: G06T15/005 G06F9/4887

    Abstract: Examples are described here that can be used to enable a main routine to request subroutines or other related code to be executed with other instantiations of the same subroutine or other related code for parallel execution. A sorting unit can be used to accumulate requests to execute instantiations of the subroutine. The sorting unit can request execution of a number of multiple instantiations of the subroutine corresponding to a number of lanes in a SIMD unit. A call stack can be used to share information to be accessed by a main routine after execution of the subroutine completes.
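The sorting-unit behavior described in this abstract, accumulating requests per subroutine and releasing a batch once it matches the SIMD lane count, can be sketched in Python as follows. The `SortingUnit` class, its method names, and the `flush` step for partially filled batches are assumptions made for illustration; the abstract does not specify how leftover requests are handled, and the call stack is not modeled here.

```python
from collections import defaultdict


class SortingUnit:
    """Hypothetical model: groups subroutine calls into SIMD-width batches."""

    def __init__(self, simd_lanes):
        if simd_lanes < 1:
            raise ValueError("simd_lanes must be at least 1")
        self.simd_lanes = simd_lanes
        self.pending = defaultdict(list)    # subroutine name -> queued arguments

    def request(self, subroutine, args):
        """Queue one instantiation. Returns (subroutine, batch) when a full
        batch of `simd_lanes` instantiations is ready, otherwise None."""
        queue = self.pending[subroutine]
        queue.append(args)
        if len(queue) == self.simd_lanes:
            batch = list(queue)
            queue.clear()
            return subroutine, batch
        return None

    def flush(self):
        """Assumed extension: hand back partially filled batches so they can
        still be executed, with some lanes left idle."""
        leftovers = {name: q for name, q in self.pending.items() if q}
        self.pending = defaultdict(list)
        return leftovers
```

With four lanes, divergent callers that request `shade_a` and `shade_b` in interleaved order produce a `shade_a` batch only when the fourth `shade_a` request arrives, so each dispatched batch runs the same code on every lane.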

    Techniques to manage execution of divergent shaders

    Publication No.: US11107263B2

    Publication date: 2021-08-31

    Application No.: US16190021

    Filing date: 2018-11-13

    Applicant: Intel Corporation

    IPC classification: G06T15/00 G06F9/48

    Abstract: Examples are described here that can be used to enable a main routine to request subroutines or other related code to be executed with other instantiations of the same subroutine or other related code for parallel execution. A sorting unit can be used to accumulate requests to execute instantiations of the subroutine. The sorting unit can request execution of a number of multiple instantiations of the subroutine corresponding to a number of lanes in a SIMD unit. A call stack can be used to share information to be accessed by a main routine after execution of the subroutine completes.

    Apparatus and method for a programmable depth stencil graphics pipeline stage

    Publication No.: US10573055B2

    Publication date: 2020-02-25

    Application No.: US15693084

    Filing date: 2017-08-31

    Applicant: Intel Corporation

    Abstract: An apparatus and method for programmable depth stencil pipeline stage and shading. For example, one embodiment of a graphics processing apparatus comprises: a rasterizer to generate a plurality of pixel blocks, one or more of which overlap one or more primitives; programmable depth stencil circuitry to perform depth stencil tests on the pixels which overlap the one or more primitives to identify pixels which pass the depth stencil tests; and thread dispatch circuitry to dispatch pixel shader threads to perform pixel shading operations on those pixels which pass the depth stencil tests, the thread dispatch circuitry including thread dispatch recombine logic to combine pixels which have passed the depth stencil test from multiple pixel blocks into a set of pixel shader threads to be executed concurrently on single instruction multiple data (SIMD) hardware.
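The recombine step in this abstract, packing pixels that survive the depth stencil test from several pixel blocks into full SIMD thread groups, can be sketched in Python. This is only an illustration under stated assumptions: the function names, the `FAR_DEPTH` default, the "less than" depth comparison, and the `(x, y, z)` pixel tuples are hypothetical, and the stencil part of the test is omitted.

```python
FAR_DEPTH = 1.0  # assumed depth for pixels not yet written to the buffer


def depth_test(pixel_depth, buffer_depth):
    """Assumed compare function: a pixel passes if it is closer than the
    depth already stored for its position."""
    return pixel_depth < buffer_depth


def recombine(pixel_blocks, depth_buffer, simd_width):
    """Test every pixel, then pack the survivors from all blocks into
    groups of at most `simd_width` pixels for concurrent shading."""
    survivors = []
    for block in pixel_blocks:
        for x, y, z in block:
            if depth_test(z, depth_buffer.get((x, y), FAR_DEPTH)):
                depth_buffer[(x, y)] = z        # keep the closer depth
                survivors.append((x, y))
    # Pack survivors across block boundaries so lanes are not left idle.
    return [
        survivors[i:i + simd_width]
        for i in range(0, len(survivors), simd_width)
    ]
```

If one block contributes a single surviving pixel and a second block contributes three, dispatching per block would issue two partially filled thread groups on 4-wide hardware, whereas the packed result is a single full group.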