HARDWARE INSTRUCTION SET TO REPLACE A PLURALITY OF ATOMIC OPERATIONS WITH A SINGLE ATOMIC OPERATION
    Invention application (under examination, published)

    Publication number: US20160139934A1

    Publication date: 2016-05-19

    Application number: US14543027

    Filing date: 2014-11-17

    Abstract: Systems and methods may process a single atomic operation. An instruction set may be generated to replace a plurality of atomic operations with a single atomic operation. The instruction set may include an accumulation instruction to compute a prefix sum for a plurality of initial values associated with a plurality of processing lanes to generate a plurality of accumulated values. The instruction set may also include a broadcast instruction to return a pre-existing value to be added with each of the plurality of accumulated values to generate a plurality of intermediate accumulated values. In one example, a graphics processor may execute the instruction set to process the single atomic operation.
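
The idea in this abstract can be illustrated with a small software model. The following is a minimal sketch, not the patented hardware instruction set: `SharedCounter`, `per_lane_atomics`, and `single_atomic` are hypothetical names, lanes are modeled as a Python list, and an inclusive prefix sum is assumed because the abstract does not say whether the sum is inclusive or exclusive.

```python
from itertools import accumulate


class SharedCounter:
    """Hypothetical stand-in for a shared memory location.

    Each fetch_add call models one atomic operation and is counted.
    """

    def __init__(self, value=0):
        self.value = value
        self.atomic_ops = 0

    def fetch_add(self, amount):
        old = self.value          # pre-existing value returned to the caller
        self.value += amount
        self.atomic_ops += 1
        return old


def per_lane_atomics(counter, lane_values):
    """Baseline: every lane issues its own atomic add."""
    return [counter.fetch_add(v) + v for v in lane_values]


def single_atomic(counter, lane_values):
    """Replace the per-lane atomics with one atomic add.

    Assumption: the accumulation step is an inclusive prefix sum.
    """
    if not lane_values:
        return []
    # Accumulation step: prefix sum over the lanes' initial values.
    accumulated = list(accumulate(lane_values))
    # One atomic add of the lane total returns the pre-existing value.
    pre_existing = counter.fetch_add(accumulated[-1])
    # Broadcast step: every lane adds the same pre-existing value
    # to its accumulated value.
    return [pre_existing + a for a in accumulated]
```

In this model both functions leave the counter in the same final state and give each lane the same result, but `single_atomic` touches the shared location once instead of once per lane.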

    METHOD AND APPARATUS FOR SUBDIVIDING SHADER WORKLOADS IN A GRAPHICS PROCESSOR FOR EFFICIENT MACHINE CONFIGURATION

    Publication number: US20190206110A1

    Publication date: 2019-07-04

    Application number: US15858396

    Filing date: 2017-12-29

    CPC classification number: G06T15/005 G06F9/38 G06T1/20 G06T2210/52

    Abstract: An apparatus and method for splitting shaders. For example, one embodiment of a method comprises: receiving a request for compilation of a shader in a graphics processing environment; determining whether there is sufficient work associated with the shader to justify splitting the shader into two or more blocks of program code; evaluating the program code of the shader to identify dependencies between the blocks of program code if there is sufficient work; subdividing the shader into the two or more blocks in accordance with the identified dependencies; and individually executing the two or more blocks of code on a graphics processor. In addition, one embodiment includes the operations of determining whether any of the regions that can be subdivided are likely to run faster with different machine configurations than if the shader is executed without being subdivided, and subdividing the shader only for those regions that are likely to run faster with different machine configurations.
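
The first three steps of this abstract (check for sufficient work, identify dependencies, subdivide) can be sketched in a toy model. Everything here is an assumption made for illustration: the `Instr` record, the `MIN_INSTRUCTIONS_TO_SPLIT` threshold, and the caller-supplied `cut` position are hypothetical, and the sketch does not model execution or the machine-configuration heuristic described in the last sentence.

```python
from dataclasses import dataclass

# Hypothetical threshold standing in for "sufficient work".
MIN_INSTRUCTIONS_TO_SPLIT = 4


@dataclass(frozen=True)
class Instr:
    dest: str    # name of the value this instruction writes
    srcs: tuple  # names of the values it reads


def split_shader(instrs, cut):
    """Split a toy shader into two blocks at index `cut`.

    Returns (blocks, dependencies), where dependencies is the set of
    values the second block reads from the first block. If there is too
    little work, or the cut is out of range, the shader is returned
    as a single block with no dependencies.
    """
    instrs = list(instrs)
    if len(instrs) < MIN_INSTRUCTIONS_TO_SPLIT or not 0 < cut < len(instrs):
        return [instrs], set()

    first, second = instrs[:cut], instrs[cut:]
    produced_by_first = {i.dest for i in first}

    defined_locally = set()
    dependencies = set()
    for instr in second:
        for src in instr.srcs:
            # A value read before the second block redefines it must
            # come from the first block.
            if src not in defined_locally and src in produced_by_first:
                dependencies.add(src)
        defined_locally.add(instr.dest)
    return [first, second], dependencies
```

For example, splitting the four instructions `a = f(x)`, `b = f(a)`, `c = f(y)`, `d = f(b, c)` at index 2 yields two blocks of two instructions each, with `b` as the only value passed between them.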
