UTILIZING PIPELINE REGISTERS AS INTERMEDIATE STORAGE
    41.
    Patent application
    Legal status: In force

    Publication number: US20150324196A1

    Publication date: 2015-11-12

    Application number: US14275047

    Filing date: 2014-05-12

    Abstract: In one example, a method includes, responsive to receiving, by a processing unit, one or more instructions requesting that a first value be moved from a first general purpose register (GPR) to a third GPR and that a second value be moved from a second GPR to a fourth GPR: copying, by an initial logic unit and during a first clock cycle, the first value to an initial pipeline register; copying, by the initial logic unit and during a second clock cycle, the second value to the initial pipeline register; copying, by a final logic unit and during a third clock cycle, the first value from a final pipeline register to the third GPR; and copying, by the final logic unit and during a fourth clock cycle, the second value from the final pipeline register to the fourth GPR.
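
    The cycle-by-cycle timing in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented design: the two-stage depth, the function name `move_via_pipeline`, and the choice to carry the destination name alongside each value are all hypothetical simplifications (real hardware would track the destination in control logic).

```python
def move_via_pipeline(gprs, moves, depth=2):
    """Simulate GPR-to-GPR moves that use pipeline registers as the
    only intermediate storage.

    gprs  : dict mapping GPR name -> value
    moves : list of (source GPR, destination GPR) pairs
    depth : assumed number of pipeline registers (hypothetical)
    Returns (updated GPR dict, number of clock cycles used).
    """
    gprs = dict(gprs)
    pipeline = [None] * depth      # pipeline[0] = initial, pipeline[-1] = final
    pending = list(moves)
    remaining = len(moves)
    cycle = 0
    while remaining:
        cycle += 1
        # Final logic unit sees the final register as it stood
        # at the start of this cycle.
        leaving = pipeline[-1]
        # Values advance one pipeline register per cycle.
        for i in range(depth - 1, 0, -1):
            pipeline[i] = pipeline[i - 1]
        # Initial logic unit copies the next source value in.
        if pending:
            src, dst = pending.pop(0)
            pipeline[0] = (dst, gprs[src])
        else:
            pipeline[0] = None
        # Final logic unit writes the value to its destination GPR.
        if leaving is not None:
            dst, value = leaving
            gprs[dst] = value
            remaining -= 1
    return gprs, cycle
```

    With two moves and the assumed depth of two, the sketch reads the sources in cycles 1 and 2 and writes the destinations in cycles 3 and 4, matching the order given in the abstract. In this model the same mechanism would also swap two GPRs without a temporary GPR, because both sources are read before either destination is written.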

    TECHNIQUES FOR SERIALIZED EXECUTION IN A SIMD PROCESSING SYSTEM
    42.
    Patent application
    Legal status: Under examination (published)

    Publication number: US20150317157A1

    Publication date: 2015-11-05

    Application number: US14268215

    Filing date: 2014-05-02

    CPC classification number: G06F9/3851 G06F9/3887

    Abstract: A SIMD processor may be configured to determine one or more active threads from a plurality of threads, select one active thread from the one or more active threads, and perform a divergent operation on the selected active thread. The divergent operation may be a serial operation.
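
    One way to picture the select-and-serialize step is the following sketch. It assumes the active threads are given as a bitmask and that the lowest-numbered active lane is selected each time; both choices, the repetition until no active lanes remain, and the name `serialize_divergent` are hypothetical and are not taken from the application.

```python
def serialize_divergent(active_mask, lane_args, op):
    """Run `op` for one active lane at a time.

    active_mask : int whose set bits mark the active threads (lanes)
    lane_args   : per-lane arguments, indexed by lane number
    op          : the divergent (serial) operation to apply
    Returns a dict mapping lane number -> result, in execution order.
    """
    results = {}
    remaining = active_mask
    while remaining:
        # Select one active thread: the lowest set bit of the mask.
        lane = (remaining & -remaining).bit_length() - 1
        # Perform the divergent operation for that thread only.
        results[lane] = op(lane_args[lane])
        # Clear the selected bit and move on to the next active thread.
        remaining &= remaining - 1
    return results
```

    For example, with mask `0b1010` only lanes 1 and 3 are processed, one after the other, while inactive lanes are never touched.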

    Dynamic wave pairing
    44.
    Granted patent

    Publication number: US11954758B2

    Publication date: 2024-04-09

    Application number: US17652478

    Filing date: 2022-02-24

    CPC classification number: G06T1/20 G06F9/505

    Abstract: This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for dynamic wave pairing. A graphics processor may allocate one or more GPU workloads to one or more wave slots of a plurality of wave slots. The graphics processor may select a first execution slot of a plurality of execution slots for executing the one or more GPU workloads. The selection may be based on one of a plurality of granularities. The graphics processor may execute, at the selected first execution slot, the one or more GPU workloads at the one of the plurality of granularities.

    GPU wave-to-wave optimization
    45.
    Granted patent

    Publication number: US11928754B2

    Publication date: 2024-03-12

    Application number: US17658433

    Filing date: 2022-04-07

    CPC classification number: G06T1/20 G06T15/005 G06T15/80

    Abstract: This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for GPU wave-to-wave optimization. A graphics processor may execute a shader program for a first wave associated with a draw call or a compute kernel. The graphics processor may identify at least one first indication for the first wave associated with the draw call or the compute kernel. The graphics processor may store the at least one first indication for the first wave to a memory location. The graphics processor may execute the shader program for at least one second wave associated with the draw call or the compute kernel. The execution of the shader program for the at least one second wave may be based on the shader program for the at least one second wave reading the memory location to retrieve the at least one first indication.
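
    The store-then-read pattern described here can be sketched as follows. This is an illustrative model only: the abstract does not say what the "indication" contains, so the sketch assumes it is a value that is costly to derive and identical for every wave of the draw call. The names `make_shader`, `expensive_setup`, and the `"indication"` key are hypothetical, and a plain dict stands in for the memory location.

```python
def make_shader(draw_call_param):
    """Build a toy shader plus a counter of how often setup work ran."""
    stats = {"expensive_calls": 0}

    def expensive_setup():
        # Stand-in for work whose result is the same for every wave.
        stats["expensive_calls"] += 1
        return sum(range(draw_call_param))

    def shader(wave, memory):
        # Later waves read the memory location to retrieve the indication.
        indication = memory.get("indication")
        if indication is None:
            # First wave: derive the indication and store it.
            indication = expensive_setup()
            memory["indication"] = indication
        return [x * indication for x in wave]

    return shader, stats


shader, stats = make_shader(4)
memory = {}  # stand-in for the shared memory location
outputs = [shader(wave, memory) for wave in [[1, 2], [3, 4]]]
```

    In this sketch the second wave skips the setup work entirely because it finds the stored indication, which is the kind of saving the abstract appears to target.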

    Bin filtering
    48.
    Granted patent

    Publication number: US11600002B2

    Publication date: 2023-03-07

    Application number: US16892096

    Filing date: 2020-06-03

    Abstract: Methods, systems, and devices for graphics processing are described. A device may receive an image including a set of pixels. The device may render a first subset of pixels in each bin of a set of bins during a first rendering pass, and defer rendering a second subset of pixels and a third subset of pixels in each bin of the set of bins during the first rendering pass. The second subset of pixels may include edge pixels and the third subset of pixels may be between the first subset of pixels and the second subset of pixels. The device may render the second subset of pixels and the third subset of pixels in each bin of the set of bins during a second rendering pass based on rendering the first subset of pixels. The device may then output the image based on the first and second rendering pass.
