Techniques for optimizing stencil buffers

    Publication number: US09098924B2

    Publication date: 2015-08-04

    Application number: US13942415

    Application date: 2013-07-15

    CPC classification number: G06T1/60 B41F15/34 G06T11/40 G06T15/005

    Abstract: One embodiment sets forth a method for associating each stencil value included in a stencil buffer with multiple fragments. Components within a graphics processing pipeline use a set of stencil masks to partition the bits of each stencil value. Each stencil mask selects a different subset of bits, and each fragment is strategically associated with both a stencil value and a stencil mask. Before performing stencil actions associated with a fragment, the raster operations unit performs stencil mask operations on the operands. No fragments are associated with both the same stencil mask and the same stencil value. Consequently, no fragments are associated with the same stencil bits included in the stencil buffer. Advantageously, by reducing the number of stencil bits associated with each fragment, certain classes of software applications may reduce the wasted memory associated with stencil buffers in which each stencil value is associated with a single fragment.
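
    The bit partitioning described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the 8-bit stencil value, the two bits per fragment, and the names locate, read_stencil and write_stencil are all assumptions chosen for illustration.

```python
# Hypothetical layout: each 8-bit stencil value is shared by four fragments,
# two bits apiece. The abstract does not fix these numbers.
BITS_PER_VALUE = 8
BITS_PER_FRAGMENT = 2
FRAGMENTS_PER_VALUE = BITS_PER_VALUE // BITS_PER_FRAGMENT

def locate(fragment):
    """Return (stencil value index, stencil mask, shift) for a fragment."""
    index, slot = divmod(fragment, FRAGMENTS_PER_VALUE)
    shift = slot * BITS_PER_FRAGMENT
    mask = ((1 << BITS_PER_FRAGMENT) - 1) << shift
    return index, mask, shift

def read_stencil(buffer, fragment):
    """Read only the bits that the fragment's mask selects."""
    index, mask, shift = locate(fragment)
    return (buffer[index] & mask) >> shift

def write_stencil(buffer, fragment, value):
    """Update the fragment's bits and leave its neighbours' bits untouched."""
    index, mask, shift = locate(fragment)
    # Clear only this fragment's bits, then merge the new masked value.
    buffer[index] = (buffer[index] & ~mask & 0xFF) | ((value << shift) & mask)

stencil_buffer = bytearray(2)  # two stencil values serve eight fragments
write_stencil(stencil_buffer, 0, 3)
write_stencil(stencil_buffer, 1, 2)
write_stencil(stencil_buffer, 5, 1)
```

    Because no two fragments share both a stencil value and a mask, the write for fragment 1 leaves the bits of fragment 0 intact even though both live in the same byte.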

    Dynamic partitioning of execution resources

    Publication number: US11307903B2

    Publication date: 2022-04-19

    Application number: US15885751

    Application date: 2018-01-31

    Abstract: Embodiments of the present invention set forth techniques for allocating execution resources to groups of threads within a graphics processing unit. A compute work distributor included in the graphics processing unit receives an indication from a process that a first group of threads is to be launched. The compute work distributor determines that a first subcontext associated with the process has at least one processor credit. In some embodiments, cooperative thread arrays (CTAs) may be launched even when there are no processor credits, if one of the texture processing clusters (TPCs) that was already acquired has sufficient space. The compute work distributor identifies a first processor included in a plurality of processors that has a processing load that is less than or equal to the processing loads associated with all other processors included in the plurality of processors. The compute work distributor launches the first group of threads to execute on the first processor.
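
    The credit check and least-loaded selection can be sketched roughly as follows. This is a simplification under stated assumptions: the Processor and Subcontext classes and the launch function are hypothetical, a credit is assumed to be spent when a subcontext first acquires a processor, and the optional path that launches without credits is left out.

```python
from dataclasses import dataclass, field

@dataclass
class Processor:
    name: str
    load: int = 0  # number of thread groups currently resident

@dataclass
class Subcontext:
    credits: int  # processor credits still available (assumed semantics)
    acquired: set = field(default_factory=set)

def launch(subctx, processors):
    """Place one thread group; return the chosen processor or None."""
    if subctx.credits < 1:
        return None  # simplified: without a credit nothing is launched
    # Pick a processor whose load is <= the load of every other processor.
    target = min(processors, key=lambda p: p.load)
    if target.name not in subctx.acquired:
        # Assumption: acquiring a new processor costs one credit.
        subctx.acquired.add(target.name)
        subctx.credits -= 1
    target.load += 1
    return target

processors = [Processor("tpc0", 2), Processor("tpc1", 0), Processor("tpc2", 1)]
subcontext = Subcontext(credits=1)
first = launch(subcontext, processors)   # goes to the least-loaded processor
second = launch(subcontext, processors)  # no credit left in this sketch
```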

    Techniques for interleaving surfaces
    Granted invention patent (in force)

    Publication number: US09355430B2

    Publication date: 2016-05-31

    Application number: US14033389

    Application date: 2013-09-20

    CPC classification number: G06T1/60

    Abstract: One embodiment sets forth a method for allocating memory to surfaces. A software application specifies surface data, including interleaving state data. Based on the interleaving state data, a surface access unit bloats addresses derived from discrete coordinates associated with the surface, creating a bloated virtual address space with a predictable pattern of addresses that do not correspond to data. Advantageously, by creating predictable regions of addresses that do not correspond to data, the software application may configure the surface to share physical memory space with one or more other surfaces. In particular, the software application may map the virtual address space together with one or more virtual address spaces corresponding to complementary data patterns to the same physical base address. And, by overlapping the virtual address spaces onto the same pages in physical address space, the physical memory may be more densely packed than by using prior-art allocation techniques.
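
    One way to picture the bloated address pattern is the sketch below. The bloat function and its chunk, ways and slot parameters are hypothetical stand-ins for the interleaving state data, which the abstract does not spell out.

```python
def bloat(offset, chunk, ways, slot):
    """Map a dense surface offset to a bloated virtual offset.

    Every `chunk` units of data are followed by a hole large enough for
    the other `ways - 1` surfaces; `slot` picks this surface's position
    inside each interleaved group.
    """
    block, within = divmod(offset, chunk)
    return block * chunk * ways + slot * chunk + within

CHUNK, WAYS = 4, 2  # assumed interleaving state, for illustration only
surface_a = [bloat(o, CHUNK, WAYS, slot=0) for o in range(8)]
surface_b = [bloat(o, CHUNK, WAYS, slot=1) for o in range(8)]
```

    Here surface_a occupies offsets 0 to 3 and 8 to 11, and surface_b occupies 4 to 7 and 12 to 15. The two patterns are complementary, so mapping both virtual address spaces to the same physical base address fills every offset exactly once.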

    Programmable blending via multiple pixel shader dispatches
    Granted invention patent (in force)

    Publication number: US09082212B2

    Publication date: 2015-07-14

    Application number: US13723972

    Application date: 2012-12-21

    CPC classification number: G06T15/005

    Abstract: Techniques are disclosed for dispatching pixel information in a graphics processing pipeline. A fragment processing unit in the graphics processing pipeline generates a pixel that includes multiple samples based on a portion of a graphics primitive received by a thread. The fragment processing unit calculates a set of source values, where each source value corresponds to a different sample of the pixel. The fragment processing unit retrieves a set of destination values from a render target, where each destination value corresponds to a different source value. The fragment processing unit blends each source value with a corresponding destination value to create a set of final values, and creates one or more dispatch messages to store the set of final values in a set of output registers. One advantage of the disclosed techniques is that pixel shader programs perform per-sample operations with increased efficiency.
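
    The per-sample blending step might look like the sketch below. The abstract does not name a blend equation, so the source-over blend on premultiplied colors is an assumption, as are the function names blend_premultiplied and blend_pixel.

```python
def blend_premultiplied(src, dst):
    """Source-over blend of two premultiplied (r, g, b, a) tuples."""
    inv = 1.0 - src[3]  # how much of the destination remains visible
    return tuple(s + d * inv for s, d in zip(src, dst))

def blend_pixel(source_samples, render_target_samples):
    """Blend each source value with the destination value of the same sample."""
    return [blend_premultiplied(s, d)
            for s, d in zip(source_samples, render_target_samples)]

# A pixel with two samples: one half-covered by red, one left untouched.
sources = [(0.5, 0.0, 0.0, 0.5), (0.0, 0.0, 0.0, 0.0)]
destinations = [(0.0, 0.0, 1.0, 1.0), (0.0, 1.0, 0.0, 1.0)]
final_values = blend_pixel(sources, destinations)
```

    Each entry of final_values corresponds to one sample of the pixel, mirroring the one-to-one pairing of source and destination values in the abstract.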
