Parallel array architecture for a graphics processor
    51.
    Granted invention patent
    Status: In force

    Publication No.: US08730249B2

    Publication date: 2014-05-20

    Application No.: US13269462

    Filing date: 2011-10-07

    Abstract: A parallel array architecture for a graphics processor includes a multithreaded core array including a plurality of processing clusters, each processing cluster including at least one processing core operable to execute a pixel shader program that generates pixel data from coverage data; a rasterizer configured to generate coverage data for each of a plurality of pixels; and pixel distribution logic configured to deliver the coverage data from the rasterizer to one of the processing clusters in the multithreaded core array. A crossbar coupled to each of the processing clusters is configured to deliver pixel data from the processing clusters to a frame buffer having a plurality of partitions.

    Alpha-to-coverage value determination using virtual samples
    52.
    Granted invention patent
    Status: In force

    Publication No.: US08669999B2

    Publication date: 2014-03-11

    Application No.: US12904935

    Filing date: 2010-10-14

    IPC classification: G09G5/00

    Abstract: One embodiment of the present invention sets forth a technique for converting alpha values into pixel coverage masks. Geometric coverage is sampled at a number of “real” sample positions within each pixel. Color and depth values are computed for each of these real samples. Fragment alpha values are used to determine an alpha coverage mask for the real samples and additional “virtual” samples, in which the number of bits set in the mask is proportional to the alpha value. An alpha-to-coverage mode uses the virtual samples to increase the number of transparency levels for each pixel compared with using only real samples. The alpha-to-coverage mode may be used in conjunction with virtual coverage anti-aliasing to provide higher-quality transparency for rendering anti-aliased images.
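
The proportionality between alpha and the number of set mask bits can be illustrated with a minimal Python sketch. The function names, the round-to-nearest rule, and the choice to set the lowest-order bits are assumptions made for illustration; the abstract does not say which bits are set or how rounding is done.

```python
def alpha_to_coverage_mask(alpha: float, real_samples: int, virtual_samples: int) -> int:
    """Return a bitmask over real + virtual samples whose popcount tracks alpha.

    Assumption: the lowest-order bits are set; a real design may dither
    which sample positions receive the set bits.
    """
    total = real_samples + virtual_samples
    clamped = min(max(alpha, 0.0), 1.0)
    bits_set = int(clamped * total + 0.5)  # round to the nearest whole sample
    return (1 << bits_set) - 1


def transparency_levels(sample_count: int) -> int:
    """Number of distinct coverage levels for a given sample count (0..N bits set)."""
    return sample_count + 1
```

Under these assumptions, 4 real samples alone give 5 transparency levels, while 4 real plus 4 virtual samples give 9, which is the increase the abstract refers to.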

    Threshold-based lossy reduction color compression
    53.
    Granted invention patent
    Status: In force

    Publication No.: US08605104B1

    Publication date: 2013-12-10

    Application No.: US12651357

    Filing date: 2009-12-31

    Abstract: One embodiment of the present invention sets forth a technique for compressing color data. Color data for a tile including multiple samples is compressed based on an equality comparison and a threshold comparison based on a programmable threshold value. The equality comparison is performed on a first portion of the color data that includes at least exponent and sign fields of floating point format values or high order bits of integer format values. The threshold comparison is performed on a second portion of the color data that includes mantissa fields of floating point format values or low order bits of integer format values. The equality comparison and threshold comparison are used to select either computed averages of the pixel components or the original color data as the output color data for the tile. When the threshold is set to zero, only tiles that can be compressed without loss are compressed.
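
The two comparisons can be sketched in Python for the integer-format case. This is a simplified illustration, not the patented method: it handles one color component per sample, assumes a hypothetical 4-bit low-order field, and interprets the threshold comparison as a bound on the spread of the low-order bits across the tile.

```python
def compress_tile(samples: list[int], threshold: int, low_bits: int = 4) -> tuple[bool, list[int]]:
    """Return (compressed, output) for one color component of a tile.

    Assumptions: integer samples, `low_bits` low-order bits are threshold-
    compared, and the remaining high-order bits must match exactly.
    """
    low_mask = (1 << low_bits) - 1
    high = [s >> low_bits for s in samples]
    low = [s & low_mask for s in samples]

    high_equal = all(h == high[0] for h in high)        # equality comparison
    within_threshold = max(low) - min(low) <= threshold  # threshold comparison

    if high_equal and within_threshold:
        # Replace every sample with the tile average (lossy unless all equal).
        return True, [sum(samples) // len(samples)]
    # Otherwise keep the original color data unchanged.
    return False, list(samples)
```

With `threshold=0` the sketch compresses only tiles whose samples are all identical, so the average equals every sample and no information is lost, matching the last sentence of the abstract.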

    Partial coverage layers for color compression
    54.
    Granted invention patent
    Status: In force

    Publication No.: US08488890B1

    Publication date: 2013-07-16

    Application No.: US12813912

    Filing date: 2010-06-11

    IPC classification: G06K9/46

    CPC classification: H04N19/96

    Abstract: One embodiment of the present invention sets forth a technique for compressing image data with high contrast between pixels within a tile and between samples within pixels without any data loss. Partial coverage layers are generated and written to a tile that includes multiple pixels without reading the existing image data that is stored for the tile. A partial coverage layer encodes image data, such as colors, and sub-pixel coverage information for each covered pixel in a tile. The use of partial coverage layers reduces the bandwidth used to store image data when a tile is not fully covered.

    Efficient memory translator with variable size cache line coverage
    55.
    Granted invention patent
    Status: In force

    Publication No.: US08341380B2

    Publication date: 2012-12-25

    Application No.: US12851483

    Filing date: 2010-08-05

    IPC classification: G06F12/00 G06F13/00

    Abstract: One embodiment of the present invention sets forth a system and method for supporting high-throughput virtual to physical address translation using compressed TLB cache lines with variable address range coverage. The amount of memory covered by a TLB cache line depends on the page size and page table entry (PTE) compression level. When a TLB miss occurs, a cache line is allocated with an assumed address range that may be larger or smaller than the address range of the PTE data actually returned. Subsequent requests that hit a cache line with a fill pending are queued until the fill completes. When the fill completes, the cache line's address range is set to the address range of the PTE data returned. Queued requests are replayed and any that fall outside the actual address range are reissued, potentially generating additional misses and fills.
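
The fill-and-replay behavior can be modeled with a small Python sketch. `TlbLine` and `complete_fill` are hypothetical names, and the model tracks only address ranges and queued virtual addresses; it is a sketch of the control flow described in the abstract, not of the hardware.

```python
from dataclasses import dataclass, field


@dataclass
class TlbLine:
    base: int                  # start of the covered virtual address range
    size: int                  # assumed size until the fill completes
    fill_pending: bool = True
    queued: list[int] = field(default_factory=list)  # requests waiting on the fill

    def covers(self, vaddr: int) -> bool:
        return self.base <= vaddr < self.base + self.size


def complete_fill(line: TlbLine, actual_base: int, actual_size: int) -> tuple[list[int], list[int]]:
    """Apply the returned PTE range, then replay the queued requests.

    Returns (hits, reissued): requests inside the actual range are served,
    the rest are reissued and may cause further misses and fills.
    """
    line.base, line.size = actual_base, actual_size
    line.fill_pending = False
    hits = [a for a in line.queued if line.covers(a)]
    reissued = [a for a in line.queued if not line.covers(a)]
    line.queued.clear()
    return hits, reissued
```

For example, if a line is allocated with an assumed 64 KiB range but the returned PTE data covers only 32 KiB, queued requests in the upper half are reissued.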

    Z-test result reconciliation with multiple partitions
    56.
    Granted invention patent
    Status: In force

    Publication No.: US08232991B1

    Publication date: 2012-07-31

    Application No.: US11934042

    Filing date: 2007-11-01

    IPC classification: G06T15/40 G09G5/36 G09G5/37

    CPC classification: G06T15/40

    Abstract: The current invention involves new systems and methods for computing per-sample post-z test coverage when the memory is organized in multiple partitions that may not match the number of shaders. Shaded pixels output by the shaders can be processed by one of several z raster operations units. The shading processing capability can be configured independent of the number of memory partitions and number of z raster operations units. The current invention also involves new systems and methods for using different z test modes with multiple render targets with a single or multiple memory partitions. Rendering performance may be improved by using an early z testing mode to eliminate non-visible samples prior to shading.

    Early Z testing for multiple render targets
    57.
    Granted invention patent
    Status: In force

    Publication No.: US08228328B1

    Publication date: 2012-07-24

    Application No.: US11934046

    Filing date: 2007-11-01

    IPC classification: G06T15/40 G09G5/36 G09G5/37

    CPC classification: G06T15/40

    Abstract: The current invention involves new systems and methods for computing per-sample post-z test coverage when the memory is organized in multiple partitions that may not match the number of shaders. Shaded pixels output by the shaders can be processed by one of several z raster operations units. The shading processing capability can be configured independent of the number of memory partitions and number of z raster operations units. The current invention also involves new systems and methods for using different z test modes with multiple render targets with a single or multiple memory partitions. Rendering performance may be improved by using an early z testing mode to eliminate non-visible samples prior to shading.

    Graphics rendering pipeline that supports early-Z and late-Z virtual machines
    58.
    Granted invention patent
    Status: In force

    Publication No.: US08207975B1

    Publication date: 2012-06-26

    Application No.: US11959441

    Filing date: 2007-12-18

    IPC classification: G06T1/20 G06T15/50 G06T15/60

    CPC classification: G06T15/005 G06T15/405

    Abstract: One embodiment of the present invention sets forth a graphics pipeline architecture for optimizing graphics rendering efficiency by advancing the Z-test operation prior to shading operations whenever possible, as determined by an upstream pipeline configuration unit. Each processing engine within the graphics pipeline maintains independent state for both early Z-mode and late Z-mode operations and also may maintain state common to both modes. The processing engines receive work transactions that include a Z-mode flag indicating whether the work transaction should be processed in late Z-mode or early Z-mode. The Z-mode flag is also used to dynamically route any resulting outbound data, so that the appropriate data flow for either early Z or late Z processing is dynamically constructed for each work transaction. The shader engine is advantageously relieved of unnecessary work whenever possible by discarding occluded samples whose z-values are not altered by shading operations before they enter the shader engine.
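
The per-transaction routing can be sketched as a lookup from the Z-mode flag to a stage order. The stage names and the single `shader_writes_depth` criterion are illustrative assumptions; the abstract says only that an upstream configuration unit makes the decision and that early Z applies when shading does not alter z-values.

```python
from enum import Enum


class ZMode(Enum):
    EARLY = "early"
    LATE = "late"


def choose_z_mode(shader_writes_depth: bool) -> ZMode:
    """Hypothetical configuration rule: early Z only if shading leaves z unchanged."""
    return ZMode.LATE if shader_writes_depth else ZMode.EARLY


def route(z_mode: ZMode) -> list[str]:
    """Return the stage order a work transaction follows for its Z-mode flag."""
    if z_mode is ZMode.EARLY:
        # Occluded samples are discarded before they reach the shader.
        return ["raster", "z_test", "shader", "color_write"]
    # Late Z: the shader runs first because it may change the z-value.
    return ["raster", "shader", "z_test", "color_write"]
```

Because the flag travels with each transaction, consecutive transactions can follow different data flows without reconfiguring the whole pipeline.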

    System and method for packing data in different formats in a tiled graphics memory
    59.
    Granted invention patent
    Status: In force

    Publication No.: US08059131B1

    Publication date: 2011-11-15

    Application No.: US12175706

    Filing date: 2008-07-18

    IPC classification: G06F12/02 G06T15/40 G09G5/39

    Abstract: A tiled graphics memory permits graphics data to be stored in different tile formats. One application is selecting a tile format optimized for the data generated for particular graphical surfaces in different rendering modes. Consequently, the tile format can be selected to optimize memory access efficiency and/or packing efficiency. In one embodiment, a first tile format stores pixel data in a format storing two different types of pixel data whereas a second tile format stores one type of pixel data. In one implementation, a z-only tile format is provided to store only z data but no stencil data. At least one other tile format is provided to store both z data and stencil data. In one implementation, z data and stencil data are stored in different portions of a tile to facilitate separate memory accesses of z and stencil data.
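
The difference between the two tile formats can be shown with a short Python sketch. The 32-bit z width, the 8-bit stencil width, and the function names are assumptions chosen for illustration; the abstract does not give field sizes.

```python
import struct


def pack_z_only_tile(z_values: list[int]) -> bytes:
    """Z-only format: every byte of the tile holds z data (assumed 32-bit values)."""
    return struct.pack(f"<{len(z_values)}I", *z_values)


def pack_z_stencil_tile(z_values: list[int], stencil_values: list[int]) -> bytes:
    """Z + stencil format: z and stencil live in separate portions of the tile.

    Keeping the two regions contiguous lets a reader fetch only the z portion
    or only the stencil portion (assumed 8-bit stencil values).
    """
    z_portion = struct.pack(f"<{len(z_values)}I", *z_values)
    stencil_portion = bytes(stencil_values)
    return z_portion + stencil_portion
```

In this layout a z-only pass reads the first `4 * len(z_values)` bytes and never touches the stencil bytes, which is the separate-access benefit the abstract describes.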

    Methods and systems for reusing memory addresses in a graphics system
    60.
    Granted invention patent
    Status: In force

    Publication No.: US07944452B1

    Publication date: 2011-05-17

    Application No.: US11552093

    Filing date: 2006-10-23

    IPC classification: G06F12/02 G06F12/10 G06F12/06

    Abstract: Methods and systems for reusing memory addresses in a graphics system are disclosed, so that instances of address translation hardware can be reduced. One embodiment of the present invention sets forth a method, which includes mapping a footprint in screen space to a group of contiguous physical memory locations in a memory system, determining a first physical memory address for a first transaction associated with the footprint, wherein the first physical memory address is within the group of contiguous physical memory locations, determining a second transaction that is also associated with the footprint, determining a set of least significant bits associated with the second transaction, and combining a portion of the first physical memory address with the set of least significant bits associated with the second transaction to generate a second physical memory address for the second transaction, thereby avoiding a second full address translation.
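
The address-combining step can be sketched in a few lines of Python. The function name and parameters are hypothetical, and the sketch assumes the footprint's contiguous block is aligned to `2**lsb_count` bytes so that the upper bits are shared by every address in the block.

```python
def reuse_translation(first_phys_addr: int, second_lsbs: int, lsb_count: int) -> int:
    """Build the second physical address without a second full translation.

    Keeps the upper bits of the already-translated first address and
    substitutes the least significant bits of the second transaction.
    """
    low_mask = (1 << lsb_count) - 1
    upper_bits = first_phys_addr & ~low_mask   # portion shared within the footprint
    return upper_bits | (second_lsbs & low_mask)
```

For instance, with an 8-bit offset field, a first address of `0x12345678` and second-transaction bits `0xAB` combine to `0x123456AB`.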
