Parallel array architecture for a graphics processor
    1.
    Granted invention patent
    Legal status: In force

    Publication No.: US08730249B2

    Publication date: 2014-05-20

    Application No.: US13269462

    Filing date: 2011-10-07

    Abstract: A parallel array architecture for a graphics processor includes a multithreaded core array including a plurality of processing clusters, each processing cluster including at least one processing core operable to execute a pixel shader program that generates pixel data from coverage data; a rasterizer configured to generate coverage data for each of a plurality of pixels; and pixel distribution logic configured to deliver the coverage data from the rasterizer to one of the processing clusters in the multithreaded core array. A crossbar coupled to each of the processing clusters is configured to deliver pixel data from the processing clusters to a frame buffer having a plurality of partitions.

    Screen compression
    2.
    Granted invention patent
    Legal status: In force

    Publication No.: US07342590B1

    Publication date: 2008-03-11

    Application No.: US10435073

    Filing date: 2003-05-09

    IPC classification: G06T9/00 G06K9/36

    Abstract: Methods, circuits, and apparatus for reducing memory bandwidth used by a graphics processor. Uncompressed tiles are read from a display buffer portion of a graphics memory and received by an encoder. The uncompressed tiles are compressed and written back to the graphics memory. When a tile is needed again before it has been modified, the compressed version is read from memory, uncompressed, and displayed. To reduce the number of unnecessary writes of compressed tiles to memory, a tile is only written to memory if it has remained static for some number of refresh cycles. Also, to prevent a large number of compressed tiles being written to the display buffer in one refresh cycle, the encoder can be throttled after a number of tiles have been written. Validity information can be stored for use by a CRTC. If a tile is updated, the validity information is updated such that invalid compressed data is not read from memory and displayed.
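
    The write policy described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the class and parameter names (CompressionScheduler, static_threshold, max_writes_per_refresh) and the specific threshold values are assumptions chosen for illustration. It models only the three rules the abstract states: a compressed tile is written only after the tile has stayed static for some number of refresh cycles, the encoder is throttled after a number of writes per refresh, and an update invalidates the compressed copy.

```python
from dataclasses import dataclass


@dataclass
class TileState:
    static_cycles: int = 0          # refresh cycles since the tile was last modified
    compressed_valid: bool = False  # validity bit a display controller would consult


class CompressionScheduler:
    """Hypothetical model of the abstract's write policy (names are assumptions)."""

    def __init__(self, num_tiles, static_threshold=2, max_writes_per_refresh=4):
        self.tiles = [TileState() for _ in range(num_tiles)]
        self.static_threshold = static_threshold
        self.max_writes_per_refresh = max_writes_per_refresh

    def mark_updated(self, index):
        # A modified tile must not be displayed from stale compressed data.
        tile = self.tiles[index]
        tile.static_cycles = 0
        tile.compressed_valid = False

    def refresh(self):
        """Simulate one refresh cycle; return indices of tiles written compressed."""
        written = []
        for i, tile in enumerate(self.tiles):
            if tile.compressed_valid:
                continue  # the compressed copy would be read instead
            tile.static_cycles += 1
            is_static = tile.static_cycles >= self.static_threshold
            under_limit = len(written) < self.max_writes_per_refresh
            if is_static and under_limit:  # throttle: cap writes per refresh
                tile.compressed_valid = True
                written.append(i)
        return written
```

    With six tiles, a threshold of two cycles, and a limit of four writes per refresh, the first refresh writes nothing, the second writes tiles 0 to 3, and the remaining two tiles are deferred to the third refresh.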

    Method and system for improving data coherency in a parallel rendering system
    4.
    Granted invention patent
    Legal status: In force

    Publication No.: US08379033B2

    Publication date: 2013-02-19

    Application No.: US13399458

    Filing date: 2012-02-17

    IPC classification: G06F15/80

    Abstract: A method and system for improving data coherency in a parallel rendering system is disclosed. Specifically, one embodiment of the present invention sets forth a method for managing a plurality of independently processed texture streams in a parallel rendering system that includes the steps of maintaining a time stamp for a group of tiles of work that are associated with each of the plurality of the texture streams and are associated with a specified area in screen space, and utilizing the time stamps to counter divergences in the independent processing of the plurality of texture streams.

    Interprocessor direct cache writes
    5.
    Granted invention patent
    Legal status: In force

    Publication No.: US08327071B1

    Publication date: 2012-12-04

    Application No.: US11939528

    Filing date: 2007-11-13

    IPC classification: G06F12/00

    CPC classification: G06F12/0815

    Abstract: In a multiprocessor system, level 2 caches are positioned on the memory side of a routing crossbar rather than on the processor side of the routing crossbar. This configuration permits the processors to store messages directly into each other's caches rather than into system memory or their own coherent caches. Therefore, inter-processor communication latency is reduced.

    Multiple simultaneous context architecture for rebalancing contexts on multithreaded processing cores upon a context change
    6.
    Granted invention patent
    Legal status: In force

    Publication No.: US08095782B1

    Publication date: 2012-01-10

    Application No.: US11763371

    Filing date: 2007-06-14

    CPC classification: G06F9/461 G06F9/5088

    Abstract: Graphics processing elements are capable of processing multiple contexts simultaneously, reducing the need to perform time consuming context switches compared with processing a single context at a time. Processing elements of a graphics processing pipeline may be configured to support all of the multiple contexts or only a portion of the multiple contexts. Each processing element may be allocated to process a particular context or a portion of the multiple contexts in order to simultaneously process more than one context. The allocation of processing elements to the multiple contexts may be determined dynamically in order to improve graphics processing throughput.

    Occlusion culling method and apparatus for graphics systems
    7.
    Granted invention patent
    Legal status: In force

    Publication No.: US06894689B1

    Publication date: 2005-05-17

    Application No.: US10658171

    Filing date: 2003-09-08

    IPC classification: G06T15/00 G06T15/40

    CPC classification: G06T15/405 G06T15/005

    Abstract: A system, method and computer program product are provided for avoiding reading z-values in a graphics pipeline. Initially, near z-values are stored which are each representative of a near z-value on an object in a region. Such region is defined by a tile and a coverage mask therein. Thereafter, the stored near z-values are compared with far z-values computed for other objects in the region. Such comparison indicates whether an object is visible in the region. Based on the comparison, z-values previously stored for image samples in the region are conditionally read from memory.
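
    One plausible reading of the near/far comparison in this abstract is sketched below in Python. It is an illustration under stated assumptions, not the patented implementation: smaller z is assumed to be nearer the viewer, each object is assumed to cover the whole region (the coverage mask is ignored), and the names Region and submit are hypothetical. The point it shows is that when an incoming object's far z is in front of the stored near z, the object is known to be visible and the per-sample z-values never need to be read from memory.

```python
from dataclasses import dataclass


@dataclass
class Region:
    # Nearest depth recorded for the region so far; infinity means "nothing drawn".
    near_z: float = float("inf")
    z_reads: int = 0  # counts how often stored per-sample z-values were fetched


def submit(region, obj_near_z, obj_far_z):
    """Classify an object covering the region (assumes smaller z is nearer)."""
    if obj_far_z < region.near_z:
        # Entire object lies in front of everything stored: no z read required.
        result = "visible"
    else:
        # Depth ranges overlap or the object is behind: per-sample test needed.
        region.z_reads += 1
        result = "needs_z_test"
    # Keeping the minimum is conservative: it stays a valid lower bound on the
    # stored depths even if the object later fails the per-sample test.
    region.near_z = min(region.near_z, obj_near_z)
    return result
```

    Submitting objects with depth ranges (0.5, 0.6), then (0.2, 0.3), then (0.1, 0.4) to an empty region triggers a z read only for the third object, whose far z of 0.4 lies behind the stored near z of 0.2.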

    Modified method and apparatus for improved occlusion culling in graphics systems
    8.
    Granted invention patent
    Legal status: In force

    Publication No.: US06646639B1

    Publication date: 2003-11-11

    Application No.: US09885665

    Filing date: 2001-06-19

    IPC classification: G06T15/00

    CPC classification: G06T15/405 G06T15/005

    Abstract: A system, method and computer program product are provided for avoiding reading z-values in a graphics pipeline. Initially, near z-values are stored which are each representative of a near z-value on an object in a region. Such region is defined by a tile and a coverage mask therein. Thereafter, the stored near z-values are compared with far z-values computed for other objects in the region. Such comparison indicates whether an object is visible in the region. Based on the comparison, z-values previously stored for image samples in the region are conditionally read from memory.

    System and method for structuring an A-buffer to support multi-sample anti-aliasing
    9.
    Granted invention patent
    Legal status: In force

    Publication No.: US08553041B1

    Publication date: 2013-10-08

    Application No.: US12208211

    Filing date: 2008-09-10

    Applicant: John M. Danskin

    Inventor: John M. Danskin

    IPC classification: G06T1/60

    Abstract: One embodiment of the present invention sets forth a technique for efficiently creating and accessing an A-Buffer that supports multi-sample compression techniques. The A-Buffer is organized in stacks of uniformly-sized tiles, wherein the tile size is selected to facilitate compression techniques. Each stack represents the samples included in a group of pixels. Each tile within a stack represents the set of sample data at a specific per-sample rendering order index that are associated with the group of pixels represented by the stack. Advantageously, each tile includes tile compression bits that enable the tile to maintain data using existing compression formats. As the A-Buffer is created, a corresponding stack compression buffer is also created. For each stack, the stack compression buffer includes a bit that indicates whether all of the tiles in the stack are similarly compressed and, consequently, whether the GPU may operate on the stack at an efficient per pixel granularity.
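
    The per-stack bit described at the end of this abstract can be sketched in a few lines of Python. The data layout here is an assumption made for illustration: each stack is modeled as a list of per-tile compression format tags, and the tag strings and the function name stack_compression_bits are hypothetical. The sketch derives one bit per stack that is set only when every tile in the stack uses the same format.

```python
def stack_compression_bits(stacks):
    """Return one boolean per stack: True when all tiles share one format tag.

    `stacks` is a list of stacks; each stack is a non-empty list of
    hypothetical per-tile compression format tags (for example "4:1").
    """
    return [len(set(stack)) == 1 for stack in stacks]


# Example input: the first stack is uniformly compressed, the second is mixed.
example_stacks = [
    ["4:1", "4:1", "4:1"],
    ["4:1", "none", "4:1"],
]
example_bits = stack_compression_bits(example_stacks)
```

    A consumer could check the bit first and take a per-pixel fast path only for stacks where it is set, falling back to per-sample processing otherwise.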

    METHOD AND SYSTEM FOR IMPROVING DATA COHERENCY IN A PARALLEL RENDERING SYSTEM
    10.
    Invention patent application
    Legal status: In force

    Publication No.: US20120147027A1

    Publication date: 2012-06-14

    Application No.: US13399458

    Filing date: 2012-02-17

    IPC classification: G09G5/00

    Abstract: A method and system for improving data coherency in a parallel rendering system is disclosed. Specifically, one embodiment of the present invention sets forth a method for managing a plurality of independently processed texture streams in a parallel rendering system that includes the steps of maintaining a time stamp for a group of tiles of work that are associated with each of the plurality of the texture streams and are associated with a specified area in screen space, and utilizing the time stamps to counter divergences in the independent processing of the plurality of texture streams.