Method and system for distributing work batches to processing units based on a number of enabled streaming multiprocessors
    11.
    Granted patent
    Method and system for distributing work batches to processing units based on a number of enabled streaming multiprocessors (in force)

    Publication number: US09594599B1

    Publication date: 2017-03-14

    Application number: US12579143

    Filing date: 2009-10-14

    Abstract: A work distribution unit distributes work batches to general processing clusters (GPCs) based on the number of streaming multiprocessors included in each GPC. Advantageously, each GPC receives an amount of work that is proportional to the amount of processing power afforded by the GPC. Embodiments include a method for distributing batches of processing tasks to two or more general processing clusters (GPCs), including the steps of updating a counter value for each of the two or more GPCs based on the number of enabled parallel processing units within each of the two or more GPCs, and distributing a batch of processing tasks to a first GPC of the two or more GPCs based on a counter value associated with the first GPC and based on a load signal received from the first GPC.
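The counter-based distribution described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the refill policy (reload every counter with its enabled SM count once all ready GPCs are exhausted), the boolean `busy` flag standing in for the load signal, and the names `Gpc` and `distribute` are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Gpc:
    enabled_sms: int      # number of enabled streaming multiprocessors
    counter: int = 0      # credits left in the current round (assumed scheme)
    busy: bool = False    # stand-in for the load signal (assumed to be boolean)


def distribute(gpcs, num_batches):
    """Return the index of the GPC chosen for each batch, in order."""
    order = []
    for _ in range(num_batches):
        # Only GPCs with at least one enabled SM and a clear load signal qualify.
        ready = [i for i, g in enumerate(gpcs) if g.enabled_sms > 0 and not g.busy]
        if not ready:
            raise RuntimeError("no GPC can accept work")
        # When every ready GPC has used its credits, start a new round:
        # each counter is reloaded with that GPC's enabled SM count.
        if all(gpcs[i].counter <= 0 for i in ready):
            for g in gpcs:
                g.counter = g.enabled_sms
        # Send the batch to the ready GPC with the most credits remaining.
        target = max(ready, key=lambda i: gpcs[i].counter)
        gpcs[target].counter -= 1
        order.append(target)
    return order


if __name__ == "__main__":
    clusters = [Gpc(enabled_sms=4), Gpc(enabled_sms=2), Gpc(enabled_sms=2)]
    print(Counter(distribute(clusters, 16)))
```

Under these assumptions, three GPCs with 4, 2 and 2 enabled SMs receive 8, 4 and 4 of 16 batches, so each GPC's share is proportional to its enabled SM count. In this sketch the load signals stay fixed during a call; real hardware would sample them continuously.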

    TIME SLICE PROCESSING OF TESSELLATION AND GEOMETRY SHADERS
    12.
    Patent application
    TIME SLICE PROCESSING OF TESSELLATION AND GEOMETRY SHADERS (in force)

    Publication number: US20130038620A1

    Publication date: 2013-02-14

    Application number: US13208256

    Filing date: 2011-08-11

    IPC classification: G09G5/00 G06T1/00

    CPC classification: G06T1/20

    Abstract: One embodiment of the present invention sets forth a technique for redistributing geometric primitives generated by tessellation and geometry shaders for processing by multiple graphics pipelines. Geometric primitives that are generated in a first processing cycle are collected and redistributed more evenly and in smaller tasks to the multiple graphics pipelines for vertex processing in a second processing cycle. The smaller tasks do not exceed the resource limits of a graphics pipeline and the per-vertex processing workloads of the graphics pipelines in the second cycle are balanced and make full use of resources. Therefore, the performance of the tessellation and geometry shaders is improved.
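The collect-and-redistribute step in this abstract can be sketched roughly as follows. The abstract does not say how tasks are sized or assigned, so the `max_task_size` limit, the round-robin assignment, and the function name `redistribute` are assumptions made for illustration only.

```python
def redistribute(first_cycle_outputs, num_pipelines, max_task_size):
    """Re-batch primitives from the first cycle into small, evenly spread tasks.

    first_cycle_outputs: one list of primitives per pipeline (possibly very uneven).
    Returns one list of tasks per pipeline for the second cycle.
    """
    if num_pipelines <= 0 or max_task_size <= 0:
        raise ValueError("num_pipelines and max_task_size must be positive")
    # Collect every primitive generated in the first cycle, preserving order.
    collected = [prim for output in first_cycle_outputs for prim in output]
    # Cut the stream into tasks that never exceed the per-pipeline limit.
    tasks = [collected[i:i + max_task_size]
             for i in range(0, len(collected), max_task_size)]
    # Deal the tasks out round-robin (assumed policy) to balance the load.
    assignment = [[] for _ in range(num_pipelines)]
    for n, task in enumerate(tasks):
        assignment[n % num_pipelines].append(task)
    return assignment


if __name__ == "__main__":
    # One pipeline amplified its input heavily, one produced nothing.
    uneven = [list(range(10)), [], [10, 11]]
    for pipeline, tasks in enumerate(redistribute(uneven, 3, 2)):
        print(pipeline, tasks)
```

In this example one pipeline produced ten primitives and another none, yet after redistribution each of the three pipelines holds two tasks of two primitives.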

    Distributing primitives to multiple rasterizers
    13.
    Granted patent
    Distributing primitives to multiple rasterizers (in force)

    Publication number: US09536341B1

    Publication date: 2017-01-03

    Application number: US12581746

    Filing date: 2009-10-19

    IPC classification: G06F15/80 G06T15/00

    CPC classification: G06T15/005 G06T2210/52

    Abstract: One embodiment of the present invention sets forth a technique for parallel distribution of primitives to multiple rasterizers. Multiple, independent geometry units perform geometry processing concurrently on different graphics primitives. A primitive distribution scheme delivers primitives from the multiple geometry units concurrently to multiple rasterizers at rates of multiple primitives per clock. The multiple, independent rasterizer units perform rasterization concurrently on one or more graphics primitives, enabling the rendering of multiple primitives per system clock.

    Hardware-managed virtual buffers using a shared memory for load distribution
    14.
    Granted patent
    Hardware-managed virtual buffers using a shared memory for load distribution (in force)

    Publication number: US08760460B1

    Publication date: 2014-06-24

    Application number: US12773712

    Filing date: 2010-05-04

    IPC classification: G06T1/60

    CPC classification: G06T1/60

    Abstract: One embodiment of the present invention sets forth a technique for using a shared memory to store hardware-managed virtual buffers. A circular buffer is allocated within a general-purpose multi-use cache for storage of primitive attribute data rather than having a dedicated buffer for the storage of the primitive attribute data. The general-purpose multi-use cache is also configured to store other graphics data since the space requirement for primitive attribute data storage is highly variable, depending on the number of attributes and the size of primitives. Entries in the circular buffer are allocated as needed and released and invalidated after the primitive attribute data has been consumed. An address to the circular buffer entry is transmitted along with primitive descriptors from object-space processing to the distributed processing in screen-space.
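The allocate, consume and invalidate lifecycle of circular buffer entries can be sketched in software as below. This is a simplified illustration, not the hardware design: it assumes fixed-size entries released in FIFO order, whereas the abstract notes that real attribute storage varies in size. The class name `CircularAttributeBuffer` and its methods are hypothetical.

```python
class CircularAttributeBuffer:
    """Ring of fixed-size entries holding primitive attribute data (assumed layout)."""

    def __init__(self, num_entries):
        if num_entries <= 0:
            raise ValueError("num_entries must be positive")
        self.entries = [None] * num_entries
        self.head = 0   # oldest live entry
        self.tail = 0   # next entry to allocate
        self.live = 0   # number of entries currently in use

    def allocate(self, attributes):
        """Store attributes and return the entry address, or None if the ring is full."""
        if self.live == len(self.entries):
            return None
        address = self.tail
        self.entries[address] = attributes
        self.tail = (self.tail + 1) % len(self.entries)
        self.live += 1
        return address

    def consume(self, address):
        """Read the oldest entry, then release and invalidate it."""
        if self.live == 0 or address != self.head:
            raise ValueError("entries must be consumed in allocation order")
        attributes = self.entries[address]
        self.entries[address] = None   # invalidate the consumed entry
        self.head = (self.head + 1) % len(self.entries)
        self.live -= 1
        return attributes


if __name__ == "__main__":
    ring = CircularAttributeBuffer(num_entries=2)
    # The address travels with the primitive descriptor to the consumer.
    descriptor = {"primitive_id": 7, "attr_address": ring.allocate({"color": (1, 0, 0)})}
    print(ring.consume(descriptor["attr_address"]))
```

The producer passes only the entry address alongside the primitive descriptor, and the consumer uses that address to fetch the attributes and free the entry. A full ring returns `None` from `allocate`, which a caller would treat as back-pressure.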

    Method and apparatus for transporting information to a graphic accelerator card
    18.
    Granted patent
    Method and apparatus for transporting information to a graphic accelerator card (in force)

    Publication number: US06313845B1

    Publication date: 2001-11-06

    Application number: US09345678

    Filing date: 1999-06-30

    IPC classification: G06F13/00

    CPC classification: G06T17/00 G06F9/3879 G06T1/20

    Abstract: A graphics request stream is transferred from a host processor to a graphics card via a host bus so that the stream traverses the host bus no more than once. To that end, the graphics card has a graphics card memory, and the host processor has a host memory configured in a first memory configuration. The graphics card memory may be configured in the first memory configuration, and the graphics request stream is received directly in a message from the host processor (via the host bus). Upon receipt by the graphics card, the graphics request stream is written to the graphics card memory.
