Method and apparatus for multithreaded processing of data in a programmable graphics processor
    31.
    Granted invention patent (status: in force)

    Publication No.: US07015913B1

    Publication date: 2006-03-21

    Application No.: US10608346

    Filing date: 2003-06-27

    Abstract: A graphics processor and method for executing a graphics program as a plurality of threads where each sample to be processed by the program is assigned to a thread. Although threads share processing resources within the programmable graphics processor, the execution of each thread can proceed independent of any other threads. For example, instructions in a second thread are scheduled for execution while execution of instructions in a first thread are stalled waiting for source data. Consequently, a first received sample (assigned to the first thread) may be processed after a second received sample (assigned to the second thread). A benefit of independently executing each thread is improved performance because a stalled thread does not prevent the execution of other threads.
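
The out-of-order behavior described in this abstract can be illustrated with a small Python sketch. It is not the patented design: the `Thread` fields, the one-instruction-per-cycle issue model, and the `data_ready_at` stall model are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    sample_id: int          # sample assigned to this thread
    remaining: int          # instructions left to execute
    data_ready_at: int = 0  # cycle when source data arrives (assumed stall model)


def run(threads):
    """Issue at most one instruction per cycle from a thread that is not stalled.

    Returns the sample ids in the order their threads complete.
    """
    completed = []
    pending = list(threads)
    cycle = 0
    while pending:
        # Pick the earliest-received thread whose source data is available.
        ready = next((t for t in pending if t.data_ready_at <= cycle), None)
        if ready is not None:
            ready.remaining -= 1
            if ready.remaining == 0:
                pending.remove(ready)
                completed.append(ready.sample_id)
        cycle += 1
    return completed


first = Thread(sample_id=0, remaining=2, data_ready_at=5)  # stalls on source data
second = Thread(sample_id=1, remaining=2)                  # ready immediately
order = run([first, second])
print(order)
```

Here the second thread finishes while the first is still waiting for its data, so the first-received sample is processed last.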

    Method, apparatus and article of manufacture for a vertex attribute buffer in a graphics processor
    35.
    Granted invention patent (status: in force)

    Publication No.: US06515671B1

    Publication date: 2003-02-04

    Application No.: US09454525

    Filing date: 1999-12-06

    IPC classification: G06T1/20

    CPC classification: G06T1/60 G06T15/005

    Abstract: A method, apparatus and article of manufacture are provided for managing vertex data in a vertex buffer. First, vertex data is received and stored in the vertex buffer. Thereafter, the vertex data is outputted from the vertex buffer to a processing module. During operation, a plurality of command bits is passed from the vertex buffer for determining a manner in which the vertex data is inputted and processed in the input buffer of the processing module. Such command bits are received from a command bit source. Further, a plurality of mode bits indicative of a status of a plurality of modes of process operations is passed. Such mode bits are received from a mode bit source. The mode bits are adapted for determining a manner in which the vertex data is processed in the processing module.

    Dispatching of instructions for execution by heterogeneous processing engines
    37.
    Granted invention patent (status: in force)

    Publication No.: US09304775B1

    Publication date: 2016-04-05

    Application No.: US11935266

    Filing date: 2007-11-05

    IPC classification: G06F9/38

    Abstract: An embodiment of a computing system is configured to process data using a multithreaded SIMD architecture that includes heterogeneous processing engines to execute a program. The program is constructed of various program instructions. A first type of the program instructions can only be executed by a first type of processing engine and a second type of program instructions can only be executed by a second type of processing engine. A third type of program instructions can be executed by the first and the second type of processing engines. An instruction dispatcher is configured to identify and remove program instruction execution conflicts for the heterogeneous processing engines to improve instruction execution throughput.
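
The abstract does not say how the dispatcher removes conflicts. As a rough illustration only, the Python sketch below assumes one simple policy: instructions restricted to a single engine claim it first, and instructions that can run on either engine fill whatever is still idle. The engine names, instruction-type labels, and `dispatch` function are hypothetical.

```python
ENGINE_A, ENGINE_B = "A", "B"

# Engines able to execute each instruction type (type labels are hypothetical).
CAPABLE = {
    "a_only": {ENGINE_A},
    "b_only": {ENGINE_B},
    "either": {ENGINE_A, ENGINE_B},
}


def dispatch(candidates):
    """Assign at most one instruction to each engine for one cycle.

    candidates: list of (thread_id, instr_type) in priority order,
                with one candidate per thread.
    Returns a dict mapping engine name to the thread_id it will execute.
    """
    assignment = {}
    # Pass 1: restricted instructions claim the only engine they can use.
    for thread_id, kind in candidates:
        engines = CAPABLE[kind]
        if len(engines) == 1:
            (engine,) = engines
            assignment.setdefault(engine, thread_id)
    # Pass 2: flexible instructions take any engine that is still idle.
    for thread_id, kind in candidates:
        if len(CAPABLE[kind]) > 1 and thread_id not in assignment.values():
            for engine in (ENGINE_A, ENGINE_B):
                if engine not in assignment:
                    assignment[engine] = thread_id
                    break
    return assignment


result = dispatch([(0, "either"), (1, "a_only")])
print(result)
```

A strictly in-order dispatcher would hand engine A to thread 0 and leave thread 1 waiting; under the assumed policy the flexible instruction is steered to engine B so both issue in the same cycle.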

    Parallel array architecture for a graphics processor
    39.
    Granted invention patent (status: in force)

    Publication No.: US08730249B2

    Publication date: 2014-05-20

    Application No.: US13269462

    Filing date: 2011-10-07

    Abstract: A parallel array architecture for a graphics processor includes a multithreaded core array including a plurality of processing clusters, each processing cluster including at least one processing core operable to execute a pixel shader program that generates pixel data from coverage data; a rasterizer configured to generate coverage data for each of a plurality of pixels; and pixel distribution logic configured to deliver the coverage data from the rasterizer to one of the processing clusters in the multithreaded core array. A crossbar coupled to each of the processing clusters is configured to deliver pixel data from the processing clusters to a frame buffer having a plurality of partitions.

    Generating clip state for a batch of vertices

    Publication No.: US08384736B1

    Publication date: 2013-02-26

    Application No.: US12579348

    Filing date: 2009-10-14

    IPC classification: G09G5/00

    CPC classification: G06T15/30

    Abstract: One embodiment of the present invention sets forth a technique for generating a batch clip state stored in clip state machine (CSM) associated with a batch of vertices. Per-vertex clip state is generated for each vertex in the batch of vertices based on the position of each vertex relative to each clip plane. For a given vertex, per-vertex clip state indicates whether the vertex is inside or outside each of the one or more clip planes. The per-vertex clip states of all the vertices in the batch of vertices are coalesced into a batch clip state by determining whether each vertex in the batch of vertices is inside every clip plane, each vertex is outside at least one clip plane or neither. The batch clip state is stored in the CSM associated with the thread group that processes the batch of vertices that can be accessed by further stages of the graphics pipeline.
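
The coalescing step in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the plane representation `(a, b, c, d)`, the convention that "inside" means `a*x + b*y + c*z + d >= 0`, the bitmask encoding, and all names are hypothetical.

```python
from enum import Enum


class BatchClipState(Enum):
    ALL_INSIDE = "all_inside"    # every vertex is inside every clip plane
    ALL_OUTSIDE = "all_outside"  # every vertex is outside at least one clip plane
    MIXED = "mixed"              # neither of the above


def vertex_clip_state(vertex, planes):
    """Per-vertex clip state as a bitmask: bit i is set if the vertex is outside plane i.

    Assumed convention: a plane is (a, b, c, d) and a point is inside
    when a*x + b*y + c*z + d >= 0.
    """
    x, y, z = vertex
    mask = 0
    for i, (a, b, c, d) in enumerate(planes):
        if a * x + b * y + c * z + d < 0:
            mask |= 1 << i
    return mask


def batch_clip_state(vertices, planes):
    """Coalesce the per-vertex clip states of a batch into one batch clip state."""
    masks = [vertex_clip_state(v, planes) for v in vertices]
    if all(m == 0 for m in masks):
        return BatchClipState.ALL_INSIDE
    if all(m != 0 for m in masks):
        return BatchClipState.ALL_OUTSIDE
    return BatchClipState.MIXED


# Two planes bounding the slab -1 <= x <= 1.
planes = [(1, 0, 0, 1), (-1, 0, 0, 1)]
inside = batch_clip_state([(0, 0, 0), (0.5, 0, 0)], planes)
outside = batch_clip_state([(2, 0, 0), (-3, 0, 0)], planes)
mixed = batch_clip_state([(0, 0, 0), (2, 0, 0)], planes)
print(inside, outside, mixed)
```

In the `outside` case the two vertices fall outside different planes, which still matches the abstract's wording that each vertex is outside at least one clip plane.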