Method, apparatus and article of manufacture for a vertex attribute buffer in a graphics processor
    51.
    Type: Granted patent
    Status: In force

    Publication No.: US06515671B1

    Publication date: 2003-02-04

    Application No.: US09454525

    Filing date: 1999-12-06

    IPC classification: G06T1/20

    CPC classification: G06T1/60 G06T15/005

    Abstract: A method, apparatus and article of manufacture are provided for managing vertex data in a vertex buffer. First, vertex data is received and stored in the vertex buffer. Thereafter, the vertex data is output from the vertex buffer to a processing module. During operation, a plurality of command bits is passed from the vertex buffer for determining a manner in which the vertex data is input and processed in the input buffer of the processing module. Such command bits are received from a command bit source. Further, a plurality of mode bits indicative of a status of a plurality of modes of process operations is passed. Such mode bits are received from a mode bit source. The mode bits are adapted for determining a manner in which the vertex data is processed in the processing module.

    Dispatching of instructions for execution by heterogeneous processing engines
    52.
    Type: Granted patent
    Status: In force

    Publication No.: US09304775B1

    Publication date: 2016-04-05

    Application No.: US11935266

    Filing date: 2007-11-05

    IPC classification: G06F9/38

    Abstract: An embodiment of a computing system is configured to process data using a multithreaded SIMD architecture that includes heterogeneous processing engines to execute a program. The program is constructed of various program instructions. A first type of the program instructions can only be executed by a first type of processing engine and a second type of program instructions can only be executed by a second type of processing engine. A third type of program instructions can be executed by the first and the second type of processing engines. An instruction dispatcher is configured to identify and remove program instruction execution conflicts for the heterogeneous processing engines to improve instruction execution throughput.
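
The conflict-removal idea in this abstract can be illustrated with a minimal Python sketch. It assumes two engines and one issue slot per engine per cycle. The names `Instr`, `COMPAT`, and `dispatch`, and the two-pass policy itself, are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical engine names and instruction classes (illustration only).
ENGINE_A, ENGINE_B = "A", "B"
COMPAT = {
    "a_only": {ENGINE_A},            # first type: engine A only
    "b_only": {ENGINE_B},            # second type: engine B only
    "either": {ENGINE_A, ENGINE_B},  # third type: either engine
}


@dataclass
class Instr:
    name: str
    kind: str  # key into COMPAT


def dispatch(ready: List[Instr]) -> Dict[str, Optional[Instr]]:
    """Pick at most one instruction per engine for a single issue cycle."""
    slots: Dict[str, Optional[Instr]] = {ENGINE_A: None, ENGINE_B: None}

    # Pass 1: place instructions that can run on only one engine.
    for ins in ready:
        engines = COMPAT[ins.kind]
        if len(engines) == 1:
            (eng,) = engines
            if slots[eng] is None:
                slots[eng] = ins

    # Pass 2: flexible instructions fill whichever engine is still free.
    for ins in ready:
        if len(COMPAT[ins.kind]) == 2:
            for eng in (ENGINE_A, ENGINE_B):
                if slots[eng] is None:
                    slots[eng] = ins
                    break

    return slots
```

Given ready instructions `i0` (either engine) and `i1` (engine A only), a strictly in-order assignment could hand engine A to `i0` and leave `i1` waiting. The sketch instead issues `i1` on A and `i0` on B in the same cycle.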

    Credit-based streaming multiprocessor warp scheduling
    53.
    Type: Granted patent
    Status: In force

    Publication No.: US09189242B2

    Publication date: 2015-11-17

    Application No.: US12885299

    Filing date: 2010-09-17

    IPC classification: G06F9/50 G06F9/38

    Abstract: One embodiment of the present invention sets forth a technique for ensuring cache access instructions are scheduled for execution in a multi-threaded system to improve cache locality and system performance. A credit-based technique may be used to control instruction-by-instruction scheduling for each warp in a group so that the group of warps is processed uniformly. A credit is computed for each warp and the credit contributes to a weight for each warp. The weight is used to select instructions for the warps that are issued for execution.

    Parallel array architecture for a graphics processor
    54.
    Type: Granted patent
    Status: In force

    Publication No.: US08730249B2

    Publication date: 2014-05-20

    Application No.: US13269462

    Filing date: 2011-10-07

    Abstract: A parallel array architecture for a graphics processor includes a multithreaded core array including a plurality of processing clusters, each processing cluster including at least one processing core operable to execute a pixel shader program that generates pixel data from coverage data; a rasterizer configured to generate coverage data for each of a plurality of pixels; and pixel distribution logic configured to deliver the coverage data from the rasterizer to one of the processing clusters in the multithreaded core array. A crossbar coupled to each of the processing clusters is configured to deliver pixel data from the processing clusters to a frame buffer having a plurality of partitions.

    Programmable graphics processor for multithreaded execution of programs
    55.
    Type: Granted patent
    Status: In force

    Publication No.: US08405665B2

    Publication date: 2013-03-26

    Application No.: US13466043

    Filing date: 2012-05-07

    CPC classification: G06T15/005

    Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.

    Generating clip state for a batch of vertices

    Publication No.: US08384736B1

    Publication date: 2013-02-26

    Application No.: US12579348

    Filing date: 2009-10-14

    IPC classification: G09G5/00

    CPC classification: G06T15/30

    Abstract: One embodiment of the present invention sets forth a technique for generating a batch clip state stored in a clip state machine (CSM) associated with a batch of vertices. Per-vertex clip state is generated for each vertex in the batch of vertices based on the position of each vertex relative to each clip plane. For a given vertex, per-vertex clip state indicates whether the vertex is inside or outside each of the one or more clip planes. The per-vertex clip states of all the vertices in the batch of vertices are coalesced into a batch clip state by determining whether each vertex in the batch of vertices is inside every clip plane, each vertex is outside at least one clip plane, or neither. The batch clip state is stored in the CSM associated with the thread group that processes the batch of vertices, where it can be accessed by further stages of the graphics pipeline.
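
The coalescing step described in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented implementation: a clip plane is assumed to be a 4-tuple `(a, b, c, d)`, a vertex is assumed to be inside when `a*x + b*y + c*z + d*w >= 0`, and the function names and the three result labels are hypothetical.

```python
from typing import List, Sequence, Tuple

Plane = Tuple[float, float, float, float]   # (a, b, c, d), assumed convention
Vertex = Tuple[float, float, float, float]  # homogeneous (x, y, z, w)


def vertex_clip_state(v: Vertex, planes: Sequence[Plane]) -> int:
    """Return a bitmask; bit i is set when the vertex lies outside plane i."""
    state = 0
    for i, plane in enumerate(planes):
        signed_distance = sum(pc * vc for pc, vc in zip(plane, v))
        if signed_distance < 0:
            state |= 1 << i
    return state


def batch_clip_state(vertices: Sequence[Vertex], planes: Sequence[Plane]) -> str:
    """Coalesce per-vertex clip states into one of three batch states."""
    states: List[int] = [vertex_clip_state(v, planes) for v in vertices]
    if all(s == 0 for s in states):
        return "ALL_INSIDE"    # every vertex is inside every clip plane
    if all(s != 0 for s in states):
        return "ALL_OUTSIDE"   # every vertex is outside at least one plane
    return "MIXED"             # neither condition holds
```

For example, with the two planes `(1, 0, 0, 1)` and `(-1, 0, 0, 1)` (that is, `-w <= x <= w`), the vertex `(0, 0, 0, 1)` has state 0 and the vertex `(2, 0, 0, 1)` has bit 1 set, so a batch containing both is `MIXED`.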

    INSTRUCTION EXECUTION BASED ON OUTSTANDING LOAD OPERATIONS
    57.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20120079241A1

    Publication date: 2012-03-29

    Application No.: US13242562

    Filing date: 2011-09-23

    IPC classification: G06F9/30 G06F9/312

    Abstract: One embodiment of the present invention sets forth a technique for scheduling thread execution in a multi-threaded processing environment. A two-level scheduler maintains a small set of active threads called strands to hide function unit pipeline latency and local memory access latency. The strands are a subset of a larger set of pending threads that is also maintained by the two-level scheduler. Pending threads are promoted to strands and strands are demoted to pending threads based on latency characteristics, such as whether outstanding load operations have been executed. The longer latency of the pending threads is hidden by selecting strands for execution. When the latency for a pending thread has expired, the pending thread may be promoted to a strand and begin (or resume) execution. When a strand encounters a latency event, the strand may be demoted to a pending thread while the latency is incurred.
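
A rough Python sketch of the promote/demote cycle described in this abstract follows. It assumes a fixed strand capacity, round-robin issue among strands, and latencies expressed as cycle counts. The class `TwoLevelScheduler` and its methods are hypothetical names chosen for illustration, and none of these policy details come from the application.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Optional


class TwoLevelScheduler:
    def __init__(self, threads: Iterable[str], max_strands: int) -> None:
        self.max_strands = max_strands
        self.strands: Deque[str] = deque()  # small active set
        # Pending thread -> cycle at which its latency has expired.
        self.pending: Dict[str, int] = {t: 0 for t in threads}
        self.cycle = 0

    def _promote(self) -> None:
        """Move pending threads whose latency has expired into free strand slots."""
        ready = [t for t, c in self.pending.items() if c <= self.cycle]
        for t in ready:
            if len(self.strands) >= self.max_strands:
                break
            del self.pending[t]
            self.strands.append(t)

    def step(self) -> Optional[str]:
        """Advance one cycle and return the strand that issues, if any."""
        self._promote()
        self.cycle += 1
        if not self.strands:
            return None
        t = self.strands.popleft()
        self.strands.append(t)  # round-robin among strands
        return t

    def demote(self, thread: str, latency: int) -> None:
        """Demote a strand that hit a latency event (e.g. an outstanding load)."""
        self.strands.remove(thread)
        self.pending[thread] = self.cycle + latency
```

With threads `t0`, `t1`, `t2` and two strand slots, the first `step()` issues `t0`. After `demote("t0", 3)`, the freed slot goes to `t2`, and the next two steps issue `t1` and `t2` while `t0` waits in the pending set.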

    HIERARCHICAL PROCESSOR ARRAY
    58.
    Type: Patent application
    Status: In force

    Publication No.: US20120026175A1

    Publication date: 2012-02-02

    Application No.: US13270215

    Filing date: 2011-10-10

    IPC classification: G06T1/00

    Abstract: Apparatuses and methods are presented for a hierarchical processor. The processor comprises, at a first level of hierarchy, a plurality of similarly structured first level components, wherein each of the plurality of similarly structured first level components includes at least one combined function module capable of performing multiple classes of graphics operations, each of the multiple classes of graphics operations being associated with a different stage of graphics processing. The processor comprises, at a second level of hierarchy, a plurality of similarly structured second level components positioned within each one of the plurality of similarly structured first level components, wherein each of the plurality of similarly structured second level components is capable of carrying out different operations from the multiple classes of graphics operations, wherein each first level component is adapted to distribute work to the plurality of similarly structured second level components positioned within the first level component.

    Thread-type-based resource allocation in a multithreaded processor
    59.
    Type: Granted patent
    Status: In force

    Publication No.: US08108872B1

    Publication date: 2012-01-31

    Application No.: US11552109

    Filing date: 2006-10-23

    IPC classification: G06F9/46

    Abstract: Resources to be used by concurrent threads in a multithreaded processor are allocated based on thread types of the threads. For each of at least two thread types, an amount of the resource is reserved, and amounts currently allocated are tracked. When a request to allocate some of the resource to a new thread is received, a determination as to whether the allocation can be made is based on the thread type of the new thread, the amount of the resource reserved for that thread type, and the amount currently allocated to threads of that type.
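
The allocation decision in this abstract can be sketched as follows. The abstract does not state the exact rule, so this example assumes one plausible policy: each thread type may always use its own reservation, and anything beyond that must come from an unreserved shared pool. The class `TypedAllocator`, its methods, and the thread-type names used in the usage note are hypothetical.

```python
from typing import Dict


class TypedAllocator:
    def __init__(self, total: int, reserved: Dict[str, int]) -> None:
        if sum(reserved.values()) > total:
            raise ValueError("reservations exceed the total resource")
        self.reserved = dict(reserved)            # amount reserved per thread type
        self.allocated = {t: 0 for t in reserved}  # amount currently allocated per type
        self.shared_free = total - sum(reserved.values())

    def _shared_in_use(self, thread_type: str, allocated: int) -> int:
        """Amount by which a type's allocation exceeds its reservation."""
        return max(0, allocated - self.reserved[thread_type])

    def try_allocate(self, thread_type: str, amount: int) -> bool:
        """Grant the request if the reservation plus the shared pool can cover it."""
        current = self.allocated[thread_type]
        extra = (self._shared_in_use(thread_type, current + amount)
                 - self._shared_in_use(thread_type, current))
        if extra > self.shared_free:
            return False
        self.shared_free -= extra
        self.allocated[thread_type] = current + amount
        return True

    def release(self, thread_type: str, amount: int) -> None:
        """Return resources, refilling the shared pool before the reservation."""
        current = self.allocated[thread_type]
        if amount > current:
            raise ValueError("releasing more than is allocated")
        freed = (self._shared_in_use(thread_type, current)
                 - self._shared_in_use(thread_type, current - amount))
        self.shared_free += freed
        self.allocated[thread_type] = current - amount
```

With a total of 10 units and 4 reserved for each of two types (`"vertex"` and `"pixel"`), a pixel request for 6 units succeeds by taking the 2 shared units, a further pixel request is refused, and vertex threads can still obtain their 4 reserved units.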

    Thread-type-based load balancing in a multithreaded processor
    60.
    Type: Granted patent
    Status: In force

    Publication No.: US08087029B1

    Publication date: 2011-12-27

    Application No.: US11552113

    Filing date: 2006-10-23

    IPC classification: G06F9/46 G06F11/00

    Abstract: Resources to be used by concurrent threads in a multithreaded processor are allocated based on thread types of the threads, and thread-type-based criteria governing resource allocation decisions are dynamically modified based on feedback information indicating the degree to which various thread types are using the resource. For each of at least two thread types, an amount of the resource is reserved, and amounts currently allocated are tracked. When an allocation request for a new thread is received, the allocation is made or not based on the new thread's type, the amount of the resource reserved for that type, and the amount currently allocated to threads of that type. If, based on feedback information from the allocation decision, the amount of the resource reserved for one thread type is determined to be insufficient, the reserved amounts are modified to better meet the demand.
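
The feedback step in this abstract, where reserved amounts are modified when one type's reservation proves insufficient, might look like the sketch below. The choice of denied-request counts as the feedback signal, the single-donor policy, and the fixed `step` size are all assumptions made for illustration; `rebalance` is a hypothetical name.

```python
from typing import Dict


def rebalance(reserved: Dict[str, int],
              allocated: Dict[str, int],
              denials: Dict[str, int],
              step: int = 1) -> Dict[str, int]:
    """Return new reservations, shifting up to `step` units toward the
    thread type with the most denied requests (assumed feedback signal)."""
    new_reserved = dict(reserved)

    needy = max(denials, key=denials.get)
    if denials[needy] == 0:
        return new_reserved  # no type is starved; leave reservations alone

    # Unused reservation of every other type; the largest one donates.
    slack = {t: reserved[t] - allocated[t] for t in reserved if t != needy}
    if not slack:
        return new_reserved
    donor = max(slack, key=slack.get)

    moved = min(step, max(0, slack[donor]))
    new_reserved[donor] -= moved
    new_reserved[needy] += moved
    return new_reserved
```

For instance, with reservations `{"vertex": 4, "pixel": 4}`, allocations `{"vertex": 1, "pixel": 4}`, three denied pixel requests, and `step=2`, the result is `{"vertex": 2, "pixel": 6}`. The total reserved amount is unchanged.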
