Method and apparatus for primitive processing in a graphics system
    1.
    Granted invention patent
    Status: in force

    Publication No.: US06967664B1

    Publication date: 2005-11-22

    Application No.: US09552932

    Filing date: 2000-04-20

    IPC classification: G09G5/02

    CPC classification: G06T15/30

    Abstract: A method and apparatus for processing graphics primitives that includes a trivial discard guard band. Such a trivial discard guard band is used for comparison operations with the vertices of graphics primitives to determine whether the graphics primitives can be trivially discarded such that no further processing of the primitives is performed. The trivial discard guard band may be based on the specific dimensions of primitives such as one-half of the width of the line primitives or the radial dimension of point primitives such that the rasterization area of such primitives is taken into account when trivial discard decisions are performed.

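The trivial discard test described in this abstract can be illustrated with a minimal sketch. It assumes a rectangular window given as `(x_min, y_min, x_max, y_max)` and a guard distance equal to half the line width or the point radius; the function names and the window layout are hypothetical and are not taken from the patent.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
Window = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


def line_guard(width: float) -> float:
    """Guard distance for a line primitive: one-half of its width."""
    return width / 2.0


def point_guard(radius: float) -> float:
    """Guard distance for a point primitive: its radial dimension."""
    return radius


def can_trivially_discard(vertices: Sequence[Point], window: Window, guard: float) -> bool:
    """Return True when every vertex lies beyond the same window edge
    by more than the guard distance, so no rasterized pixel can land
    inside the window."""
    x_min, y_min, x_max, y_max = window
    return (
        all(x < x_min - guard for x, _ in vertices)
        or all(x > x_max + guard for x, _ in vertices)
        or all(y < y_min - guard for _, y in vertices)
        or all(y > y_max + guard for _, y in vertices)
    )
```

With a window of `(0, 0, 100, 100)`, a line from `(-10, 5)` to `(-1, 50)` is discarded when it is 1 unit wide but kept when it is 4 units wide, because the wider line's rasterization area can reach into the window.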

Method and apparatus to efficiently interpolate polygon attributes in two dimensions at a prescribed clock rate
    2.
    Granted invention patent
    Status: lapsed

    Publication No.: US6072505A

    Publication date: 2000-06-06

    Application No.: US53589

    Filing date: 1998-04-01

    IPC classification: G06T3/40 G06T1/00 G06F15/00

    CPC classification: G06T3/403

    Abstract: A rasterizer comprised of a bounding box calculator, a plane converter, a windower, and incrementers. For each polygon to be processed, a bounding box calculation is performed which determines the display screen area, in spans, that totally encloses the polygon and passes the data to the plane converter. The plane converter also receives as input attribute values for each vertex of the polygon. The plane converter computes planar coefficients for each attribute of the polygon, for each of the edges of the polygon. The plane converter unit computes the start pixel center location at a start span and a starting coefficient value at that pixel center. The computed coefficients also include the rate of change or gradient, for each polygon attribute in the x and y directions, respectively. The plane converter also computes line coefficients for each of the edges of the polygon. Line equation values are passed through to the windower where further calculations allow the windower to determine which spans are either covered or intersected by the polygon. The incrementers receive the span coverage data from the windower in addition to receiving planar coefficient values from the plane converter. The incrementers utilize the data from both the windower and plane converter to walk or traverse the polygon in those intersected spans, pixel by pixel. As the incrementer visits each pixel, vertex attribute values are interpolated to each pixel.

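The planar coefficients and incremental walk mentioned in this abstract can be sketched as follows. This is an illustration only, assuming a triangle whose vertices are given as `(x, y, attribute)` tuples and pixel centers at half-integer coordinates; the function names are hypothetical, and the windower's coverage test is left out.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]  # (x, y, attribute value), assumed layout


def plane_coefficients(v0: Vertex, v1: Vertex, v2: Vertex) -> Tuple[float, float]:
    """Return the attribute gradients (d/dx, d/dy) of the plane through three vertices."""
    (x0, y0, a0), (x1, y1, a1), (x2, y2, a2) = v0, v1, v2
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    if det == 0:
        raise ValueError("degenerate triangle: vertices are collinear")
    grad_x = ((a1 - a0) * (y2 - y0) - (a2 - a0) * (y1 - y0)) / det
    grad_y = ((a2 - a0) * (x1 - x0) - (a1 - a0) * (x2 - x0)) / det
    return grad_x, grad_y


def interpolate_row(v0: Vertex, grad_x: float, grad_y: float,
                    x_start: int, y: int, count: int) -> List[float]:
    """Evaluate the plane at the first pixel center of a row, then step
    pixel by pixel by adding the x gradient (the incrementer's job)."""
    x0, y0, a0 = v0
    value = a0 + grad_x * (x_start + 0.5 - x0) + grad_y * (y + 0.5 - y0)
    values = []
    for _ in range(count):
        values.append(value)
        value += grad_x  # one addition per pixel instead of a full plane evaluation
    return values
```

For the triangle `(0, 0, 0)`, `(4, 0, 4)`, `(0, 4, 8)` the gradients are 1 in x and 2 in y, so the first three pixel centers of row 0 interpolate to 1.5, 2.5 and 3.5.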

    Processing unit with a plurality of shader engines
    3.
    Granted invention patent
    Status: in force

    Publication No.: US09142057B2

    Publication date: 2015-09-22

    Application No.: US12691541

    Filing date: 2010-01-21

    IPC classification: G09G5/00 G06T15/00

    CPC classification: G06T15/005

    Abstract: A processor includes a first shader engine and a second shader engine. The first shader engine is configured to process pixel shaders for a first subset of pixels to be displayed on a display device. The second shader engine is configured to process pixel shaders for a second subset of pixels to be displayed on the display device. Both the first and second shader engines are also configured to process general-compute shaders and non-pixel graphics shaders. The processor may also include a level-one (L1) data cache, coupled to and positioned between the first and second shader engines.

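The abstract does not say how the two pixel subsets are chosen. As one possible illustration, the sketch below assumes a checkerboard of 8x8 pixel tiles, with even tiles going to the first engine and odd tiles to the second; the tile size, the checkerboard rule, and the function names are all assumptions.

```python
from typing import List, Tuple

TILE = 8  # assumed tile edge in pixels; not specified in the abstract


def engine_for_pixel(x: int, y: int, tile: int = TILE) -> int:
    """Return 0 or 1: the shader engine that owns pixel (x, y)
    under a checkerboard assignment of screen tiles."""
    return ((x // tile) + (y // tile)) % 2


def split_pixels(width: int, height: int,
                 tile: int = TILE) -> Tuple[List[Tuple[int, int]], List[Tuple[int, int]]]:
    """Partition every pixel of a width x height screen into two disjoint subsets."""
    subsets: Tuple[List[Tuple[int, int]], List[Tuple[int, int]]] = ([], [])
    for y in range(height):
        for x in range(width):
            subsets[engine_for_pixel(x, y, tile)].append((x, y))
    return subsets
```

A checkerboard keeps the two subsets the same size on screens whose dimensions are multiples of twice the tile size, which is one reason such a split might be chosen for load balancing.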

    Interlocked increment memory allocation and access
    4.
    Granted invention patent
    Status: in force

    Publication No.: US09529632B2

    Publication date: 2016-12-27

    Application No.: US12553652

    Filing date: 2009-09-03

    CPC classification: G06F9/5016

    Abstract: A method of allocating a memory to a plurality of concurrent threads is presented. The method includes dynamically determining writer threads each having at least one pending write to the memory; and dynamically allocating respective contiguous blocks in the memory for each of the writer threads. Another method of allocating a memory to a plurality of concurrent threads includes launching the plurality of threads as a plurality of wavefronts, dynamically determining a group of wavefronts each having at least one thread requiring a write to the memory, and dynamically allocating respective contiguous blocks in the memory for each wavefront from the group of wavefronts. A corresponding method of assigning a memory to a plurality of reader threads includes determining a first number corresponding to a number of writer threads having a block allocated in said memory, launching a first number of reader threads, entering a first wavefront of said reader threads from said group of wavefronts to an atomic operation, and assigning a first block in the memory to the first wavefront during the corresponding atomic operation, where the first block is contiguous to a previously allocated block dynamically allocated to another wavefront from said group of wavefronts. Corresponding system embodiments and computer program product embodiments are also presented.

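The writer-side allocation in this abstract can be approximated in software with an atomic fetch-and-add on a shared offset. The sketch below is a CPU-side illustration only: a `threading.Lock` stands in for the hardware atomic, threads stand in for wavefronts, and `BlockAllocator` and `run_writers` are hypothetical names.

```python
import threading
from typing import List, Optional, Tuple


class BlockAllocator:
    """Hands out contiguous blocks by atomically advancing one shared offset."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._next = 0
        self._lock = threading.Lock()  # stands in for an interlocked increment

    def allocate(self, size: int) -> int:
        """Reserve `size` slots and return the start offset of the block."""
        with self._lock:
            start = self._next
            if start + size > self.capacity:
                raise MemoryError("allocation exceeds buffer capacity")
            self._next = start + size
        return start


def run_writers(allocator: BlockAllocator,
                sizes: List[int]) -> List[Optional[Tuple[int, int]]]:
    """Launch one thread per entry; only threads with a pending write
    (size > 0) allocate. Returns (start, size) per thread, or None."""
    blocks: List[Optional[Tuple[int, int]]] = [None] * len(sizes)

    def writer(index: int, size: int) -> None:
        blocks[index] = (allocator.allocate(size), size)

    threads = [threading.Thread(target=writer, args=(i, s))
               for i, s in enumerate(sizes) if s > 0]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return blocks
```

Whatever order the threads reach the allocator in, the blocks they receive tile the buffer from offset 0 with no gaps, which is what lets a later reader pass launch exactly one reader per allocated block.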

    Multithreaded Computing
    6.
    Invention application
    Status: under examination (published)

    Publication No.: US20130191852A1

    Publication date: 2013-07-25

    Application No.: US13606741

    Filing date: 2012-09-07

    IPC classification: G06F9/46

    CPC classification: G06F9/542 G06F9/4843

    Abstract: A system, method, and computer program product are provided for improving resource utilization of multithreaded applications. Rather than requiring threads to block while waiting for data from a channel or requiring context switching to minimize blocking, the techniques disclosed herein provide an event-driven approach to launch kernels only when needed to perform operations on channel data, and then terminate in order to free resources. These operations are handled efficiently in hardware, but are flexible enough to be implemented in all manner of programming models.

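The event-driven idea in this abstract, launching a kernel only when channel data arrives instead of parking a thread on a blocking read, can be sketched in plain Python. This is a software analogy under assumptions: `Channel`, `on_data`, `send` and `build_pipeline` are hypothetical names, and ordinary function calls stand in for hardware kernel launches.

```python
from typing import Callable, List


class Channel:
    """A channel that launches registered kernels when data arrives,
    so no consumer has to block while waiting."""

    def __init__(self) -> None:
        self._kernels: List[Callable[[int], None]] = []

    def on_data(self, kernel: Callable[[int], None]) -> None:
        """Register a kernel to be launched for each item sent."""
        self._kernels.append(kernel)

    def send(self, item: int) -> None:
        """Deliver an item: each kernel runs to completion and then
        returns, holding no resources between items."""
        for kernel in self._kernels:
            kernel(item)


def build_pipeline(results: List[int]) -> Channel:
    """Chain two channels: numbers -> squares -> results list."""
    numbers = Channel()
    squares = Channel()
    numbers.on_data(lambda n: squares.send(n * n))
    squares.on_data(results.append)
    return numbers
```

Sending 1, 2 and 3 into the returned channel appends 1, 4 and 9 to `results`; between sends nothing is running or waiting.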

    Apparatus with redundant circuitry and method therefor
    9.
    Granted invention patent
    Status: in force

    Publication No.: US08281183B2

    Publication date: 2012-10-02

    Application No.: US12509803

    Filing date: 2009-07-27

    IPC classification: G06F11/00

    Abstract: An apparatus with circuit redundancy includes a set of parallel arithmetic logic units (ALUs), a redundant parallel ALU, input data shifting logic that is coupled to the set of parallel ALUs and that is operatively coupled to the redundant parallel ALU. The input data shifting logic shifts input data for a defective ALU, in a first direction, to a neighboring ALU in the set. When the neighboring ALU is the last or end ALU in the set, the shifting logic continues to shift the input data for the end ALU that is not defective, to the redundant parallel ALU. The redundant parallel ALU then operates for the defective ALU. Output data shifting logic is coupled to an output of the parallel redundant ALU and all other ALU outputs to shift the output data in a second and opposite direction than the input shifting logic, to realign output of data for continued processing, including for storage or for further processing by other circuitry.

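The input and output shifting described in this abstract can be modeled in a few lines. The sketch assumes at most one defective ALU, models each ALU as a Python callable, and places the redundant ALU after the last regular one; `run_with_redundancy` and its parameters are hypothetical names.

```python
from typing import Callable, List, Optional

Alu = Callable[[int], int]


def run_with_redundancy(inputs: List[int], alus: List[Alu], spare: Alu,
                        defective: Optional[int]) -> List[int]:
    """Run one input per lane, bypassing a single defective ALU.

    Input shift: lanes at or after the defective index move one unit
    forward, so the last lane lands on the spare ALU.
    Output shift: results move one unit back so they realign with
    their original lanes.
    """
    if defective is None:
        return [alu(x) for alu, x in zip(alus, inputs)]

    units = alus + [spare]  # spare sits after the end ALU
    raw: List[Optional[int]] = [None] * len(units)
    for lane, x in enumerate(inputs):
        target = lane if lane < defective else lane + 1
        raw[target] = units[target](x)  # the defective unit is never used

    realigned = []
    for lane in range(len(inputs)):
        source = lane if lane < defective else lane + 1
        realigned.append(raw[source])
    return realigned
```

With four doubling ALUs of which index 1 is faulty, inputs `[1, 2, 3, 4]` still come back as `[2, 4, 6, 8]`, in lane order, because the faulty unit receives no data.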

    GRAPHICS PROCESSING LOGIC WITH VARIABLE ARITHMETIC LOGIC UNIT CONTROL AND METHOD THEREFOR
    10.
    Invention application
    Status: in force

    Publication No.: US20060053189A1

    Publication date: 2006-03-09

    Application No.: US11161674

    Filing date: 2005-08-11

    Applicant: Michael Mantor

    Inventor: Michael Mantor

    IPC classification: G06F7/38

    Abstract: Briefly, graphics data processing logic includes a plurality of parallel arithmetic logic units (ALUs), such as floating point processors or any other suitable logic, that operate as a vector processor on at least one of pixel data and vertex data (or both) and a programmable storage element that contains data representing which of the plurality of arithmetic logic units are not to receive data for processing. The graphics data processing logic also includes parallel ALU data packing logic that is operatively coupled to the plurality of arithmetic logic processing units and to the programmable storage element to pack data only for the plurality of arithmetic logic units identified by the data in the programmable storage element as being enabled.

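The packing logic in this abstract can be sketched as a function that distributes work items only to enabled ALUs. The sketch assumes the programmable storage element is a bit mask in which a set bit marks an ALU that must not receive data; the mask encoding and the name `pack_for_enabled_alus` are assumptions.

```python
from typing import List, Optional, Sequence


def pack_for_enabled_alus(items: Sequence[str], disabled_mask: int,
                          num_alus: int) -> List[List[Optional[str]]]:
    """Pack items into per-cycle rows of ALU slots, skipping disabled ALUs.

    Bit i of `disabled_mask` set means ALU i receives no data.
    Each returned row has `num_alus` entries; disabled or unused
    slots hold None.
    """
    enabled = [i for i in range(num_alus) if not (disabled_mask >> i) & 1]
    if not enabled:
        raise ValueError("no ALUs are enabled")

    cycles = []
    for start in range(0, len(items), len(enabled)):
        row: List[Optional[str]] = [None] * num_alus
        for alu, item in zip(enabled, items[start:start + len(enabled)]):
            row[alu] = item
        cycles.append(row)
    return cycles
```

With four ALUs and mask `0b0100` (ALU 2 disabled), the items `a` to `e` pack into two cycles: `['a', 'b', None, 'c']` and `['d', 'e', None, None]`.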