System, method and article of manufacture for a programmable processing model with instruction set
    2.
    Granted invention patent
    Legal status: in force

    Publication No.: US08259122B1

    Publication date: 2012-09-04

    Application No.: US11942577

    Filing date: 2007-11-19

    IPC classification: G09G5/00 G06T1/00

    CPC classification: G06T15/005 G06T15/503

    Abstract: A system, method and article of manufacture are provided for programmable processing in a computer graphics pipeline. Initially, data is received from a source buffer. Thereafter, programmable operations are performed on the data in order to generate output. The operations are programmable in that a user may utilize instructions from a predetermined instruction set for generating the same. Such output is stored in a register. During operation, the output stored in the register is used in performing the programmable operations on the data.
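
The register feedback described in this abstract can be illustrated with a minimal sketch. The three opcodes, the `run` function, the `src` operand name, and the register names are all hypothetical; the abstract does not list the actual instruction set.

```python
# Hypothetical three-opcode instruction set; not taken from the patent.
OPS = {
    "MOV": lambda a: a,
    "ADD": lambda a, b: a + b,
    "MUL": lambda a, b: a * b,
}


def run(program, source_value):
    """Execute (opcode, dest, *operands) tuples against one source value."""
    registers = {}

    def read(operand):
        if operand == "src":          # data received from the source buffer
            return source_value
        if isinstance(operand, str):  # output stored by an earlier instruction
            return registers[operand]
        return operand                # literal constant

    for opcode, dest, *operands in program:
        registers[dest] = OPS[opcode](*(read(o) for o in operands))
    return registers


# r0 is written by the first instruction and read back by the second.
program = [
    ("MUL", "r0", "src", 2.0),
    ("ADD", "r1", "r0", "src"),
]
result = run(program, 3.0)
```

Here `r1` depends on `r0`, which is the sense in which stored output feeds later programmable operations.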

    Alignment and ordering of vector elements for single instruction multiple data processing
    3.
    Granted invention patent
    Legal status: in force

    Publication No.: US07793077B2

    Publication date: 2010-09-07

    Application No.: US11702659

    Filing date: 2007-02-06

    IPC classification: G06F9/34

    Abstract: The present invention provides alignment and ordering of vector elements for SIMD processing. In the alignment of vector elements for SIMD processing, one vector is loaded from a memory unit into a first register and another vector is loaded from the memory unit into a second register. The first vector contains a first byte of an aligned vector to be generated. Then, a starting byte specifying the first byte of an aligned vector is determined. Next, a vector is extracted from the first register and the second register beginning from the first bit in the first byte of the first register continuing through the bits in the second register. Finally, the extracted vector is replicated into a third register such that the third register contains a plurality of elements aligned for SIMD processing. In the ordering of vector elements for SIMD processing, a first vector is loaded from a memory unit into a first register and a second vector is loaded from the memory unit into a second register. Then, a subset of elements is selected from the first register and the second register. The elements from the subset are then replicated into the elements in the third register in a particular order suitable for subsequent SIMD vector processing.
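
The two operations in this abstract can be sketched in software as follows. The 16-byte register width and the function names `load_aligned`, `align_extract`, and `order_select` are assumptions made for illustration; the abstract does not specify a register size or instruction names.

```python
VECTOR_BYTES = 16  # assumed register width; not stated in the abstract


def load_aligned(memory: bytes, block_index: int) -> bytes:
    """Load one register-sized block from an aligned address."""
    start = block_index * VECTOR_BYTES
    return memory[start:start + VECTOR_BYTES]


def align_extract(first: bytes, second: bytes, start_byte: int) -> bytes:
    """Extract one vector beginning at start_byte of the first register
    and continuing into the second register."""
    if not 0 <= start_byte < VECTOR_BYTES:
        raise ValueError("start_byte must lie inside the first register")
    return (first + second)[start_byte:start_byte + VECTOR_BYTES]


def order_select(first, second, selectors):
    """Copy selected elements of two registers into a third, in the given order.
    Selector i < len(first) picks from the first register, otherwise the second."""
    combined = list(first) + list(second)
    return [combined[i] for i in selectors]


# An unaligned vector starting at byte address 21 straddles blocks 1 and 2.
memory = bytes(range(48))
aligned = align_extract(load_aligned(memory, 1), load_aligned(memory, 2), 21 % VECTOR_BYTES)

# Interleave the low halves of two four-element registers.
interleaved = order_select([0, 1, 2, 3], [4, 5, 6, 7], [0, 4, 1, 5])
```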

    Mapping memory partitions to virtual memory pages
    4.
    Granted invention patent
    Legal status: in force

    Publication No.: US07620793B1

    Publication date: 2009-11-17

    Application No.: US11467679

    Filing date: 2006-08-28

    Abstract: Systems and methods for addressing memory using non-power-of-two virtual memory page sizes improve graphics memory bandwidth by distributing graphics data for efficient access during rendering. Various partition strides may be selected for each virtual memory page to modify the number of sequential addresses mapped to each physical memory partition and change the interleaving granularity. The addressing scheme allows for modification of a bank interleave pattern for each virtual memory page to reduce bank conflicts and improve memory bandwidth utilization. The addressing scheme also allows for modification of a partition interleave pattern for each virtual memory page to distribute accesses amongst multiple partitions and improve memory bandwidth utilization.
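
The effect of a per-page partition stride can be shown with a generic modulo interleave. This is a sketch under stated assumptions: the six-partition count, the stride values, the `PAGE_STRIDES` table, and `partition_for` are hypothetical, and the patent's actual mapping function is not given in the abstract.

```python
NUM_PARTITIONS = 6  # assumed partition count (deliberately not a power of two)

# Hypothetical per-page configuration: virtual page number -> partition stride in bytes.
PAGE_STRIDES = {0: 256, 1: 1024}


def partition_for(page: int, offset: int) -> int:
    """Map a byte offset within a virtual page to a physical partition.
    A larger stride sends more sequential addresses to the same partition."""
    stride = PAGE_STRIDES[page]
    return (offset // stride) % NUM_PARTITIONS


# Partitions touched by eight sequential 256-byte accesses in each page.
fine = [partition_for(0, off) for off in range(0, 2048, 256)]
coarse = [partition_for(1, off) for off in range(0, 2048, 256)]
```

With the 256-byte stride each access lands on a different partition, while the 1024-byte stride keeps four consecutive accesses on one partition, which is the change in interleaving granularity the abstract refers to.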

    VIRTUAL ARCHITECTURE AND INSTRUCTION SET FOR PARALLEL THREAD COMPUTING
    5.
    Invention patent application
    Legal status: in force

    Publication No.: US20080184211A1

    Publication date: 2008-07-31

    Application No.: US11627892

    Filing date: 2007-01-26

    IPC classification: G06F9/45

    CPC classification: G06F8/456

    Abstract: A virtual architecture and instruction set support explicit parallel-thread computing. The virtual architecture defines a virtual processor that supports concurrent execution of multiple virtual threads with multiple levels of data sharing and coordination (e.g., synchronization) between different virtual threads, as well as a virtual execution driver that controls the virtual processor. A virtual instruction set architecture for the virtual processor is used to define behavior of a virtual thread and includes instructions related to parallel thread behavior, e.g., data sharing and synchronization. Using the virtual platform, programmers can develop application programs in which virtual threads execute concurrently to process data; virtual translators and drivers adapt the application code to particular hardware on which it is to execute, transparently to the programmer.
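
The data sharing and synchronization between threads mentioned in this abstract can be imitated on a CPU with Python's standard `threading` module. This is only an analogy, not the virtual instruction set itself; `run_virtual_threads` and its doubling-then-summing workload are hypothetical.

```python
import threading


def run_virtual_threads(data):
    """Run one thread per element; each thread shares a value, waits at a
    barrier, then reads the value shared by its neighbor."""
    n = len(data)
    shared = [0] * n   # stands in for memory shared among cooperating threads
    result = [0] * n
    barrier = threading.Barrier(n)

    def body(tid):
        shared[tid] = data[tid] * 2          # phase 1: publish a value
        barrier.wait()                       # synchronize: all values are now visible
        result[tid] = shared[tid] + shared[(tid + 1) % n]  # phase 2: read a neighbor

    threads = [threading.Thread(target=body, args=(tid,)) for tid in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result


output = run_virtual_threads([1, 2, 3, 4])
```

Without the barrier, a thread could read its neighbor's slot before the neighbor had written it, which is why an instruction set for cooperating threads needs a synchronization primitive.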

    Neighbor and edge indexing
    6.
    Granted invention patent
    Legal status: in force

    Publication No.: US07324105B1

    Publication date: 2008-01-29

    Application No.: US10727679

    Filing date: 2003-12-04

    IPC classification: G06T17/00 G06T17/20

    CPC classification: G06T17/20

    Abstract: Method and apparatus for neighbor and edge indexing are described. A vertex is identified and assigned a reference. One-ring neighbor vertices of the vertex are identified. The reference is assigned to each of the one-ring neighbor vertices identified. An index to one of the one-ring neighbor vertices is assigned. The index is successively incremented to provide indices for each of the one-ring neighbor vertices remaining. Edge indexing follows as described above, with the vertex and its one-ring neighbors defining end points of edges. Additionally, offset indexing is described, and may be used for a consistent order of computation.
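
The indexing steps in this abstract can be sketched on a triangle list. Several details are assumptions: the mesh representation, the function names, using the vertex id as the reference, and visiting neighbors in ascending id order (the abstract does not say which neighbor receives the first index).

```python
def one_ring(triangles, vertex):
    """Return the vertices that share a triangle with `vertex`."""
    neighbors = set()
    for tri in triangles:
        if vertex in tri:
            neighbors.update(v for v in tri if v != vertex)
    return sorted(neighbors)  # assumed ordering: ascending vertex id


def index_neighbors(triangles, vertex):
    """Tag each one-ring neighbor with (reference, index), where the
    reference is the central vertex and the index counts up from 0."""
    return {n: (vertex, i) for i, n in enumerate(one_ring(triangles, vertex))}


def index_edges(triangles, vertex):
    """Index the edges whose end points are the vertex and a one-ring neighbor."""
    return {(vertex, n): i for i, n in enumerate(one_ring(triangles, vertex))}


# Three triangles meeting at vertex 0.
mesh = [(0, 1, 2), (0, 2, 3), (0, 3, 1)]
neighbor_index = index_neighbors(mesh, 0)
edge_index = index_edges(mesh, 0)
```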

    Primitive extension
    7.
    Granted invention patent
    Legal status: in force

    Publication No.: US07196703B1

    Publication date: 2007-03-27

    Application No.: US10728047

    Filing date: 2003-12-04

    CPC classification: G06T17/20

    Abstract: Method and apparatus for generating a primitive extension defining a generalized primitive are described. The primitive extension defines the connectivity and vertices used to specify a collection of connected primitives, such as a strip-type or fan-type generalized primitive. A generalized primitive includes a number of vertices where some of the vertices are shared with neighboring primitives. The primitive extension includes an originating primitive, vertex data, and connectivity information. The primitive extension provides a general interface for describing a variety of connected primitives.
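
A strip-type or fan-type generalized primitive can be expanded into individual triangles as sketched below. The `PrimitiveExtension` class, its field names, and the string-valued connectivity are hypothetical stand-ins for the three components the abstract names; the winding rule for strips follows the common graphics convention, not anything stated in the abstract.

```python
from dataclasses import dataclass


@dataclass
class PrimitiveExtension:
    origin: tuple          # originating primitive (one triangle)
    extra_vertices: list   # vertex data for the connected primitives
    connectivity: str      # "strip" or "fan"

    def triangles(self):
        """Expand the generalized primitive into a list of triangles."""
        verts = list(self.origin) + list(self.extra_vertices)
        out = []
        for i in range(len(verts) - 2):
            if self.connectivity == "strip":
                a, b, c = verts[i], verts[i + 1], verts[i + 2]
                if i % 2:          # flip odd triangles to keep a consistent winding
                    a, b = b, a
            elif self.connectivity == "fan":
                a, b, c = verts[0], verts[i + 1], verts[i + 2]
            else:
                raise ValueError(f"unknown connectivity: {self.connectivity}")
            out.append((a, b, c))
        return out


strip = PrimitiveExtension((0, 1, 2), [3, 4], "strip").triangles()
fan = PrimitiveExtension((0, 1, 2), [3], "fan").triangles()
```

Each added vertex yields one more triangle because it shares two vertices with its neighbor, which is the vertex sharing the abstract describes.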

    User programmable primitive engine
    8.
    Granted invention patent
    Legal status: in force

    Publication No.: US06940515B1

    Publication date: 2005-09-06

    Application No.: US10727814

    Filing date: 2003-12-03

    IPC classification: G06T1/00

    CPC classification: G06T1/20

    Abstract: A fixed function engine and method are described for processing a set of primitive commands. One embodiment of the fixed function engine includes a means for receiving one or more primitive commands, where each such primitive command includes information for processing vertex data using a user-developed program or subroutine. The fixed function engine also includes a means for determining a set of related primitive commands from the received primitive commands and a means for identifying a first primitive command to process from that set. In addition, the fixed function engine includes a means for transmitting a first program command, which is related to the first primitive command, to a processing engine for processing.

    Alignment and ordering of vector elements for single instruction multiple data processing
    10.
    Granted invention patent
    Legal status: in force

    Publication No.: US06266758B1

    Publication date: 2001-07-24

    Application No.: US09263798

    Filing date: 1999-03-05

    IPC classification: G06F15/00

    Abstract: The present invention provides alignment and ordering of vector elements for SIMD processing. In the alignment of vector elements for SIMD processing, one vector is loaded from a memory unit into a first register and another vector is loaded from the memory unit into a second register. The first vector contains a first byte of an aligned vector to be generated. Then, a starting byte specifying the first byte of an aligned vector is determined. Next, a vector is extracted from the first register and the second register beginning from the first bit in the first byte of the first register continuing through the bits in the second register. Finally, the extracted vector is replicated into a third register such that the third register contains a plurality of elements aligned for SIMD processing. In the ordering of vector elements for SIMD processing, a first vector is loaded from a memory unit into a first register and a second vector is loaded from the memory unit into a second register. Then, a subset of elements is selected from the first register and the second register. The elements from the subset are then replicated into the elements in the third register in a particular order suitable for subsequent SIMD vector processing.
