BLOCK LINEAR MEMORY ORDERING OF TEXTURE DATA
    11.
    Patent application
    Status: In force

    Publication number: US20110169850A1

    Publication date: 2011-07-14

    Application number: US13073020

    Filing date: 2011-03-28

    IPC classification: G06T11/40

    CPC classification: G06T15/04 G06T1/60

    Abstract: A method of organizing memory for storage of texture data, in accordance with one embodiment of the invention, includes accessing a size of a mipmap level of a texture map. A block dimension may be determined based on the size of the mipmap level. A memory space (e.g., computer-readable medium) may be logically divided into a plurality of whole number of blocks of variable dimension. The dimension of the blocks is measured in units of gobs and each gob is of a fixed dimension of bytes. A mipmap level of a texture map may be stored in the memory space. A texel coordinate of said mipmap level may be converted into a byte address of the memory space by determining a gob address of a gob in which the texel coordinate resides and determining a byte address within the particular gob.
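
The two-step address conversion described in this abstract (find the gob, then the byte within the gob) can be sketched as follows. This is an illustration only: the abstract says each gob has a fixed byte dimension but does not give it, so the 64-byte by 4-row gob, the row-major ordering of gobs within a block and of blocks within the surface, and the function name `texel_to_byte_address` are all assumptions rather than details from the patent.

```python
GOB_WIDTH_BYTES = 64   # assumed gob width in bytes (not stated in the abstract)
GOB_HEIGHT_ROWS = 4    # assumed gob height in rows (not stated in the abstract)
GOB_BYTES = GOB_WIDTH_BYTES * GOB_HEIGHT_ROWS


def texel_to_byte_address(x, y, bytes_per_texel, level_width_texels,
                          block_w_gobs, block_h_gobs):
    """Map texel (x, y) of one mipmap level to a byte offset.

    Hypothetical layout: blocks are stored row-major across the level,
    gobs row-major inside a block, bytes row-major inside a gob.
    """
    # Step 1: which gob holds the texel, and where inside that gob?
    byte_x = x * bytes_per_texel
    gob_x, x_in_gob = divmod(byte_x, GOB_WIDTH_BYTES)
    gob_y, y_in_gob = divmod(y, GOB_HEIGHT_ROWS)

    # The level is padded up to a whole number of blocks per row.
    row_bytes = level_width_texels * bytes_per_texel
    gobs_per_row = -(-row_bytes // GOB_WIDTH_BYTES)      # ceiling division
    blocks_per_row = -(-gobs_per_row // block_w_gobs)    # ceiling division

    # Which block holds the gob, and where inside that block?
    block_x, gob_x_in_block = divmod(gob_x, block_w_gobs)
    block_y, gob_y_in_block = divmod(gob_y, block_h_gobs)

    block_index = block_y * blocks_per_row + block_x
    gob_index = (block_index * block_w_gobs * block_h_gobs
                 + gob_y_in_block * block_w_gobs + gob_x_in_block)

    # Step 2: gob base address plus the byte offset within the gob.
    return gob_index * GOB_BYTES + y_in_gob * GOB_WIDTH_BYTES + x_in_gob
```

Under these assumptions, a smaller mipmap level would simply be given a smaller block dimension (fewer gobs per block), which reduces the padding needed to reach a whole number of blocks.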

    Fragment processor having dual mode register file
    12.
    Granted patent
    Status: In force

    Publication number: US07821520B1

    Publication date: 2010-10-26

    Application number: US11009471

    Filing date: 2004-12-10

    IPC classification: G09G5/36 G06T1/20 G06F15/16

    CPC classification: G06T1/20

    Abstract: A new, useful, and non-obvious shader processor architecture having a shader register file that acts both as an internal storage register file for temporarily storing data within the shader processor and as a First-In First-Out (FIFO) buffer for a subsequent module. Some embodiments include automatic, programmable hardware conversion between numeric formats, for example, between floating point data and fixed point data.
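
A rough software model of the dual-mode idea: the same storage serves as scratch registers for the shader and as a FIFO that a downstream module drains. The class name `DualModeRegisterFile`, the `release`/`pop` methods, and the 8-bit fractional fixed-point format are hypothetical choices made for this sketch; the abstract does not describe the hardware at this level.

```python
from collections import deque


class DualModeRegisterFile:
    """Toy model: one storage array used as registers and as a FIFO."""

    def __init__(self, size, frac_bits=8):
        self.regs = [0.0] * size
        self.ready = deque()        # register indices queued for downstream
        self.frac_bits = frac_bits  # assumed fixed-point fractional bits

    def write(self, index, value):
        """Register mode: random-access write by the shader."""
        self.regs[index] = value

    def read(self, index):
        """Register mode: random-access read by the shader."""
        return self.regs[index]

    def release(self, index):
        """FIFO mode: mark a register as output, in first-in first-out order."""
        self.ready.append(index)

    def pop(self, to_fixed=False):
        """FIFO mode: downstream module takes the oldest released value,
        optionally converted from floating point to fixed point."""
        value = self.regs[self.ready.popleft()]
        if to_fixed:
            return round(value * (1 << self.frac_bits))
        return value
```

Because the FIFO only queues register indices, no data is copied between a register file and a separate output buffer, which is the point of sharing the storage.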

    Block linear memory ordering of texture data
    13.
    Granted patent
    Status: In force

    Publication number: US07916149B1

    Publication date: 2011-03-29

    Application number: US11029940

    Filing date: 2005-01-04

    IPC classification: G06T11/40

    CPC classification: G06T15/04 G06T1/60

    Abstract: A method of organizing memory for storage of texture data, in accordance with one embodiment of the invention, includes accessing a size of a mipmap level of a texture map. A block dimension may be determined based on the size of the mipmap level. A memory space (e.g., computer-readable medium) may be logically divided into a plurality of whole number of blocks of variable dimension. The dimension of the blocks is measured in units of gobs and each gob is of a fixed dimension of bytes. A mipmap level of a texture map may be stored in the memory space. A texel coordinate of said mipmap level may be converted into a byte address of the memory space by determining a gob address of a gob in which the texel coordinate resides and determining a byte address within the particular gob.

    Method and system for distributing work batches to processing units based on a number of enabled streaming multiprocessors
    14.
    Granted patent
    Status: In force

    Publication number: US09594599B1

    Publication date: 2017-03-14

    Application number: US12579143

    Filing date: 2009-10-14

    Abstract: A work distribution unit distributes work batches to general processing clusters (GPCs) based on the number of streaming multiprocessors included in each GPC. Advantageously, each GPC receives an amount of work that is proportional to the amount of processing power afforded by the GPC. Embodiments include a method for distributing batches of processing tasks to two or more general processing clusters (GPCs), including the steps of updating a counter value for each of the two or more GPCs based on the number of enabled parallel processing units within each of the two or more GPCs, and distributing a batch of processing tasks to a first GPC of the two or more GPCs based on a counter value associated with the first GPC and based on a load signal received from the first GPC.
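
One way to picture the counter-based distribution is a credit scheme: each GPC's counter is topped up by its number of enabled SMs, and every batch handed to a GPC costs one credit. The sketch below is an assumed policy of that kind, not the patented logic; `distribute_batches`, the refill rule, and the `is_ready` callback standing in for the load signal are all hypothetical.

```python
def distribute_batches(num_batches, enabled_sms, is_ready=lambda gpc: True):
    """Assign batches to GPCs in proportion to their enabled SM counts.

    enabled_sms[g] is the number of enabled SMs in GPC g.
    is_ready(g) models the load signal: True if GPC g can accept work.
    Returns the list of GPC indices, one per batch.
    """
    counters = [0] * len(enabled_sms)
    assignment = []
    for _ in range(num_batches):
        ready = [g for g, sms in enumerate(enabled_sms)
                 if sms > 0 and is_ready(g)]
        if not ready:
            raise RuntimeError("no GPC can accept work")
        # When every ready GPC is out of credit, update all counters
        # based on the number of enabled SMs in each GPC.
        if all(counters[g] <= 0 for g in ready):
            for g, sms in enumerate(enabled_sms):
                counters[g] += sms
        # Send the batch to the ready GPC with the most remaining credit.
        target = max(ready, key=lambda g: counters[g])
        counters[target] -= 1
        assignment.append(target)
    return assignment
```

With two GPCs that have 4 and 2 enabled SMs, twelve batches split 8 to 4, matching the abstract's goal that work be proportional to each GPC's processing power.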

    Compute work distribution reference counters
    16.
    Granted patent
    Status: In force

    Publication number: US09507638B2

    Publication date: 2016-11-29

    Application number: US13291369

    Filing date: 2011-11-08

    IPC classification: G06F9/455 G06F9/50

    CPC classification: G06F9/5022

    Abstract: One embodiment of the present invention sets forth a technique for managing the allocation and release of resources during multi-threaded program execution. Programmable reference counters are initialized to values that limit the amount of resources for allocation to tasks that share the same reference counter. Resource parameters are specified for each task to define the amount of resources allocated for consumption by each array of execution threads that is launched to execute the task. The resource parameters also specify the behavior of the array for acquiring and releasing resources. Finally, during execution of each thread in the array, an exit instruction may be configured to override the release of the resources that were allocated to the array. The resources may then be retained for use by a child task that is generated during execution of a thread.
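
The acquire, release, and retain-on-exit behavior can be modeled in a few lines. This is a simplified sketch: the names `ReferenceCounter`, `launch_array`, and `exit_array`, and the single integer "amount" per thread array, are assumptions for illustration and do not come from the patent.

```python
class ReferenceCounter:
    """Limits the resources available to all tasks sharing this counter."""

    def __init__(self, limit):
        self.available = limit

    def try_acquire(self, amount):
        if amount > self.available:
            return False
        self.available -= amount
        return True

    def release(self, amount):
        self.available += amount


def launch_array(counter, amount):
    """Launch a thread array only if its resources can be allocated."""
    return counter.try_acquire(amount)


def exit_array(counter, amount, retain_for_child=False):
    """On exit, release the array's resources unless the exit instruction
    overrides the release so that a child task can keep using them."""
    if not retain_for_child:
        counter.release(amount)
```

In this model a parent array that exits with `retain_for_child=True` leaves its allocation charged against the counter, and the child task is then responsible for the eventual `release`.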

    Architecture for compact multi-ported register file
    17.
    Granted patent
    Status: In force

    Publication number: US07490208B1

    Publication date: 2009-02-10

    Application number: US10959560

    Filing date: 2004-10-05

    IPC classification: G06F13/372 G06F12/00

    CPC classification: G06F13/372

    Abstract: Architecture for compact multi-ported register file is disclosed. In an embodiment, a register file comprises a single-port random access memory (RAM). The single-port RAM comprises a single port for read operations and for write operations. Either a single read or a single write operation is performed for a given clock via the single port. Moreover, the single-port RAM serially performs N read operations and M write operations associated with a data group using a clock phase of (N+M) clock phases generated from a clock. In another embodiment, a semiconductor device includes the architecture for compact multi-ported register file. The semiconductor device comprises a plurality of register files. Each register file comprises a RAM comprising a port for read operations and for write operations. Moreover, each RAM serially performs N read operations and M write operations associated with one of a plurality of data groups using a corresponding clock phase of (N+M) clock phases generated from a clock. Further, the semiconductor device comprises an input staging unit for staging write data of one or more of the write operations. Continuing, the semiconductor device comprises an output staging unit for staging read data of one or more of the read operations. The semiconductor device can be a graphics processing unit (GPU).
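
The time-multiplexing idea (N reads and M writes serialized over N+M phases of one clock) can be simulated in software. The sketch below is a behavioral toy, not the circuit: `SinglePortRAM`, `clock_cycle`, and the choice to perform all reads before all writes within a cycle are assumptions made for this example.

```python
class SinglePortRAM:
    """RAM with one port: each call models one access in one clock phase."""

    def __init__(self, size):
        self.cells = [0] * size

    def read(self, addr):
        return self.cells[addr]

    def write(self, addr, value):
        self.cells[addr] = value


def clock_cycle(ram, read_addrs, writes):
    """Emulate N read ports and M write ports within one clock cycle.

    read_addrs: list of N addresses; writes: list of M (addr, value) pairs.
    Returns (read_data, phases_used), where phases_used == N + M.
    """
    input_stage = list(writes)   # write data staged until its phase arrives
    output_stage = []            # read data staged until the cycle ends
    phases_used = 0
    for addr in read_addrs:      # phases 0 .. N-1: one read per phase
        output_stage.append(ram.read(addr))
        phases_used += 1
    for addr, value in input_stage:   # phases N .. N+M-1: one write per phase
        ram.write(addr, value)
        phases_used += 1
    return output_stage, phases_used
```

From the outside, the caller sees all N results and all M updates once per cycle, as if the memory had N+M ports, while the RAM itself only ever services one access at a time.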

    Pixel center position displacement
    19.
    Granted patent
    Status: In force

    Publication number: US07425966B2

    Publication date: 2008-09-16

    Application number: US10960857

    Filing date: 2004-10-07

    CPC classification: G06T3/40

    Abstract: A pixel center position that is not covered by a primitive covering a portion of the pixel is displaced to lie within a fragment formed by the intersection of the primitive and the pixel. X,y coordinates of a pixel center are adjusted to displace the pixel center position to lie within the fragment, affecting actual texture map coordinates or barycentric weights. Alternatively, a centroid sub-pixel sample position is determined based on coverage data for the pixel and a multisample mode. The centroid sub-pixel sample position is used to compute pixel or sub-pixel parameters for the fragment.
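
The centroid alternative mentioned in this abstract can be illustrated by averaging the covered sample positions of a multisample pattern. Everything specific here is assumed: the 4-sample positions in `SAMPLE_POSITIONS`, the bit-per-sample coverage mask, and the plain averaging rule are illustrative choices, and the patent may well compute the position differently (for example, with a lookup table).

```python
# Hypothetical sub-pixel sample positions per multisample mode,
# in pixel-relative coordinates where (0.5, 0.5) is the pixel center.
SAMPLE_POSITIONS = {
    4: [(0.375, 0.125), (0.875, 0.375), (0.125, 0.625), (0.625, 0.875)],
}


def centroid_position(coverage_mask, mode=4):
    """Return a sample position inside the fragment, or None if uncovered.

    coverage_mask: bit i is set when sample i is covered by the primitive.
    """
    samples = SAMPLE_POSITIONS[mode]
    covered = [p for i, p in enumerate(samples) if (coverage_mask >> i) & 1]
    if not covered:
        return None
    if len(covered) == len(samples):
        return (0.5, 0.5)   # fully covered: keep the pixel center
    cx = sum(p[0] for p in covered) / len(covered)
    cy = sum(p[1] for p in covered) / len(covered)
    return (cx, cy)
```

Evaluating texture coordinates or barycentric weights at this displaced position, rather than at an uncovered pixel center, avoids extrapolating attributes outside the primitive.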

Upstream situated apparatus and method within a computer system for controlling data flow to a downstream situated input/output unit
    20.
    Granted patent
    Status: Expired

    Publication number: US6154794A

    Publication date: 2000-11-28

    Application number: US716951

    Filing date: 1996-09-08

    IPC classification: G06F3/14 G06F13/14 G06F13/20

    CPC classification: G06F3/14

    Abstract: A method and apparatus for controlling the flow of information (e.g., graphics primitives, display data, etc.) to an input/output unit within a computer controlled graphics system. The system includes a processor having a first-in-first-out (FIFO) buffer, a separate input/output unit with its FIFO buffer, and a number of intermediate devices (with FIFO buffers) coupled between the input/output unit and the processor for moving input/output data from the processor to the input/output unit. Mechanisms are placed within an intermediate device, very close to the processor, which maintain an accounting of the number of input/output data sent to the input/output unit, but not yet cleared from the input/output unit's buffer. These mechanisms regulate data flow to the input/output unit. By placing these mechanisms close to the processor, rather than within the input/output unit, the system allows a larger portion of the input/output unit's buffer to be utilized for storing input/output data before a processor suspend or interrupt is required. This leads to increased input/output data throughput between the processor and the input/output unit by reducing processor interrupts. The system also includes an efficiently invoked timer mechanism for temporarily suspending the processor from transmitting stores to the input/output unit when the input/output unit and/or the intermediate devices are congested. The processor is not interrupted by an interrupt request until after the timer mechanism times out, allowing the system an opportunity to clear its congestion before a lengthily invoked interrupt is required.
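
The accounting and timer mechanisms in this abstract resemble credit-based flow control with a bounded stall. The following sketch assumes that reading: `UpstreamFlowController`, its method names, and the tick-based timer are hypothetical, and the model ignores the intermediate FIFO buffers the abstract also mentions.

```python
class UpstreamFlowController:
    """Sits near the processor and tracks data not yet cleared downstream."""

    def __init__(self, buffer_entries, timeout_ticks):
        self.capacity = buffer_entries   # size of the I/O unit's buffer
        self.timeout_ticks = timeout_ticks
        self.outstanding = 0             # sent but not yet cleared

    def on_cleared(self, count=1):
        """The I/O unit reports that entries left its buffer."""
        self.outstanding -= count

    def send(self, tick=lambda: None):
        """Try to send one datum.

        While the downstream buffer is full, the processor is suspended
        one tick at a time; tick() models time passing. An interrupt is
        requested only if the timer expires first.
        """
        waited = 0
        while self.outstanding >= self.capacity:
            if waited >= self.timeout_ticks:
                return "interrupt"
            tick()
            waited += 1
        self.outstanding += 1
        return "sent"
```

Keeping the count upstream lets the sender use the full downstream buffer, and the short suspend gives congestion a chance to clear before the more expensive interrupt path is taken.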
