Over-evaluating samples during rasterization for improved datapath utilization
    11.
    Granted invention patent (in force)

    Publication No.: US06924820B2

    Publication date: 2005-08-02

    Application No.: US09962995

    Filing date: 2001-09-25

    IPC classification: G06T11/40 G09G5/00

    CPC classification: G06T11/40

    Abstract: A system and method for rasterizing and rendering graphics data is disclosed. Vertices may be grouped to form primitives such as triangles, which are rasterized using two-dimensional arrays of sample bins. To overcome fragmentation problems, the system's sample evaluation hardware may be configured to over-evaluate samples each clock cycle. Since a number of the samples will typically not survive evaluation because they will be outside the primitive being rendered, the remaining surviving samples may be combined into sets, with one set being forwarded to subsequent pipeline stages each clock cycle in order to keep pipeline utilization high.
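The packing step described in the abstract can be sketched roughly as follows. This is only an illustration, not the patented hardware: the function name `pack_survivors`, the `inside` predicate, and the set size of 4 are assumptions made for the example.

```python
from collections import deque

def pack_survivors(cycles, inside, set_size=4):
    """Simulate over-evaluation (hypothetical model).

    cycles   -- one list of candidate samples per clock cycle; each list is
                assumed to hold more samples than set_size
    inside   -- predicate: True if a sample lies inside the primitive
    set_size -- number of samples the downstream stages accept per cycle
    """
    pending = deque()   # surviving samples not yet forwarded
    forwarded = []      # one entry per cycle: a full set, or None for a bubble
    for samples in cycles:
        pending.extend(s for s in samples if inside(s))
        if len(pending) >= set_size:
            forwarded.append([pending.popleft() for _ in range(set_size)])
        else:
            forwarded.append(None)
    return forwarded, list(pending)
```

Evaluating eight samples per cycle while forwarding four means that a full set is still produced in any cycle where at least half of the samples survive.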

    System and method for prefetching data from a frame buffer
    12.
    Granted invention patent (in force)

    Publication No.: US06812929B2

    Publication date: 2004-11-02

    Application No.: US10094957

    Filing date: 2002-03-11

    IPC classification: G06F13/18

    Abstract: A graphics system may include a frame buffer that includes several sets of one or more memory banks and a cache. The frame buffer may load data from one of the memory banks into the cache in response to receiving a cache fill request. Each set of memory banks is accessible independently of each other set of memory banks. A frame buffer interface coupled to the frame buffer includes a plurality of cache fill request queues. Each cache fill request queue is configured to store one or more cache fill requests targeting a corresponding one of the sets of memory banks. The frame buffer interface is configured to select a cache fill request from one of the cache fill request queues that stores cache fill requests targeting a set of memory banks that is not currently being accessed and to provide the selected cache fill request to the frame buffer.
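A minimal sketch of the queue selection described in the abstract, under stated assumptions: the class and method names are hypothetical, only one bank set is modelled as busy at a time, and the "first eligible queue" policy is chosen for brevity (the abstract does not specify a policy).

```python
from collections import deque

class FrameBufferInterface:
    """Hypothetical model: one cache fill request queue per set of memory banks."""

    def __init__(self, num_bank_sets):
        self.queues = [deque() for _ in range(num_bank_sets)]

    def enqueue(self, bank_set, request):
        # Store the request in the queue for the bank set it targets.
        self.queues[bank_set].append(request)

    def select(self, busy_bank_set):
        """Return (bank_set, request) from a queue whose bank set is not
        currently being accessed, or None if no such request is waiting."""
        for bank_set, queue in enumerate(self.queues):
            if bank_set != busy_bank_set and queue:
                return bank_set, queue.popleft()
        return None
```

Skipping the busy bank set lets a fill for an idle set proceed while the other access is still in progress.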

    Parallel read with source-clear operation
    13.
    Granted invention patent (in force)

    Publication No.: US06795078B2

    Publication date: 2004-09-21

    Application No.: US10066397

    Filing date: 2002-01-31

    IPC classification: G06G1318

    Abstract: A memory interface controls read and write accesses to a memory device. The memory device includes a level-one cache, level-two cache and storage cell array. The memory interface includes a data request processor (DRP), a memory control processor (MCP) and a block cleansing unit (BCU). The MCP controls transfers between the storage cell array, the level-two cache and the level-one cache. In response to a read request with associated read clear indication, the DRP controls a read from a level-one cache block, updates bits in a corresponding dirty tag, and sets a mode indicator of the dirty tag to the read clear mode. The modified dirty tag bits and mode indicator are signals to the BCU that the level-one cache block requires a source clear operation. The BCU commands the transfer of data from a color fill block in the level-one cache to the level-two cache.

    Stalling pipelines in large designs
    14.
    Granted invention patent (in force)

    Publication No.: US06885375B2

    Publication date: 2005-04-26

    Application No.: US10095308

    Filing date: 2002-03-11

    IPC classification: G06T1/20

    CPC classification: G06T1/20

    Abstract: A method and a system for stalling large pipelined designs. A computational pipeline may comprise a first module and a second module coupled together. The first module may propagate one or more signals to the second module. A stall-signal may be asserted in order to stall the computational pipeline if the second module is not ready to receive the one or more signals from the first module. The one or more signals propagated from the first module and the asserted stall-signal may be buffered in a stall-buffer. The asserted stall-signal may be propagated to the first module in a next cycle. The first module may be stalled in response to the first module receiving the propagated asserted stall-signal. Next, the asserted stall-signal may be propagated up the computational pipeline.
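The role of the stall-buffer can be illustrated with a small cycle-level simulation. This is a sketch under stated assumptions, not the patented design: the two-module setup, the one-entry buffer, and the name `run_pipeline` are chosen for the example.

```python
def run_pipeline(values, ready):
    """Hypothetical two-module pipeline with a one-entry stall buffer.

    values -- items the first module wants to send, one per cycle
    ready  -- ready[c] is True if the second module can accept input in cycle c

    The stall signal reaches the first module one cycle late, so the item that
    is already in flight when the stall is asserted is held in the stall
    buffer instead of being lost.
    """
    source = list(values)
    stall_buffer = None    # item captured while the second module was not ready
    stall_at_m1 = False    # delayed copy of the stall signal seen by module 1
    received = []
    for m2_ready in ready:
        in_flight = None
        if not stall_at_m1 and source:
            in_flight = source.pop(0)          # module 1 sends this cycle
        if m2_ready:
            # Drain the buffered item first; otherwise take the new one.
            item = stall_buffer if stall_buffer is not None else in_flight
            stall_buffer = None
            if item is not None:
                received.append(item)
        elif in_flight is not None:
            stall_buffer = in_flight           # capture the in-flight item
        stall_at_m1 = not m2_ready             # stall arrives at module 1 next cycle
    return received
```

Because the first module stops sending one cycle after the stall is asserted, the buffer never has to hold more than one item in this model.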

    Reading a selected register in a series of computational units forming a processing pipeline upon expiration of a time delay
    15.
    Granted invention patent (in force)

    Publication No.: US06842851B2

    Publication date: 2005-01-11

    Application No.: US10085642

    Filing date: 2002-02-28

    Abstract: A system and method for reading register contents from a computational pipeline having a plurality of computational units. The system includes a readback bus and a read control unit. The readback bus has a plurality of logic units coupled in series. Each logic unit couples to a corresponding one of the computational units. The read control unit couples to each of the computational units through a corresponding load line, and is configured to assert a load signal on one of the load lines in response to a register read request. Each of the computational units is configured to transmit a data value from a selected register onto the readback bus in response to detecting an assertion of the load signal on its corresponding load line.

    System and method for controlling a number of outstanding data transactions within an integrated circuit
    16.
    Granted invention patent (in force)

    Publication No.: US06731292B2

    Publication date: 2004-05-04

    Application No.: US10092016

    Filing date: 2002-03-06

    IPC classification: G06F15/76

    CPC classification: G09G5/395 G06F13/405 G06T1/20

    Abstract: An integrated circuit may include several components, one or more interfaces, an interconnect (e.g., a bus), and a controller. The components may each be configured to assert a read request to read data stored externally to the integrated circuit. The interfaces may be configured to output the read request asserted by one of the components and to receive data in response to outputting the request. The interconnect may be coupled to perform one or more data transactions to transmit the data from one of the interfaces to one or more of the components. In response to the read request asserted by one of the components, the controller may inhibit performance of a read transaction initiated by the read request dependent upon a comparison of a total number of outstanding data transactions to a maximum allowable number of outstanding data transactions.
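The comparison the controller performs can be sketched as a simple counter. The class and method names below are hypothetical, and the model assumes each read produces exactly one outstanding data transaction.

```python
class TransactionController:
    """Hypothetical model of a controller that caps outstanding data transactions."""

    def __init__(self, max_outstanding):
        self.max_outstanding = max_outstanding
        self.outstanding = 0

    def request_read(self):
        """Return True if the read may be issued, False if it is inhibited."""
        if self.outstanding >= self.max_outstanding:
            return False            # limit reached: inhibit the read
        self.outstanding += 1
        return True

    def complete(self):
        """Record that one outstanding data transaction has finished."""
        if self.outstanding == 0:
            raise RuntimeError("no outstanding transaction to complete")
        self.outstanding -= 1
```

An inhibited request is not lost in this model; the caller is expected to retry once an earlier transaction completes.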

    Early primitive assembly and screen-space culling for multiple chip graphics system
    17.
    Granted invention patent (in force)

    Publication No.: US06943797B2

    Publication date: 2005-09-13

    Application No.: US10611271

    Filing date: 2003-06-30

    IPC classification: G06T1/20 G06T1/60 G06T15/00

    Abstract: A multi-chip system and method are disclosed for incorporating a primitive assembler in each of one or more geometry chips and one or more rasterization chips. This system may allow per-primitive operations to be performed in the geometry chips, and also allow use of a vertex data interface for sending vertex data to the rasterization chips. The primitive assemblers in the geometry chips may assemble vertices into primitives for clipping tests. The geometry chips may also test an assembled primitive against the projected boundaries of a set of screen space regions, where each region is assigned to one of the rasterization chips. Those primitives residing in more than one region may be sub-divided into two or more new primitives so that each new primitive resides in only one screen space region. The geometry chip may then send the vertex data for each primitive to the corresponding rasterization chip.
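The region test in the abstract can be approximated with a bounding-box check. The sketch below assumes, purely for illustration, that the screen is divided into equal-width vertical strips, one per rasterization chip; the function name and parameters are hypothetical, and the sub-division of primitives is not modelled.

```python
def regions_touched(vertices, region_width, num_regions):
    """Return the indices of the vertical screen strips that a primitive's
    bounding box overlaps (a conservative test).

    vertices     -- list of (x, y) screen-space vertex positions
    region_width -- width of each strip in pixels (assumed layout)
    num_regions  -- number of strips, one per rasterization chip (assumed)
    """
    xs = [x for x, _ in vertices]
    first = max(0, int(min(xs) // region_width))
    last = min(num_regions - 1, int(max(xs) // region_width))
    return list(range(first, last + 1))
```

A primitive for which this returns more than one index is a candidate for the sub-division step that the abstract describes.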

    Using observability logic for real-time debugging of ASICs
    18.
    Granted invention patent (in force)

    Publication No.: US06781406B2

    Publication date: 2004-08-24

    Application No.: US10090481

    Filing date: 2002-03-04

    IPC classification: H03K19/173

    Abstract: An integrated circuit including logic for testing internal operation of the integrated circuit. The integrated circuit may comprise a plurality of internal functional blocks coupled by a plurality of internal buses. The integrated circuit may also comprise a set of test control input pins and a set of test output pins comprised on the integrated circuit. The integrated circuit may comprise selection logic. The selection logic comprises inputs coupled to various ones of the internal buses, an output coupled to the set of test output pins, and a select input coupled to receive select signals from the set of test control input pins. The selection logic is operable to select internal bus signals from an internal bus based on the select signals from the test control input pins, and the selection logic is configured to output the selected internal bus signals to the set of test output pins. The integrated circuit thus allows multiplexing of different critical internal buses so that the signals on the critical buses may be output for observation via selected test pins on the integrated circuit. The observability logic may be configured to switch slowly relative to the internal buses, and the generation of the observability logic and testing may be automated.
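The selection logic amounts to a multiplexer driven by the test control pins. A minimal sketch, assuming the select signals form a binary index with the most significant bit first; the function name and encoding are illustrative only.

```python
def observe(internal_buses, select_pins):
    """Route one internal bus value to the test output pins (hypothetical model).

    internal_buses -- current values on the observable internal buses
    select_pins    -- tuple of 0/1 values on the test control input pins,
                      most significant bit first (assumed encoding)
    """
    index = 0
    for bit in select_pins:
        index = (index << 1) | bit     # decode the select signals
    return internal_buses[index]
```

With two select pins, four internal buses can share one set of test output pins.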

    Program sequencer for generating indeterminant length shader programs for a graphics processor
    19.
    Granted invention patent (in force)

    Publication No.: US08659601B1

    Publication date: 2014-02-25

    Application No.: US11893404

    Filing date: 2007-08-15

    Abstract: A method for loading and executing an indeterminate length shader program. The method includes accessing a first portion of a shader program in graphics memory of a GPU and loading instructions from the first portion into a plurality of stages of the GPU to configure the GPU for program execution. A group of pixels is then processed in accordance with the instructions from the first portion. A second portion of the shader program is accessed in graphics memory of the GPU and instructions from the second portion are loaded into the plurality of stages of the GPU to configure the GPU for program execution. The group of pixels is then processed in accordance with the instructions from the second portion.
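The portion-by-portion execution in the abstract can be sketched as a loop over fixed-size slices of the program. The names below are hypothetical, and modelling each instruction as a Python callable applied to every pixel is an assumption made for the example.

```python
def run_shader(program, pixels, num_stages):
    """Execute a shader program of arbitrary length on a group of pixels.

    program    -- list of instructions, each modelled as a callable on one pixel
    pixels     -- the group of pixel values to process
    num_stages -- how many instructions the pipeline stages hold at once (assumed)
    """
    for start in range(0, len(program), num_stages):
        portion = program[start:start + num_stages]   # load the next portion
        for instruction in portion:                   # one instruction per stage
            pixels = [instruction(p) for p in pixels]
    return pixels
```

The same pixel group passes through the stages once per portion, so the program length is not bounded by the number of stages.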