DMA-based acceleration of command push buffer between host and target devices
    1.
    Granted patent
    Status: Lapsed

    Publication No.: US08719455B2

    Publication date: 2014-05-06

    Application No.: US12824674

    Filing date: 2010-06-28

    IPC classification: G06F3/00 G06F13/28

    CPC classification: G06F13/28

    Abstract: Direct Memory Access (DMA) is used in connection with passing commands between a host device and a target device coupled via a push buffer. Commands passed to a push buffer by a host device may be accumulated by the host device prior to forwarding the commands to the push buffer, such that DMA may be used to collectively pass a block of commands to the push buffer. In addition, a host device may utilize DMA to pass command parameters for commands to a command buffer that is accessible by the target device but is separate from the push buffer, with the commands that are passed to the push buffer including pointers to the associated command parameters in the command buffer.
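
    The batching scheme in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the 8-byte command layout, the class names, and `dma_copy` (a plain block copy standing in for a real DMA transfer) are all assumptions made for illustration.

```python
import struct

# Hypothetical layout: each command is an 8-byte record (opcode, parameter offset).
CMD = struct.Struct("<II")


class Target:
    """Stands in for target-visible memory: a push buffer and a separate command buffer."""

    def __init__(self, size=256):
        self.push_buffer = bytearray(size)
        self.command_buffer = bytearray(size)


def dma_copy(dst, dst_off, src):
    """Models one DMA transfer as a single block copy (an assumption, not real DMA)."""
    dst[dst_off:dst_off + len(src)] = src
    return 1  # number of transfers issued


class Host:
    def __init__(self, target):
        self.target = target
        self.pending_cmds = bytearray()    # commands accumulated on the host
        self.pending_params = bytearray()  # parameters accumulated on the host
        self.transfers = 0

    def queue(self, opcode, params):
        # The command carries a pointer (here, a byte offset) to its parameters.
        offset = len(self.pending_params)
        self.pending_params += params
        self.pending_cmds += CMD.pack(opcode, offset)

    def flush(self):
        # One transfer for all parameters, one for the whole block of commands.
        self.transfers += dma_copy(self.target.command_buffer, 0, self.pending_params)
        self.transfers += dma_copy(self.target.push_buffer, 0, self.pending_cmds)
        count = len(self.pending_cmds) // CMD.size
        self.pending_cmds.clear()
        self.pending_params.clear()
        return count


def read_command(target, index, param_len):
    """Target side: read a command, then follow its offset into the command buffer."""
    opcode, offset = CMD.unpack_from(target.push_buffer, index * CMD.size)
    return opcode, bytes(target.command_buffer[offset:offset + param_len])
```

    In this sketch, queuing any number of commands and then calling `flush()` costs two block transfers in total, rather than one write per command.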


    Parallelized streaming accelerated data structure generation
    2.
    Granted patent
    Status: Lapsed

    Publication No.: US08692825B2

    Publication date: 2014-04-08

    Application No.: US12822427

    Filing date: 2010-06-24

    IPC classification: G09G5/00

    Abstract: A method includes receiving at a master processing element primitive data that includes properties of a primitive. The method includes partially traversing a spatial data structure that represents a three-dimensional image to identify an internal node of the spatial data structure. The internal node represents a portion of the three-dimensional image. The method also includes selecting a slave processing element from a plurality of slave processing elements. The selected processing element is associated with the internal node. The method further includes sending the primitive data to the selected slave processing element to traverse a portion of the spatial data structure to identify a leaf node of the spatial data structure.
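
    The master/slave split described in this abstract can be sketched as follows. This is an illustrative model only: it uses a one-dimensional binary tree in place of a real spatial data structure, runs sequentially rather than on parallel processing elements, and every name (`Node`, `Master`, `Slave`, `split_depth`) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    lo: float
    hi: float
    left: "Node" = None
    right: "Node" = None
    primitives: list = field(default_factory=list)

    @property
    def is_leaf(self):
        return self.left is None


def build(lo, hi, depth):
    """Build a complete binary tree that halves the interval [lo, hi) at each level."""
    node = Node(lo, hi)
    if depth > 0:
        mid = (lo + hi) / 2
        node.left = build(lo, mid, depth - 1)
        node.right = build(mid, hi, depth - 1)
    return node


def descend(node, x, levels=None):
    """Walk toward the node containing x; stop after `levels` steps or at a leaf."""
    while not node.is_leaf and (levels is None or levels > 0):
        mid = (node.lo + node.hi) / 2
        node = node.left if x < mid else node.right
        if levels is not None:
            levels -= 1
    return node


class Slave:
    """Owns the subtree under one internal node and finishes the traversal."""

    def __init__(self, subtree):
        self.subtree = subtree

    def insert(self, primitive):
        leaf = descend(self.subtree, primitive["x"])
        leaf.primitives.append(primitive["name"])
        return leaf


class Master:
    """Traverses only the top `split_depth` levels, then hands off to a slave."""

    def __init__(self, root, split_depth):
        self.root = root
        self.split_depth = split_depth
        self.slaves = {}  # one slave per internal node reached by the master

    def submit(self, primitive):
        internal = descend(self.root, primitive["x"], self.split_depth)
        slave = self.slaves.setdefault(id(internal), Slave(internal))
        return slave.insert(primitive)
```

    Because each slave in this sketch touches only its own subtree, primitives routed to different internal nodes could in principle be processed concurrently.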


    VECTOR REGISTER FILE CACHING OF CONTEXT DATA STRUCTURE FOR MAINTAINING STATE DATA IN A MULTITHREADED IMAGE PROCESSING PIPELINE
    4.
    Patent application
    Status: In force

    Publication No.: US20130044117A1

    Publication date: 2013-02-21

    Application No.: US13212418

    Filing date: 2011-08-18

    IPC classification: G06T1/20 G06F9/02 G06F15/76

    Abstract: Frequently accessed state data used in a multithreaded graphics processing architecture is cached within a vector register file of a processing unit to optimize accesses to the state data and minimize memory bus utilization associated therewith. A processing unit may include a fixed point execution unit as well as a vector floating point execution unit, and a vector register file utilized by the vector floating point execution unit may be used to cache state data used by the fixed point execution unit and transferred as needed into the general purpose registers accessible by the fixed point execution unit, thereby reducing the need to repeatedly retrieve and write back the state data from and to an L1 or lower level cache accessed by the fixed point execution unit.


    REUSE OF STATIC IMAGE DATA FROM PRIOR IMAGE FRAMES TO REDUCE RASTERIZATION REQUIREMENTS
    5.
    Patent application
    Status: Lapsed

    Publication No.: US20120176364A1

    Publication date: 2012-07-12

    Application No.: US12985607

    Filing date: 2011-01-06

    IPC classification: G06T15/00

    Abstract: An apparatus, program product and method reuse static image data generated during rasterization of static geometry to reduce the processing overhead associated with rasterizing subsequent image frames. In particular, static image data generated in one frame may be reused in a subsequent image frame such that the subsequent image frame is generated without having to re-rasterize the static geometry from the scene, i.e., with only the dynamic geometry rasterized. The resulting image frame includes dynamic image data generated as a result of rasterizing the dynamic geometry during that image frame, and static image data generated as a result of rasterizing the static geometry during a prior image frame.
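
    A minimal sketch of the reuse idea follows. It assumes a toy rasterizer that maps `(x, y, value)` tuples to pixels and assumes dynamic geometry always draws over static geometry (no depth test); `FrameRenderer` and its fields are hypothetical names, not taken from the patent.

```python
def rasterize(geometry, width, height):
    """Toy rasterizer: each item is (x, y, value); returns a dict of covered pixels."""
    return {(x, y): v for x, y, v in geometry if 0 <= x < width and 0 <= y < height}


class FrameRenderer:
    def __init__(self, width, height):
        self.width, self.height = width, height
        self.static_cache = None  # static image data kept from a prior frame
        self.static_passes = 0    # how many times static geometry was rasterized

    def render(self, static_geometry, dynamic_geometry):
        # Rasterize the static geometry only once, then reuse the result.
        if self.static_cache is None:
            self.static_cache = rasterize(static_geometry, self.width, self.height)
            self.static_passes += 1
        frame = dict(self.static_cache)
        # Only the dynamic geometry is rasterized for every frame.
        frame.update(rasterize(dynamic_geometry, self.width, self.height))
        return frame
```

    Rendering two consecutive frames with this sketch rasterizes the static geometry once, while the dynamic geometry is rasterized in both frames.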


    Performance Event Triggering Through Direct Interthread Communication On a Network On Chip
    6.
    Patent application
    Status: Lapsed

    Publication No.: US20100269123A1

    Publication date: 2010-10-21

    Application No.: US12427090

    Filing date: 2009-04-21

    IPC classification: G06F9/54

    CPC classification: H04L43/0817

    Abstract: Performance event triggering through direct interthread communication (‘DITC’) on a network on chip (‘NOC’), the NOC including integrated processor (‘IP’) blocks, routers, memory communications controllers, and network interface controllers, with each IP block adapted to a router through a memory communications controller and a network interface controller, where each memory communications controller controls communications between an IP block and memory, and each network interface controller controls inter-IP block communications through routers, including: enabling performance event monitoring in a selected set of IP blocks distributed throughout the NOC, each IP block within the selected set of IP blocks having one or more event counters; collecting performance results from the one or more event counters; and returning performance results from the one or more event counters to a destination repository, the returning being initiated by a triggering event occurring within the NOC.
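
    The enable, collect, and return steps listed in this abstract can be sketched in software. This models only that sequence: routers, memory communications controllers, network interface controllers, and DITC messaging are not represented, and the names `IPBlock`, `NOC`, `record`, and `trigger` are hypothetical.

```python
class IPBlock:
    def __init__(self, name):
        self.name = name
        self.monitoring = False
        self.counters = {}  # event name -> count

    def record(self, event):
        # Events are counted only while monitoring is enabled for this block.
        if self.monitoring:
            self.counters[event] = self.counters.get(event, 0) + 1


class NOC:
    def __init__(self, names):
        self.blocks = {n: IPBlock(n) for n in names}
        self.selected = []
        self.repository = []  # destination repository for returned results

    def enable_monitoring(self, names):
        self.selected = [self.blocks[n] for n in names]
        for block in self.selected:
            block.monitoring = True

    def trigger(self):
        """Models the triggering event: collect counters from the selected blocks and return them."""
        snapshot = {b.name: dict(b.counters) for b in self.selected}
        self.repository.append(snapshot)
        return snapshot
```

    In this sketch, blocks outside the selected set ignore events entirely, and results reach the repository only when `trigger()` is called.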


    Vector register file caching of context data structure for maintaining state data in a multithreaded image processing pipeline
    7.
    Granted patent
    Status: In force

    Publication No.: US08836709B2

    Publication date: 2014-09-16

    Application No.: US13212418

    Filing date: 2011-08-18

    Abstract: Frequently accessed state data used in a multithreaded graphics processing architecture is cached within a vector register file of a processing unit to optimize accesses to the state data and minimize memory bus utilization associated therewith. A processing unit may include a fixed point execution unit as well as a vector floating point execution unit, and a vector register file utilized by the vector floating point execution unit may be used to cache state data used by the fixed point execution unit and transferred as needed into the general purpose registers accessible by the fixed point execution unit, thereby reducing the need to repeatedly retrieve and write back the state data from and to an L1 or lower level cache accessed by the fixed point execution unit.


    Multithreaded physics engine with impulse propagation
    9.
    Granted patent
    Status: Lapsed

    Publication No.: US08413166B2

    Publication date: 2013-04-02

    Application No.: US13212403

    Filing date: 2011-08-18

    IPC classification: G06F3/00

    Abstract: A circuit arrangement and method implement impulse propagation in a multithreaded physics engine by assigning ownership of objects in a scene to individual threads and propagating impulses between objects that are in contact with one another by passing inter-thread impulse messages between the threads that own the contacting objects, while locally propagating impulses through objects using the threads to which such objects are assigned.
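
    The ownership and message-passing idea can be sketched as below. This is a simplified model under stated assumptions: `Worker` objects stand in for hardware threads and are stepped sequentially, impulses are scalars that halve at each contact, and propagation stops below a cutoff. None of these details come from the patent.

```python
from collections import deque


class Worker:
    """Stands in for one hardware thread that owns a subset of the objects."""

    def __init__(self, name):
        self.name = name
        self.inbox = deque()  # pending impulse messages for objects this worker owns
        self.received = {}    # object -> accumulated impulse


def propagate(owner, contacts, start, impulse, damping=0.5, cutoff=0.1):
    """Spread an impulse through contacting objects; return the inter-thread message count."""
    workers = list(dict.fromkeys(owner.values()))
    messages = 0
    owner[start].inbox.append((start, None, impulse))
    busy = True
    while busy:
        busy = False
        for worker in workers:
            while worker.inbox:
                busy = True
                obj, sender, amount = worker.inbox.popleft()
                worker.received[obj] = worker.received.get(obj, 0.0) + amount
                passed = amount * damping
                if passed < cutoff:
                    continue
                for other in contacts.get(obj, ()):
                    if other == sender:
                        continue  # do not bounce the impulse straight back
                    target = owner[other]
                    if target is not worker:
                        messages += 1  # impulse crosses a thread boundary
                    target.inbox.append((other, obj, passed))
    return messages
```

    When two contacting objects share an owner, the impulse stays on that worker's own queue and is not counted as an inter-thread message.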


    MULTITHREADED PHYSICS ENGINE WITH PREDICTIVE LOAD BALANCING
    10.
    Patent application
    Status: Lapsed

    Publication No.: US20110321057A1

    Publication date: 2011-12-29

    Application No.: US12822615

    Filing date: 2010-06-24

    IPC classification: G06F9/46

    Abstract: A circuit arrangement and method utilize predictive load balancing to allocate the workload among hardware threads in a multithreaded physics engine. The predictive load balancing is based at least in part upon the detection of predicted future collisions between objects in a scene, such that the reallocation of respective loads of a plurality of hardware threads may be initiated prior to detection of the actual collisions, thereby increasing the likelihood that hardware threads will be optimally allocated when the actual collisions occur.
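
    A minimal sketch of predictive rebalancing follows. It assumes one-dimensional objects moving at constant velocity, predicts a collision when the closest approach within a time horizon falls under a radius, and moves a single object per call; the function names and the load metric are hypothetical.

```python
from itertools import combinations


def predicted_collisions(objects, horizon, radius=1.0):
    """Return pairs whose closest approach within `horizon` is under `radius`.

    `objects` maps a name to (position, velocity) in one dimension.
    """
    hits = []
    for (a, (pa, va)), (b, (pb, vb)) in combinations(objects.items(), 2):
        d, w = pa - pb, va - vb
        # Time of closest approach, clamped to the interval [0, horizon].
        t = 0.0 if w == 0 else min(max(-d / w, 0.0), horizon)
        if abs(d + w * t) < radius:
            hits.append((a, b))
    return hits


def rebalance(assignment, hits):
    """Estimate future load per thread and move one colliding object off the busiest thread."""
    load = {t: 0 for t in sorted(set(assignment.values()))}
    for a, b in hits:
        load[assignment[a]] += 1
        load[assignment[b]] += 1
    busiest = max(load, key=load.get)
    idlest = min(load, key=load.get)
    if load[busiest] - load[idlest] > 1:
        for obj in (x for pair in hits for x in pair):
            if assignment[obj] == busiest:
                assignment[obj] = idlest
                break
    return assignment
```

    The reallocation here happens as soon as collisions are predicted, before any actual collision is detected, which is the ordering the abstract describes.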
