Sequential processing in network on chip nodes by threads generating message containing payload and pointer for nanokernel to access algorithm to be executed on payload in another node
    71.
    Type: Granted invention patent
    Status: In force

    Publication No.: US08423749B2

    Publication date: 2013-04-16

    Application No.: US12255827

    Filing date: 2008-10-22

    IPC classification: G06F9/40

    CPC classification: G06F9/54 H04L45/58

    Abstract: A computer-implemented method, system and computer program product for controlling an algorithm that is performed on a unit of work in a subsequent software pipeline stage in a Network On a Chip (NOC) is presented. In one embodiment, the method executes a first operation in a first node of the NOC. The first node generates a payload and loads that payload into a message. The message with the payload is transmitted to a nanokernel that controls a second node in the NOC. The nanokernel calls an algorithm that is needed by a second operation in the second node, which uses the algorithm to execute the second operation.
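
The message flow described in this abstract can be sketched roughly as follows. This is only an illustrative model, not the patented implementation: the `Message` and `Nanokernel` classes, the integer keys standing in for the algorithm pointer named in the title, and the algorithm table are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Message:
    payload: Any
    algorithm_ptr: int  # hypothetical: an integer key stands in for the pointer


class Nanokernel:
    """Hypothetical stand-in for the nanokernel that controls the second node."""

    def __init__(self, algorithm_table: Dict[int, Callable[[Any], Any]]):
        self.algorithm_table = algorithm_table

    def handle(self, message: Message) -> Any:
        # Look up the algorithm the message points at, then run it on the payload.
        algorithm = self.algorithm_table[message.algorithm_ptr]
        return algorithm(message.payload)


def first_node_operation(values):
    """First operation: generate a payload and load it into a message."""
    payload = [v * 2 for v in values]
    return Message(payload=payload, algorithm_ptr=0x10)


nanokernel = Nanokernel({0x10: sum, 0x20: max})
result = nanokernel.handle(first_node_operation([1, 2, 3]))
```

Here the first node doubles its inputs and asks for the algorithm registered under `0x10`, so the second node ends up summing the payload.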

    Rolling texture context data structure for maintaining texture data in a multithreaded image processing pipeline
    72.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US08405670B2

    Publication date: 2013-03-26

    Application No.: US12787110

    Filing date: 2010-05-25

    CPC classification: G06T1/20

    Abstract: A multithreaded rendering software pipeline architecture utilizes a rolling texture context data structure to store multiple texture contexts that are associated with different textures being processed in the software pipeline. Each texture context stores state data for a particular texture and facilitates access to texture data by multiple parallel stages in the software pipeline. In addition, texture contexts can be “rolled,” or copied, so that different stages of a rendering pipeline that require different state data for a particular texture can access the texture data independently of one another, without stalling the pipeline to synchronize shared texture data among the stages.
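
One way to picture "rolling" a texture context is a small ring of context slots in which a roll copies the current state into the next slot and applies the changes there. The sketch below assumes that reading; the class name, the slot count and the state keys are hypothetical, and a real pipeline would also need to track which slots are still in use before overwriting them.

```python
import copy


class RollingTextureContexts:
    """Hypothetical ring of texture contexts; each slot holds one state dict."""

    def __init__(self, slots, initial_state):
        self.slots = [None] * slots
        self.head = 0
        self.slots[0] = dict(initial_state)

    def current(self):
        """Index of the most recent context."""
        return self.head

    def roll(self, **changes):
        """Copy the current context into the next slot and apply `changes` there."""
        new_index = (self.head + 1) % len(self.slots)
        new_state = copy.deepcopy(self.slots[self.head])
        new_state.update(changes)
        self.slots[new_index] = new_state
        self.head = new_index
        return new_index

    def state(self, index):
        return self.slots[index]


contexts = RollingTextureContexts(
    slots=4, initial_state={"texture": "brick", "filter": "nearest"}
)
early_stage_ctx = contexts.current()             # an earlier stage keeps this index
late_stage_ctx = contexts.roll(filter="linear")  # a later stage needs different state
```

Because the earlier stage keeps its own slot, it still sees `filter="nearest"` after the roll, so neither stage has to wait for the other.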

    Software Pipelining On A Network On Chip
    73.
    Type: Invention patent application
    Status: In force

    Publication No.: US20120209944A1

    Publication date: 2012-08-16

    Application No.: US13453380

    Filing date: 2012-04-23

    IPC classification: G06F15/167

    Abstract: Memory sharing in a software pipeline on a network on chip (‘NOC’), the NOC including integrated processor (‘IP’) blocks, routers, memory communications controllers, and network interface controllers, with each IP block adapted to a router through a memory communications controller and a network interface controller, where each memory communications controller controls communications between an IP block and memory, and each network interface controller controls inter-IP block communications through routers, including: segmenting a computer software application into stages of a software pipeline, the software pipeline comprising one or more paths of execution; allocating memory to be shared among at least two stages, including creating a smart pointer, the smart pointer including data elements for determining when the shared memory can be deallocated; determining, in dependence upon those data elements, that the shared memory can be deallocated; and deallocating the shared memory.
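
The abstract does not say what the smart pointer's "data elements" are. The sketch below assumes the simplest possibility, a set of stages that still need the memory, and frees the buffer when that set becomes empty. The `SmartPointer` class and its method names are hypothetical.

```python
class SmartPointer:
    """Hypothetical smart pointer guarding memory shared by pipeline stages."""

    def __init__(self, size, stages):
        self.memory = bytearray(size)      # the shared allocation
        self.pending_stages = set(stages)  # assumed deallocation bookkeeping

    def can_deallocate(self):
        """True once no stage still needs the shared memory."""
        return not self.pending_stages

    def release(self, stage):
        """Record that `stage` is done; free the memory once no stage remains."""
        self.pending_stages.discard(stage)
        if self.can_deallocate():
            self.memory = None  # drop the reference so the buffer can be reclaimed
        return self.memory is None


pointer = SmartPointer(size=64, stages=["stage1", "stage2"])
freed_after_first = pointer.release("stage1")
freed_after_second = pointer.release("stage2")
```

The memory survives the first release and is dropped only after the second stage has also finished with it.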

    INSTRUCTION UNIT WITH INSTRUCTION BUFFER PIPELINE BYPASS
    74.
    Type: Invention patent application
    Status: In force

    Publication No.: US20110320771A1

    Publication date: 2011-12-29

    Application No.: US12824812

    Filing date: 2010-06-28

    IPC classification: G06F9/38 G06F9/312 G06F12/08

    Abstract: A circuit arrangement and method selectively bypass an instruction buffer for selected instructions so that bypassed instructions can be dispatched without having to first pass through the instruction buffer. Thus, for example, in the case that an instruction buffer is partially or completely flushed as a result of an instruction redirect (e.g., due to a branch mispredict), instructions can be forwarded to subsequent stages in an instruction unit and/or to one or more execution units without the latency associated with passing through the instruction buffer.

    DMA-BASED ACCELERATION OF COMMAND PUSH BUFFER BETWEEN HOST AND TARGET DEVICES
    75.
    Type: Invention patent application
    Status: Lapsed

    Publication No.: US20110320724A1

    Publication date: 2011-12-29

    Application No.: US12824674

    Filing date: 2010-06-28

    IPC classification: G06F12/08

    CPC classification: G06F13/28

    Abstract: Direct Memory Access (DMA) is used in connection with passing commands between a host device and a target device coupled via a push buffer. Commands passed to a push buffer by a host device may be accumulated by the host device prior to forwarding the commands to the push buffer, such that DMA may be used to collectively pass a block of commands to the push buffer. In addition, a host device may utilize DMA to pass command parameters for commands to a command buffer that is accessible by the target device but is separate from the push buffer, with the commands that are passed to the push buffer including pointers to the associated command parameters in the command buffer.
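
The batching and pointer scheme in this abstract can be modeled in a few lines. This is a software-only sketch under stated assumptions: plain Python lists stand in for the push buffer and command buffer, a list index stands in for the pointer, and each `flush` stands in for one DMA block transfer. `HostDevice`, `target_read` and the opcodes are hypothetical names.

```python
class HostDevice:
    """Hypothetical host that batches commands before a simulated DMA transfer."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.pending = []         # commands accumulated on the host
        self.push_buffer = []     # read by the target device
        self.command_buffer = []  # parameters, kept separate from the push buffer
        self.dma_transfers = 0    # block transfers into the push buffer

    def submit(self, opcode, params):
        # Parameters go to the command buffer; the command carries only a pointer.
        self.command_buffer.append(params)
        pointer = len(self.command_buffer) - 1
        self.pending.append((opcode, pointer))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        # One simulated DMA transfer moves the whole accumulated block.
        if self.pending:
            self.push_buffer.extend(self.pending)
            self.pending = []
            self.dma_transfers += 1


def target_read(host):
    """Target side: follow each command's pointer into the command buffer."""
    return [(op, host.command_buffer[ptr]) for op, ptr in host.push_buffer]


host = HostDevice(batch_size=2)
host.submit("DRAW", {"vertices": 3})
host.submit("CLEAR", {"color": "black"})
host.submit("DRAW", {"vertices": 6})
host.flush()
commands = target_read(host)
```

Three commands reach the target in two block transfers rather than three individual writes, which is the saving the accumulation step is meant to provide.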

    Recovering Data From A Plurality of Packets
    76.
    Type: Invention patent application
    Status: Lapsed

    Publication No.: US20110317712A1

    Publication date: 2011-12-29

    Application No.: US12823689

    Filing date: 2010-06-25

    IPC classification: H04L12/56 G06F11/00

    CPC classification: H04L49/9047

    Abstract: A method includes receiving a plurality of packets at an integrated processor block of a network on a chip device. The plurality of packets includes a first packet that includes an indication of a start of data associated with a pixel shader application. The method includes recovering the data from the plurality of packets. The method also includes storing the recovered data in a dedicated packet collection memory within the network on the chip device. The method further includes retaining the data stored in the dedicated packet collection memory during an interruption event. Upon completion of the interruption event, the method includes copying packets stored in the dedicated packet collection memory prior to the interruption event to an inbox of the network on the chip device for processing.
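
As a rough illustration of the steps in this abstract, the sketch below models an IP block that collects packets, recovers the data beginning at the packet flagged as the start, and copies the retained packets to an inbox after an interruption. The packet layout (a dict with `start` and `payload` keys) and the `IPBlock` class are assumptions made for the example; the abstract does not specify either.

```python
class IPBlock:
    """Hypothetical model of an integrated processor block on the NOC."""

    def __init__(self):
        self.collection_memory = []  # dedicated packet collection memory
        self.inbox = []

    def receive(self, packets):
        self.collection_memory.extend(packets)

    def recover(self):
        """Join payloads from the packet flagged as start of data onward."""
        chunks, started = [], False
        for packet in self.collection_memory:
            started = started or packet["start"]
            if started:
                chunks.append(packet["payload"])
        return b"".join(chunks)

    def resume_after_interruption(self):
        """Copy the retained packets to the inbox once the interruption ends."""
        self.inbox.extend(self.collection_memory)


block = IPBlock()
block.receive([
    {"start": False, "payload": b"stale"},
    {"start": True, "payload": b"pixel-"},
    {"start": False, "payload": b"data"},
])
recovered = block.recover()
block.resume_after_interruption()
```

The collection memory is left untouched by the interruption in this model, so everything received beforehand is still available to copy into the inbox.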

    Software Trace Collection and Analysis Utilizing Direct Interthread Communication On A Network On Chip
    77.
    Type: Invention patent application
    Status: Under examination (published)

    Publication No.: US20110289485A1

    Publication date: 2011-11-24

    Application No.: US12784533

    Filing date: 2010-05-21

    IPC classification: G06F9/45 G06F9/44

    CPC classification: G06F11/3636

    Abstract: Collecting and analyzing trace data while in a software debug mode through direct interthread communication (‘DITC’) on a network on chip (‘NOC’), the NOC including integrated processor (‘IP’) blocks, routers, memory communications controllers, and network interface controllers, with each IP block adapted to a router through a memory communications controller and a network interface controller, where each memory communications controller controls communications between an IP block and memory, and each network interface controller controls inter-IP block communications through routers, including: enabling the collection of software debug information in a selected set of IP blocks distributed through the NOC, each IP block within the selected set of IP blocks having a set of trace data; collecting software debugging information via the set of trace data; communicating the set of trace data to a destination repository; and analyzing the set of trace data at the destination repository.

    Software debugger for packets in a network on a chip
    78.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US07992043B2

    Publication date: 2011-08-02

    Application No.: US12255837

    Filing date: 2008-10-22

    IPC classification: G06F11/00

    CPC classification: G06F11/362

    Abstract: A breakpoint packet is dispatched to a Network On A Chip (NOC). The breakpoint packet instructs one or more specified nodes on the NOC to place the specified nodes, or a core or hardware thread within a specified node, in “single step” execution mode, in order to enable debugging of a work packet that is dispatched to the specified node.

    Physical Rendering With Textured Bounding Volume Primitive Mapping
    79.
    Type: Invention patent application
    Status: Lapsed

    Publication No.: US20100238169A1

    Publication date: 2010-09-23

    Application No.: US12407398

    Filing date: 2009-03-19

    IPC classification: G06T15/40

    CPC classification: G06T15/06 G06T15/40

    Abstract: A circuit arrangement and program product utilize a textured bounding volume to reduce the overhead associated with generating and using an Accelerated Data Structure (ADS) in connection with physical rendering. In particular, a subset of the primitives in a scene may be mapped to surfaces of a bounding volume to generate textures on such surfaces that can be used during physical rendering. By doing so, the primitives that are mapped to the bounding volume surfaces may be omitted from the ADS to reduce the processing overhead associated with both generating the ADS and using the ADS during physical rendering. Furthermore, in many instances the size of the ADS may be reduced, thus reducing the memory footprint of the ADS and often improving cache hit rates and reducing memory bandwidth.

    SINGLE STEP MODE IN A SOFTWARE PIPELINE WITHIN A HIGHLY THREADED NETWORK ON A CHIP MICROPROCESSOR
    80.
    Type: Invention patent application
    Status: Lapsed

    Publication No.: US20100191940A1

    Publication date: 2010-07-29

    Application No.: US12358776

    Filing date: 2009-01-23

    IPC classification: G06F9/32

    Abstract: A hardware thread is selectively forced to single step the execution of software instructions from a work packet granule. A “single step” packet is associated with a work packet granule. The work packet granule, with the associated “single step” packet, is dispatched as an appended work packet granule to a preselected hardware thread in a processor core, which, in one embodiment, is located at a node in a Network On a Chip (NOC). The work packet granule then executes in a single step mode until completion.
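
The single-step behavior described here can be approximated in software as a flag carried with the work packet granule. In the sketch below the flag stands in for the appended “single step” packet and a callback stands in for the pause between instructions. `WorkPacketGranule`, `append_single_step` and `hardware_thread` are hypothetical names, and instructions are modeled as plain functions on an accumulator.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class WorkPacketGranule:
    instructions: List[Callable[[int], int]]
    single_step: bool = False  # set when a "single step" packet is appended


def append_single_step(granule):
    """Return a copy of the granule marked for single-step execution."""
    return WorkPacketGranule(list(granule.instructions), single_step=True)


def hardware_thread(granule, pause):
    """Run the granule; in single-step mode call `pause` after every instruction."""
    acc = 0
    for index, instruction in enumerate(granule.instructions):
        acc = instruction(acc)
        if granule.single_step:
            pause(index, acc)  # a debugger could inspect state here
    return acc


granule = WorkPacketGranule([lambda a: a + 5, lambda a: a * 3, lambda a: a - 1])
trace = []
final = hardware_thread(
    append_single_step(granule), lambda i, acc: trace.append((i, acc))
)
```

The same granule dispatched without the flag runs straight through and never calls `pause`, which mirrors the selective nature of the mode.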
