Manifold Array Processor
    3.
    Invention application
    Legal status: under examination (published)

    Publication No.: US20130019082A1

    Publication date: 2013-01-17

    Application No.: US13616942

    Filing date: 2012-09-14

    IPC classification: G06F15/80

    Abstract: An array processor includes processing elements (PEs) arranged in clusters to form a rectangular array. Inter-cluster communication paths are mutually exclusive. Because the data paths are mutually exclusive, communications between the processing elements of each cluster may be combined in a single inter-cluster path, eliminating half the wiring required for the path. The length of the longest communication path is not directly determined by the overall dimension of the array, as it is in conventional torus arrays; rather, it is limited by the inter-cluster spacing. Transpose elements of an N×N torus may be combined in clusters and communicate with one another through intra-cluster communication paths, which eliminates transpose operation latency. Each PE may have a single transmit port and a single receive port, so the individual PEs are decoupled from the array topology.
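The clustering idea in this abstract can be illustrated with a small sketch. The abstract does not give the cluster assignment, so the rule below, which places PE (i, j) of an N×N torus in cluster (i + j) mod N, is an assumption chosen because it has the stated property that transpose pairs share a cluster. The names `cluster_of`, `build_clusters`, and `torus_neighbors` are hypothetical.

```python
from collections import defaultdict


def cluster_of(i: int, j: int, n: int) -> int:
    """Assumed rule: PEs on the same wrapped anti-diagonal share a cluster."""
    return (i + j) % n


def build_clusters(n: int) -> dict:
    """Group all PEs of an n x n torus by their assumed cluster index."""
    clusters = defaultdict(list)
    for i in range(n):
        for j in range(n):
            clusters[cluster_of(i, j, n)].append((i, j))
    return dict(clusters)


def torus_neighbors(i: int, j: int, n: int) -> list:
    """North, south, west, east neighbors with wrap-around."""
    return [((i - 1) % n, j), ((i + 1) % n, j),
            (i, (j - 1) % n), (i, (j + 1) % n)]


N = 4
clusters = build_clusters(N)

# Transpose partners (i, j) and (j, i) have the same i + j, so they
# always land in the same cluster under this rule.
transposes_colocated = all(
    cluster_of(i, j, N) == cluster_of(j, i, N)
    for i in range(N) for j in range(N)
)

# The four torus neighbors of any PE fall in only two clusters:
# south and east in cluster c + 1, north and west in cluster c - 1.
neighbor_clusters = {cluster_of(a, b, N) for a, b in torus_neighbors(1, 2, N)}
```

Under this assumed rule, south and east transfers from a cluster always target the same neighboring cluster. If all PEs send in one direction at a time, those two transfers never occur together, which is consistent with the abstract's remark that mutually exclusive paths can share wiring.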

    Methods and Apparatus for Scalable Array Processor Interrupt Detection and Response
    5.
    Invention application
    Legal status: in force

    Publication No.: US20080222333A1

    Publication date: 2008-09-11

    Application No.: US12120543

    Filing date: 2008-05-14

    IPC classification: G06F13/24 G06F9/30 G06F9/312

    Abstract: Hardware and software techniques for interrupt detection and response in a scalable pipelined array processor environment are described. Utilizing these techniques, a sequential program execution model with interrupts can be maintained in a highly parallel scalable pipelined array processor containing multiple processing elements (PEs) and distributed memories and register files. When an interrupt occurs, interface signals are provided to all PEs to support independent interrupt operations in each PE, dependent upon the local PE instruction sequence prior to the interrupt. Processing element exception interrupts are supported, and low-latency interrupt processing is also provided for embedded systems where real-time signal processing is required. Further, a hierarchical interrupt structure is used, allowing a generalized debug approach using debug interrupts and a dynamic debug monitor mechanism.

    Methods and apparatus for scalable array processor interrupt detection and response
    6.
    Granted patent
    Legal status: lapsed

    Publication No.: US06842811B2

    Publication date: 2005-01-11

    Application No.: US09791256

    Filing date: 2001-02-23

    Abstract: Hardware and software techniques for interrupt detection and response in a scalable pipelined array processor environment are described. Utilizing these techniques, a sequential program execution model with interrupts can be maintained in a highly parallel scalable pipelined array processor containing multiple processing elements (PEs) and distributed memories and register files. When an interrupt occurs, interface signals are provided to all PEs to support independent interrupt operations in each PE, dependent upon the local PE instruction sequence prior to the interrupt. Processing element exception interrupts are supported, and low-latency interrupt processing is also provided for embedded systems where real-time signal processing is required. Further, a hierarchical interrupt structure is used, allowing a generalized debug approach using debug interrupts and a dynamic debug monitor mechanism.

    Methods and apparatus for manifold array processing
    7.
    Granted patent
    Legal status: in force

    Publication No.: US06769056B2

    Publication date: 2004-07-27

    Application No.: US10254049

    Filing date: 2002-09-24

    IPC classification: G06F15/00

    CPC classification: G06F15/8023

    Abstract: A manifold array topology includes processing elements, nodes, memories or the like arranged in clusters. Clusters are connected by cluster switch arrangements, which advantageously allow changes of organization without physical rearrangement of processing elements. A significant reduction in the typical number of interconnections for preexisting arrays is also achieved. Fast, efficient and cost-effective processing and communication result, with the added benefit of ready scalability.

    Methods and apparatus for loading a very long instruction word memory
    8.
    Granted patent
    Legal status: in force

    Publication No.: US06704857B2

    Publication date: 2004-03-09

    Application No.: US09747056

    Filing date: 2000-12-22

    IPC classification: G06F15/00

    Abstract: The ManArray processor is a scalable indirect VLIW array processor that defines two preferred architectures for indirect VLIW memories (VIMs). One approach treats the VIM as one composite block of memory, using one common address interface to access any VLIW stored in the VIM. The second approach treats the VIM as made up of multiple smaller VIMs, each individually associated with the functional units and each individually addressable for loading and reading during XV execution. The VIM memories, contained in each processing element (PE), are accessible by the same type of LV and XV Short Instruction Words (SIWs) as in a single-processor instantiation of the indirect VLIW architecture. In the ManArray architecture, the control processor, also called a sequence processor (SP), fetches the instructions from the SIW memory and dispatches them to itself and the PEs. By using the LV instruction, VLIWs can be loaded into VIMs in the SP and the PEs. Since the LV instruction is supplied by the SP through the instruction stream, when VLIWs are being loaded into any VIM, no other processing takes place. In addition, as defined in the ManArray architecture, when the SP is processing SIWs, such as control and other sequential code, the PE array is not executing any instructions. Techniques are provided herein to independently load the VIMs concurrently with SIW or iVLIW execution on the SP or on the PEs, thereby allowing the load latency to be hidden by the computation.
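The LV/XV mechanism described in this abstract can be pictured with a toy model of the first (composite) VIM approach, in which one address selects a whole VLIW. This is only a sketch under stated assumptions: the slot names in `SLOTS`, the `Vim` class, and the representation of short instructions as Python functions are all hypothetical, and the concurrent background loading that the patent addresses is not modeled.

```python
# Hypothetical slot names; the abstract only says "functional units".
SLOTS = ("store", "load", "alu", "mau", "dsu")


class Vim:
    """Toy composite VIM: one address selects a whole VLIW (all slots)."""

    def __init__(self, size=8):
        # Each row holds one VLIW; an empty slot is None.
        self.rows = [dict.fromkeys(SLOTS) for _ in range(size)]

    def lv(self, addr, **slot_ops):
        """LV: load short instructions into slots of the VLIW at addr."""
        for slot, op in slot_ops.items():
            if slot not in SLOTS:
                raise ValueError(f"unknown slot: {slot}")
            self.rows[addr][slot] = op

    def xv(self, addr, state):
        """XV: fetch the VLIW at addr and execute every occupied slot.

        All slots read the same snapshot of the state, so they behave
        as if issued in parallel; results are written back afterwards.
        """
        snapshot = dict(state)
        updates = {}
        for slot in SLOTS:
            op = self.rows[addr][slot]
            if op is not None:
                updates.update(op(snapshot))
        state.update(updates)
        return state


state = {"r0": 2, "r1": 3, "acc": 0}
vim = Vim()
# Load a two-slot VLIW at VIM address 0, then execute it indirectly.
vim.lv(0,
       alu=lambda s: {"r0": s["r0"] + s["r1"]},
       mau=lambda s: {"acc": s["acc"] + s["r0"] * s["r1"]})
vim.xv(0, state)
```

The second approach in the abstract would, under the same assumptions, replace the single `rows` table with one smaller table per slot, each with its own address.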

    Methods and apparatus for manifold array processing
    9.
    Granted patent
    Legal status: in force

    Publication No.: US06470441B1

    Publication date: 2002-10-22

    Application No.: US09707209

    Filing date: 2000-11-06

    IPC classification: G06F15/00

    CPC classification: G06F15/8023

    Abstract: A manifold array topology includes processing elements, nodes, memories or the like arranged in clusters. Clusters are connected by cluster switch arrangements, which advantageously allow changes of organization without physical rearrangement of processing elements. A significant reduction in the typical number of interconnections for preexisting arrays is also achieved. Fast, efficient and cost-effective processing and communication result, with the added benefit of ready scalability.

    Methods and apparatus for dynamic instruction controlled reconfigurable register file with extended precision
    10.
    Granted patent
    Legal status: lapsed

    Publication No.: US06430677B2

    Publication date: 2002-08-06

    Application No.: US09796037

    Filing date: 2001-02-28

    IPC classification: G06F15/00

    Abstract: A reconfigurable register file integrated in an instruction set architecture capable of extended precision operations, and also capable of parallel operation on lower precision data, is described. The register file is composed of two separate files, with each half containing half as many registers as the original. The halves are designated even or odd by virtue of the register addresses which they contain. Single-width and double-width operands are optimally supported without increasing the register file size and without increasing the number of register file ports. Separate extended registers are also employed to provide extended precision for operations such as multiply-accumulate operations.
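The even/odd split described in this abstract can be sketched as two banks that are read together for a double-width operand. The register count, the 32-bit width, the choice of the even register as the low half, and the `SplitRegisterFile` class itself are assumptions made for illustration; the abstract does not specify them, and the separate extended registers are not modeled.

```python
class SplitRegisterFile:
    """Toy register file split into an even bank and an odd bank."""

    def __init__(self, num_regs=32, width=32):
        if num_regs % 2:
            raise ValueError("num_regs must be even")
        self.width = width
        self.mask = (1 << width) - 1
        # Each bank holds half as many registers as the whole file.
        self.even = [0] * (num_regs // 2)
        self.odd = [0] * (num_regs // 2)

    def _bank(self, addr):
        # The low address bit selects the bank; the rest index into it.
        return self.odd if addr & 1 else self.even

    def write(self, addr, value):
        """Single-width write to one register."""
        self._bank(addr)[addr >> 1] = value & self.mask

    def read(self, addr):
        """Single-width read from one register."""
        return self._bank(addr)[addr >> 1]

    def write_double(self, addr, value):
        """Double-width write to an even/odd pair (addr must be even)."""
        if addr & 1:
            raise ValueError("double-width operands need an even address")
        # Assumption: the even register holds the low half.
        self.even[addr >> 1] = value & self.mask
        self.odd[addr >> 1] = (value >> self.width) & self.mask

    def read_double(self, addr):
        """Double-width read: one access to each bank, same index."""
        if addr & 1:
            raise ValueError("double-width operands need an even address")
        return (self.odd[addr >> 1] << self.width) | self.even[addr >> 1]


rf = SplitRegisterFile()
rf.write_double(4, 0x1122334455667788)
low_half = rf.read(4)
high_half = rf.read(5)
pair = rf.read_double(4)
```

Because a double-width access touches each bank once at the same index, it needs no more ports per bank than a single-width access, which is one way to read the abstract's claim about not increasing the number of register file ports.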
