Parallel processing system and method using surrogate instructions
    51.
    Invention publication
    Legal status: lapsed
    German title: Paralleles Verarbeitungssystem und Verfahren mit Ersatzbefehlen

    Publication number: EP0723220A2

    Publication date: 1996-07-24

    Application number: EP95480181.7

    Application date: 1995-12-20

    IPC classification: G06F9/38

    Abstract: A parallel processing system and method are disclosed that provide an improved instruction distribution mechanism for a parallel processing array. The invention broadcasts a basic instruction to each of a plurality of processor elements. Each processor element decodes the same instruction by combining it with a unique offset value stored in that processor element, to produce a derived instruction that is unique to the processor element. A first type of basic instruction results in the processor element performing a logical or control operation. A second type of basic instruction results in the generation of a pointer address. The pointer address has a unique address value because it results from combining the basic instruction with the unique offset value stored at the processor element. The pointer address is used to access an alternative instruction from an alternative instruction storage, for execution in the processor element. The alternative instruction is a very long instruction word, whose length is, for example, an integral multiple of the length of the basic instruction and which contains much more information than can be represented by the basic instruction. Such a very long instruction word is useful for providing parallel control of a plurality of primitive execution units that reside within the processor element. In this manner, a high degree of flexibility and versatility is attained in the operation of processor elements of a parallel processing array.
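
The decode step described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `ProcessorElement`, `surrogate_store`, and `broadcast` are hypothetical, and "combining" the basic instruction with the offset is modelled as a plain addition, which is only an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessorElement:
    offset: int  # unique value stored in this PE
    # Hypothetical alternative instruction storage: pointer address -> long word
    surrogate_store: dict = field(default_factory=dict)

    def decode(self, kind, operand):
        # Assumption: "combining" is modelled as adding the local offset.
        derived = operand + self.offset
        if kind == "direct":
            # First type: the derived instruction is executed directly.
            return ("execute", derived)
        if kind == "surrogate":
            # Second type: the derived value is a pointer into the local store.
            return ("vliw", self.surrogate_store[derived])
        raise ValueError(f"unknown instruction type: {kind}")


def broadcast(pes, kind, operand):
    """Send the same basic instruction to every processor element."""
    return [pe.decode(kind, operand) for pe in pes]


pes = [
    ProcessorElement(offset=0, surrogate_store={8: ("add", "mul", "load")}),
    ProcessorElement(offset=1, surrogate_store={9: ("sub", "nop", "store")}),
]
results = broadcast(pes, "surrogate", 8)
```

With the same broadcast operand, each element resolves a different pointer (8 and 9 here) and therefore fetches a different long instruction word.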

    A massively parallel diagonal fold tree array processor
    52.
    Invention publication
    Legal status: lapsed

    Publication number: EP0569763A3

    Publication date: 1994-07-13

    Application number: EP93106730.0

    Application date: 1993-04-26

    IPC classification: G06F15/80

    CPC classification: G06F15/8023

    Abstract: A massively parallel processor apparatus having an instruction set architecture for each of the N² PEs of the structure. The preferred apparatus has a PE structure consisting of PEs that contain instruction and data storage units, receive instructions and data, and execute instructions. The N² structure should contain "N" communicating ALU trees, "N" programmable root tree processor units, and an arrangement for communicating instructions, data, and the root tree processor outputs back to the input processing elements by means of the communicating ALU trees. The apparatus can be structured as a bit-serial or word-parallel system. The preferred structure contains N² PEs, identified as PE column,row, in an N root tree processor system, placed in the form of an N by N processor array that has been folded along the diagonal and is made up of diagonal cells and general cells. Each Diagonal-Cell comprises a single processing element, identified as PE i,i of the folded N by N processor array, and each General-Cell comprises two PEs merged together, identified as PE i,j and PE j,i of the folded N by N processor array. Matrix processing algorithms are discussed, followed by a presentation of the Diagonal-Fold Tree Array Processor architecture. The Massively Parallel Diagonal-Fold Tree Array Processor supports completely connected root tree processors through the use of the array of PEs that are interconnected by folded communication ALU trees.
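
The cell layout that results from folding an N by N array along its diagonal can be enumerated with a short sketch. The function name `fold_cells` and the tuple representation are assumptions made for illustration only.

```python
def fold_cells(n):
    """Group the PEs of an n x n array into cells after a diagonal fold."""
    cells = []
    for i in range(n):
        for j in range(i, n):
            if i == j:
                cells.append(((i, i),))         # Diagonal-Cell: a single PE
            else:
                cells.append(((i, j), (j, i)))  # General-Cell: two merged PEs
    return cells


cells = fold_cells(4)
diagonal_cells = [c for c in cells if len(c) == 1]
general_cells = [c for c in cells if len(c) == 2]
```

For N = 4 this gives 4 diagonal cells and 6 general cells, which together still account for all 16 PEs.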

    Scalable compound instruction set machine architecture
    53.
    Invention publication
    Legal status: lapsed

    Publication number: EP0454985A3

    Publication date: 1994-03-30

    Application number: EP91104323.0

    Application date: 1991-03-20

    IPC classification: G06F9/38 G06F15/80

    Abstract: Described is a scalable compound instruction set machine and method which provides for processing a set of instructions or program to be executed by a computer to determine statically which instructions may be combined into compound instructions which are executed in parallel by a scalar machine. Such processing looks for classes of instructions that can be executed in parallel without data-dependent or hardware-dependent interlocks. Without regard to their original sequence, the individual instructions are combined with one or more other individual instructions to form a compound instruction which eliminates interlocks. Control information is appended to identify information relevant to the execution of the compound instructions. The result is a stream of scalar instructions compounded or grouped together before instruction decode time so that they are already flagged and identified for selective simultaneous parallel execution by execution units. The compounding does not change the object code results, and existing programs realize performance improvements while maintaining compatibility with previously implemented systems for which the original set of instructions was provided.
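
A much simplified sketch of static compounding follows. It only pairs adjacent instructions and only checks register dependencies, whereas the abstract describes a more general grouping; the `Instr` record, the one-bit tag, and the hazard test in `can_compound` are assumptions chosen for illustration.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    op: str
    dest: str
    srcs: tuple


def can_compound(first, second):
    # Assumed rule: no read-after-write, write-after-write,
    # or write-after-read dependency between the two instructions.
    return (first.dest not in second.srcs
            and first.dest != second.dest
            and second.dest not in first.srcs)


def compound(stream):
    """Return (instruction, tag) pairs; tag 1 means 'issue with the next one'."""
    tagged = []
    i = 0
    while i < len(stream):
        if i + 1 < len(stream) and can_compound(stream[i], stream[i + 1]):
            tagged.append((stream[i], 1))
            tagged.append((stream[i + 1], 0))
            i += 2
        else:
            tagged.append((stream[i], 0))
            i += 1
    return tagged


program = [
    Instr("add", "r1", ("r2", "r3")),
    Instr("sub", "r4", ("r5", "r6")),
    Instr("mul", "r7", ("r4", "r1")),
]
tags = [tag for _, tag in compound(program)]
```

The instructions themselves are left untouched and only the appended tag changes, which mirrors the abstract's point that compounding does not alter object code results.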

    Compounding preprocessor for cache
    54.
    Invention publication
    Legal status: lapsed

    Publication number: EP0455966A3

    Publication date: 1993-11-10

    Application number: EP91104318.0

    Application date: 1991-03-20

    IPC classification: G06F9/38

    Abstract: A digital computer system is described capable of processing two or more computer instructions in parallel and having a cache storage unit for temporarily storing machine-level computer instructions in their journey from a higher-level storage unit of the computer system to the functional units which process the instructions. The computer system includes an instruction compounding unit located intermediate to the higher-level storage unit and the cache storage unit for analyzing the instructions and adding to each instruction a tag field which indicates whether or not that instruction may be processed in parallel with one or more neighboring instructions in the instruction stream. These tagged instructions are then stored in the cache unit. The computer system further includes a plurality of functional instruction processing units which operate in parallel with one another. The instructions supplied to these functional units are obtained from the cache storage unit. At instruction issue time, the tag fields of the instructions are examined and those tagged for parallel processing are sent to different ones of the functional units in accordance with the codings of their operation code fields.

    Massively parallel array processor
    55.
    Invention publication
    Legal status: lapsed

    Publication number: EP0564847A2

    Publication date: 1993-10-13

    Application number: EP93104154.5

    Application date: 1993-03-15

    IPC classification: G06F15/80

    CPC classification: G06F15/8023

    Abstract: Image processing for multimedia workstations is a computationally intensive task requiring special-purpose hardware to meet its high speed requirements. One type of specialized hardware that meets these requirements is the mesh-connected computer. Such a computer becomes a massively parallel machine when an array of computers interconnected by a network is replicated in a machine. The nearest-neighbor mesh computer consists of an N x N square array of Processor Elements (PEs) where each PE is connected to the North, South, East and West PEs only. Assuming a single wire interface between PEs, there are a total of 2N² wires in the mesh structure. Under the assumption of SIMD operation with uni-directional message and data transfers between the processing elements in the mesh, for example all PEs transferring data North, it is possible to reconfigure the array by placing the symmetric processing elements together and sharing the north-south wires with the east-west wires, thereby cutting the wiring complexity in half, i.e. to N², without affecting performance. The resulting diagonal folded mesh array processor, which is called Oracle, allows the matrix transformation operation to be accomplished in one cycle by simple interchange of the data elements in the dual symmetric processor elements. The use of Oracle for a parallel 2-D convolution mechanism for image processing and multimedia applications and for a finite difference method of solving differential equations is presented, concentrating on the computational aspects of the algorithm.
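
The interchange inside the dual symmetric processor elements can be sketched as follows, reading the "matrix transformation operation" as a transposition, which is an assumption. The helper names `fold`, `interchange`, and `unfold` are hypothetical and model only the data movement, not the hardware.

```python
def fold(matrix):
    """Place element (i, j) and element (j, i) together in one dual cell."""
    n = len(matrix)
    diagonal = [matrix[i][i] for i in range(n)]
    dual = {(i, j): [matrix[i][j], matrix[j][i]]
            for i in range(n) for j in range(i + 1, n)}
    return diagonal, dual


def interchange(dual):
    """Swap the two data elements held in every dual cell (one local step each)."""
    for pair in dual.values():
        pair[0], pair[1] = pair[1], pair[0]


def unfold(diagonal, dual):
    """Rebuild the full n x n matrix from the folded representation."""
    n = len(diagonal)
    matrix = [[None] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = diagonal[i]
    for (i, j), (upper, lower) in dual.items():
        matrix[i][j], matrix[j][i] = upper, lower
    return matrix


a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
diagonal, dual = fold(a)
interchange(dual)
transposed = unfold(diagonal, dual)
```

Because every swap is local to one cell, all swaps could proceed at the same time, which is consistent with the single-cycle claim in the abstract.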

    High performance divider with a sequence of convergence factors
    58.
    Invention publication
    Legal status: lapsed
    German title: Hochleistungsdividierer mit einer Reihe von Konvergenzfaktoren

    Publication number: EP0499705A2

    Publication date: 1992-08-26

    Application number: EP91121058.1

    Application date: 1991-12-09

    IPC classification: G06F7/52

    CPC classification: G06F7/535 G06F2207/5355

    Abstract: A system for dividing a digital dividend operand N by a digital divisor operand D to obtain a quotient operand Q with minimal execution time and hardware calculates a value NP₀P₁...Pₘ, where the value P₀P₁...Pₘ has a magnitude such that NP₀P₁...Pₘ converges to Q and DP₀P₁...Pₘ converges to 1. The divider employs a one's complementation, multiplication and addition sequence to calculate the value NP₀P₁...Pₘ.
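
The convergence idea can be tried numerically with a minimal floating-point sketch. It assumes the divisor is normalized to [0.5, 1) and uses the textbook factor P = 2 - D as a stand-in for the complement-based factors of the patent; the abstract does not specify how each factor is formed, so this choice and the function name `divide` are assumptions.

```python
def divide(n, d, steps=5):
    """Approximate n / d for 0.5 <= d < 1 by repeated multiplication."""
    if not 0.5 <= d < 1.0:
        raise ValueError("d must be normalized to [0.5, 1)")
    for _ in range(steps):
        p = 2.0 - d   # assumed convergence factor P_k
        n *= p        # N * P_0 * ... * P_k approaches Q
        d *= p        # D * P_0 * ... * P_k approaches 1
    return n, d


quotient, final_divisor = divide(0.6, 0.75)
```

Writing D = 1 - e, each step maps the error e to e², so the running divisor approaches 1 quadratically while the running dividend approaches the quotient.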

    Overflow determination for three-operand alus in a scalable compound instruction set machine
    59.
    Invention publication
    Legal status: lapsed

    Publication number: EP0487814A2

    Publication date: 1992-06-03

    Application number: EP91105242.1

    Application date: 1991-04-03

    IPC classification: G06F7/50

    Abstract: A mechanism is presented for detecting overflow in an interlock collapsing hardware apparatus that simultaneously executes two instructions. The overflow is determined as if the second instruction executes by itself using results from execution of the first instruction. Overflow detection is accomplished by using only values input into, and generated within, the interlock collapsing apparatus.
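
The overflow rule stated in this abstract can be modelled at the level of arithmetic semantics. The sketch below assumes two dependent additions on an 8-bit two's-complement word; the width, the helper names, and the use of a wide intermediate sum are illustrative assumptions, and the patented hardware derives the same flag from internal signals rather than by recomputation.

```python
BITS = 8  # assumed word width, chosen small for readability


def wrap(x):
    """Interpret x as a BITS-wide two's-complement value."""
    x &= (1 << BITS) - 1
    return x - (1 << BITS) if x >> (BITS - 1) else x


def in_range(x):
    return -(1 << (BITS - 1)) <= x < (1 << (BITS - 1))


def three_operand_add(a, b, c):
    first = wrap(a + b)              # result the first instruction would write
    total = first + c                # second instruction, taken by itself
    overflow = not in_range(total)   # overflow of the second instruction only
    return wrap(total), overflow


result_a = three_operand_add(100, 100, -100)
result_b = three_operand_add(100, 100, 50)
```

In the first call the exact sum 100 fits in 8 bits, yet overflow is reported because the second addition, starting from the wrapped intermediate -56, overflows on its own. In the second call the exact sum 250 does not fit, but the second addition alone stays in range, so no overflow is reported for it.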

    Compounding preprocessor for cache
    60.
    Invention publication
    Legal status: lapsed
    German title: Vorverarbeitungsprozessor zur Verbindung von Befehlen für einen Cache-Speicher

    Publication number: EP0455966A2

    Publication date: 1991-11-13

    Application number: EP91104318.0

    Application date: 1991-03-20

    IPC classification: G06F9/38

    Abstract: A digital computer system is described capable of processing two or more computer instructions in parallel and having a cache storage unit for temporarily storing machine-level computer instructions in their journey from a higher-level storage unit of the computer system to the functional units which process the instructions. The computer system includes an instruction compounding unit located intermediate to the higher-level storage unit and the cache storage unit for analyzing the instructions and adding to each instruction a tag field which indicates whether or not that instruction may be processed in parallel with one or more neighboring instructions in the instruction stream. These tagged instructions are then stored in the cache unit. The computer system further includes a plurality of functional instruction processing units which operate in parallel with one another. The instructions supplied to these functional units are obtained from the cache storage unit. At instruction issue time, the tag fields of the instructions are examined and those tagged for parallel processing are sent to different ones of the functional units in accordance with the codings of their operation code fields.
