摘要:
Processing element to processing element switch connection control is described using a receive model that precludes communication hazards from occurring in a synchronous MIMD mode of operation. Such control allows different communication topologies and various processing effects such as an array transpose, hypercomplement or the like to be efficiently achieved utilizing architectures, such as the manifold array processing architecture. An encoded instruction method reduces the amount of state information and setup burden on the programmer taking advantage of the recognition that the majority of algorithms will use only a small fraction of all possible mux settings available. Thus, by means of transforming the PE identification based upon a communication path specified by a PE communication instruction an efficient switch control mechanism can be used. This control mechanism allows PE register broadcast operations as well as the standard mesh and hypercube communication paths over the same interconnection network. PE to PE communication instructions PEXCHG, SPRECV and SPSEND are also defined and implemented.
摘要:
A manifold array topology includes processing elements, nodes, memories or the like arranged in clusters. Clusters are connected by cluster switch arrangements which advantageously allow changes of organization without physical rearrangement of processing elements. A significant reduction in the typical number of interconnections for preexisting arrays is also achieved. Fast, efficient and cost effective processing and communication result with the added benefit of ready scalability.
摘要:
Processing element to processing element switch connection control is described using a receive model that precludes communication hazards from occurring in a synchronous MIMD mode of operation. Such control allows different communication topologies and various processing effects such as an array transpose, hypercomplement or the like to be efficiently achieved utilizing architectures, such as the manifold array processing architecture. An encoded instruction method reduces the amount of state information and setup burden on the programmer taking advantage of the recognition that the majority of algorithms will use only a small fraction of all possible mux settings available. Thus, by means of transforming the PE identification based upon a communication path specified by a PE communication instruction an efficient switch control mechanism can be used. This control mechanism allows PE register broadcast operations as well as the standard mesh and hypercube communication paths over the same interconnection network. PE to PE communication instructions PEXCHG, SPRECV and SPSEND are also defined and implemented.
摘要:
A manifold array topology includes processing elements, nodes, memories or the like arranged in clusters. Clusters are connected by cluster switch arrangements which advantageously allow changes of organization without physical rearrangement of processing elements. A significant reduction in the typical number of interconnections for preexisting arrays is also achieved. Fast, efficient and cost effective processing and communication result with the added benefit of ready scalability.
摘要:
A manifold array topology includes processing elements, nodes, memories or the like arranged in clusters. Clusters are connected by cluster switch arrangements which advantageously allow changes of organization without physical rearrangement of processing elements. A significant reduction in the typical number of interconnections for preexisting arrays is also achieved. Fast, efficient and cost effective processing and communication result with the added benefit of ready scalability.
摘要:
Processing element to processing element switch connection control is described using a receive model that precludes communication hazards from occurring in a synchronous MIMD mode of operation. Such control allows different communication topologies and various processing effects such as an array transpose, hypercomplement or the like to be efficiently achieved utilizing architectures, such as the manifold array processing architecture. An encoded instruction method reduces the amount of state information and setup burden on the programmer taking advantage of the recognition that the majority of algorithms will use only a small fraction of all possible mux settings available. Thus, by means of transforming the PE identification based upon a communication path specified by a PE communication instruction an efficient switch control mechanism can be used. This control mechanism allows PE register broadcast operations as well as the standard mesh and hypercube communication paths over the same interconnection network. PE to PE communication instructions PEXCHG, SPRECV and SPSEND are also defined and implemented.
摘要:
General purpose flags (ACFs) are defined and encoded utilizing a hierarchical one-, two- or three-bit encoding. Each added bit provides a superset of the previous functionality. With condition combination, a sequential series of conditional branches based on complex conditions may be avoided and complex conditions can then be used for conditional execution. ACF generation and use can be specified by the programmer. By varying the number of flags affected, conditional operation parallelism can be widely varied, for example, from mono-processing to octal-processing in VLIW execution, and across an array of processing elements (PE)s. Multiple PEs can generate condition information at the same time with the programmer being able to specify a conditional execution in one processor based upon a condition generated in a different processor using the communications interface between the processing elements to transfer the conditions. Each processor in a multiple processor array may independently have different units conditionally operate based upon their ACFs.
摘要:
General purpose flags (ACFs) are defined and encoded utilizing a hierarchical one-, two- or three-bit encoding. Each added bit provides a superset of the previous functionality. With condition combination, a sequential series of conditional branches based on complex conditions may be avoided and complex conditions can then be used for conditional execution. ACF generation and use can be specified by the programmer. By varying the number of flags affected, conditional operation parallelism can be widely varied, for example, from mono-processing to octal-processing in VLIW execution, and across an array of processing elements (PE)s. Multiple PEs can generate condition information at the same time with the programmer being able to specify a conditional execution in one processor based upon a condition generated in a different processor using the communications interface between the processing elements to transfer the conditions. Each processor in a multiple processor array may independently have different units conditionally operate based upon their ACFs.
摘要:
General purpose flags (ACFs) are defined and encoded utilizing a hierarchical one-, two- or three-bit encoding. Each added bit provides a superset of the previous functionality. With condition combination, a sequential series of conditional branches based on complex conditions may be avoided and complex conditions can then be used for conditional execution. ACF generation and use can be specified by the programmer. By varying the number of flags affected, conditional operation parallelism can be widely varied, for example, from mono-processing to octal-processing in VLIW execution, and across an array of processing elements (PE)s. Multiple PEs can generate condition information at the same time with the programmer being able to specify a conditional execution in one processor based upon a condition generated in a different processor using the communications interface between the processing elements to transfer the conditions. Each processor in a multiple processor array may independently have different units conditionally operate based upon their ACFs.
摘要:
Details of a highly cost effective and efficient implementation of a manifold array (ManArray) architecture and instruction syntax for use therewith are described herein. Various aspects of this approach include the regularity of the syntax, the relative ease with which the instruction set can be represented in database form, the ready ability with which tools can be created, the ready generation of self-checking codes and parameterized testcases. Parameterizations can be fairly easily mapped and system maintenance is significantly simplified.