Abstract:
A unified parallel processing architecture connects an extendible number of clusters, each containing multiple processors, to create a high performance parallel processing computer system. Multiple processors are grouped into four or more physically separable clusters, each cluster having a common cluster shared memory that is symmetrically accessible by all of the processors in that cluster; however, only some of the clusters are adjacently interconnected. Clusters are adjacently interconnected to form a floating shared memory if certain memory access conditions relating to relative memory latency and relative data locality can create an effective shared memory parallel programming environment. A shared memory model can be used with programs that can be executed in the cluster shared memory of a single cluster, or in the floating shared memory that is defined across an extended shared memory space comprised of the cluster shared memories of any set of adjacently interconnected clusters. A distributed memory model can be used with any programs that are to be executed in the cluster shared memories of any non-adjacently interconnected clusters. The adjacent interconnection of multiple clusters of processors to create a floating shared memory effectively combines all three types of memory models (pure shared memory, extended shared memory, and distributed shared memory) into a unified parallel processing architecture.
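The choice among the three memory models can be illustrated with a small sketch. It is not taken from the patent: the function name `memory_model`, the ring topology `RING`, and the reading of "adjacently interconnected" as "every pair of clusters in the set is directly linked" are all assumptions made for illustration.

```python
from itertools import combinations


def memory_model(clusters, adjacent):
    """Pick a programming model for a program spanning `clusters`.

    `adjacent` is a set of frozenset pairs naming directly linked clusters.
    Assumption: a floating shared memory needs every pair to be adjacent.
    """
    clusters = set(clusters)
    if not clusters:
        raise ValueError("at least one cluster is required")
    if len(clusters) == 1:
        # One cluster: its own cluster shared memory is enough.
        return "shared"
    if all(frozenset(pair) in adjacent for pair in combinations(clusters, 2)):
        # All clusters are adjacent: an extended (floating) shared memory.
        return "extended shared"
    # At least one pair is not adjacent: fall back to distributed memory.
    return "distributed"


# Hypothetical topology: four clusters in a ring 0-1, 1-2, 2-3, 3-0.
RING = {frozenset(p) for p in [(0, 1), (1, 2), (2, 3), (3, 0)]}
```

With this ring, `memory_model([0, 1], RING)` selects the extended shared model, while clusters 0 and 2 sit on opposite sides of the ring and fall back to the distributed model.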
Abstract:
A cluster architecture for a highly parallel multiprocessor computer processing system is comprised of one or more clusters of tightly-coupled, high-speed processors capable of both vector and scalar parallel processing that can symmetrically access shared resources associated with the cluster, as well as the shared resources associated with other clusters.
Abstract:
A method of accessing common memory in a cluster architecture for a highly parallel multiprocessor scalar/vector computer system uses a plurality of segment registers. It is first determined whether a logical address is within a start and end range as defined by the segment registers, and the logical address is then relocated to a physical address using a displacement value in another segment register.
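The two steps of this method, a range check followed by relocation, can be sketched as below. This is a minimal illustration under stated assumptions: the names `Segment`, `SegmentFault`, and `translate` are hypothetical, the bounds are treated as inclusive, relocation is modeled as simple addition of the displacement, and the start, end, and displacement values are grouped into one record even though the abstract places the displacement in a separate segment register.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start: int         # lowest logical address in the segment (assumed inclusive)
    end: int           # highest logical address in the segment (assumed inclusive)
    displacement: int  # value added to relocate a logical address


class SegmentFault(Exception):
    """Raised when a logical address falls outside every segment."""


def translate(logical, segments):
    """Return the physical address for `logical`, or raise SegmentFault."""
    for seg in segments:
        # Step 1: check the address against the start and end range.
        if seg.start <= logical <= seg.end:
            # Step 2: relocate using the displacement value.
            return logical + seg.displacement
    raise SegmentFault(f"logical address {logical:#x} is outside all segments")
```

For example, with `Segment(0x000, 0x0FF, 0x1000)` the logical address `0x10` relocates to `0x1010`, and an address beyond every configured range raises `SegmentFault` instead of reaching memory.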
Abstract:
A global register system provides communication and coordination among a plurality of processors sharing a common memory in a multiprocessor system. The processors access one or more registers within a shared resource circuit that is separate from the common memory and is symmetrically accessible by the plurality of processors in the multiprocessor system. The global register system is accessed by direct addresses determined by the processor from a previously assigned indirect address and an instruction accessing the data stored in global registers. Arithmetic or logic operations on a data value stored in a selected one of the registers are performed by the global register system independent from the processors or the common memory in order to modify the data value in the selected global register as part of an atomic operation performed in response to a single read-and-modify instruction received from one of the processors.
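The atomic read-and-modify behavior can be modeled in software as a rough sketch. Everything named here is hypothetical: `GlobalRegisters`, `read_and_modify`, `direct_address`, and `count_with_workers` are illustrative, a lock stands in for the hardware's atomicity guarantee, and the direct address is assumed to be a previously assigned base plus an offset carried by the instruction.

```python
import operator
import threading


class GlobalRegisters:
    """Toy model of a register file kept separate from common memory."""

    def __init__(self, count):
        self._regs = [0] * count
        # The lock stands in for the hardware guarantee of atomicity.
        self._lock = threading.Lock()

    def read_and_modify(self, address, op, operand):
        """Apply op(old, operand) to one register and return the old value."""
        with self._lock:
            old = self._regs[address]
            self._regs[address] = op(old, operand)
            return old

    def read(self, address):
        with self._lock:
            return self._regs[address]


def direct_address(base, offset):
    """Assumed mapping: assigned base plus the instruction's offset."""
    return base + offset


def count_with_workers(workers, increments):
    """Have several threads add 1 to the same register; return the total."""
    regs = GlobalRegisters(8)
    addr = direct_address(base=4, offset=1)

    def work():
        for _ in range(increments):
            regs.read_and_modify(addr, operator.add, 1)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return regs.read(addr)
```

Because each update is a single indivisible step, no increments are lost when several workers target the same register, which is the property that makes such registers usable for counters and semaphores.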
Abstract:
An improved high performance hardwired supercomputer data processing apparatus includes instruction means adapted to issue one and two parcel instructions. Instruction fetch means provides an instruction stream of two parcel items in sequence. Instruction decode means is responsive to each two parcel item for determining in one clock cycle whether the two parcel item is a single two parcel instruction or two one parcel instructions, for issuing each two parcel instruction for execution during the one clock cycle, and for issuing one then the other of the two one parcel instructions for execution in sequence during the one clock cycle and the next succeeding clock cycle.
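The issue timing described above can be traced with a short sketch. The encoding is an assumption made purely for illustration: parcels are treated as 16-bit values and a set high bit in the first parcel marks a two parcel instruction. The names `is_two_parcel` and `issue_schedule` are hypothetical.

```python
def is_two_parcel(first_parcel):
    """Assumed encoding: the high bit of a 16-bit parcel marks a long form."""
    return bool(first_parcel & 0x8000)


def issue_schedule(items):
    """Return (cycle, instruction) pairs for a stream of two parcel items.

    Each item is a (first, second) tuple of parcels. An instruction is
    represented as a tuple holding the parcels that make it up.
    """
    schedule = []
    cycle = 0
    for first, second in items:
        if is_two_parcel(first):
            # One two parcel instruction: issued in the decode cycle.
            schedule.append((cycle, (first, second)))
            cycle += 1
        else:
            # Two one parcel instructions: one now, the other next cycle.
            schedule.append((cycle, (first,)))
            schedule.append((cycle + 1, (second,)))
            cycle += 2
    return schedule
```

Feeding it the items `(0x8001, 0x0002)` and `(0x0003, 0x0004)` issues the long instruction in cycle 0 and the two short ones in cycles 1 and 2.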
Abstract:
The present invention is an improved high performance scalar/vector processor. In the preferred embodiment, the scalar/vector processor is used in a multiprocessor system. The scalar/vector processor is comprised of a scalar processor for operating on scalar and logical instructions, including a plurality of independent functional units operably connected to the scalar processor, a vector processor for operating on vector instructions, including a plurality of independent functional units operably connected to the vector processor, and an instruction control mechanism for fetching both the scalar and vector instructions from an instruction cache and controlling the operation of those instructions in both the scalar and vector processor. The instruction control mechanism is designed to enhance the performance of the scalar/vector processor by keeping a multiplicity of pipelines substantially filled with a minimum number of gaps.
Abstract:
Methods and apparatus are provided for a maintenance and control system that senses and controls the numerous sections of a highly parallel multiprocessor system. The maintenance and control system communicates with all processors, all peripheral systems, all user interfaces to the multiprocessor system, a system console, and the power and environmental control subsystems.
Abstract:
A signaling mechanism sends and receives signals to and from any one or all of a plurality of devices, including peripheral controllers and processors, in a multiprocessor system. The signaling mechanism includes two switches: a first switch that routes a signal command generated by a device to a signal dispatch logic, and a second switch that receives signals generated by the signal dispatch logic and routes them to the selected device. The signal command includes a destination select value, selectably determined by the sending device, that represents the destination device. The signal dispatch logic receives the signal command, decodes the destination select value, and generates a signal to be sent to the selected device. The signaling mechanism also includes an arbitration mechanism connected to the signal dispatch logic and the first switch for resolving simultaneous conflicting signal commands issued by two or more devices. The signal generated by the signal dispatch logic may include a plurality of bits representing one or more types of predefined signals to be acted upon by the device.
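The flow from signal command to delivered signal can be sketched in software as follows. This is only an illustration under assumptions: the class `SignalDispatch`, the `BROADCAST` select value, and the fixed-priority arbitration rule (lowest source identifier first) are hypothetical, and a broadcast is assumed to reach the sender as well.

```python
BROADCAST = -1  # hypothetical destination select value meaning "all devices"


class SignalDispatch:
    """Toy model: commands in, arbitration, decode, signals out."""

    def __init__(self, device_ids):
        # One inbox per device stands in for the second (outbound) switch.
        self.inboxes = {d: [] for d in device_ids}
        # Pending commands stand in for the first (inbound) switch.
        self._pending = []

    def submit(self, source, destination, signal_bits):
        """Queue a signal command from `source` for this dispatch round."""
        if destination != BROADCAST and destination not in self.inboxes:
            raise ValueError(f"unknown destination {destination}")
        self._pending.append((source, destination, signal_bits))

    def arbitrate_and_dispatch(self):
        """Resolve simultaneous commands, then deliver each signal."""
        # Assumed arbitration rule: lowest source identifier wins first.
        for source, destination, bits in sorted(self._pending, key=lambda c: c[0]):
            # Decode the destination select value into target devices.
            targets = list(self.inboxes) if destination == BROADCAST else [destination]
            for target in targets:
                self.inboxes[target].append((source, bits))
        self._pending = []
```

If devices 2 and 1 both signal device 0 in the same round, device 0 receives the command from device 1 first under the assumed priority rule, and a command with the `BROADCAST` select value lands in every inbox.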