摘要:
A distributed architecture for the input/output system for a multiprocessor system provides for equal and democratic access to all shared hardware resources by both the processors and the external interface ports of the multiprocessor system. This allows one or more input/output concentrators attached to the external interface ports to have complete access to all of the shared hardware resources across the multiprocessor system without requiring processor intervention. The distributed input/output system provides for communication of data and control information between a set of common shared hardware resources and a set of external data sources. The result is a highly parallel multiprocessor system that has multiple parallel high performance input/output ports capable of operating in a distributed fashion. The distributed architecture also permits both processors and external interface ports to access all of the shared data structures associated with the operating system that contain the input/output control information.
摘要:
A cluster architecture for a highly parallel multiprocessor computer processing system is comprised of one or more clusters of tightly-coupled, high-speed processors capable of both vector and scalar parallel processing that can symmetrically access shared resources associated with the cluster, as well as the shared resources associated with other clusters.
摘要:
A method of accessing common memory in a cluster architecture for a highly parallel multiprocessor scaler/factor computer system using a plurality of segment registers in which is first determined whether a logical address is within a start and end range as defined by the segment registers and then relocating the logical address to a physical address using a displacement value in another segment register.
摘要:
An improved high performance hardwired supercomputer data processing apparatus includes instruction means adpated to issue one and two parcel instructions. Instruction fetch means provides an instruction stream of two parcel items in sequence. Instruction decode means is responsive to each two parcel item for determining in one clock cycle whether the two parcel item is a single two parcel instruction or two one parcel instructions, for issuing each two parcel instruction for execution during the one clock cycle, and for issuing one then the other of the two one parcel instructions for execution in sequence during the one clock cycle and the next succeeding clock cycle.
摘要:
The present invention is an improved high performance scalar/vector processor. In the preferred embodiment, the scalar/vector processor is used in a multiprocessor system. The scalar/vector processor is comprised of a scalar processor for operating on scalar and logical instructions, including a plurality of independent functional units operably connected to the scalar processor, a vector processor for operating on vector instructions, including a plurality of independent functional units operably connected to the vector processor, and an instruction control mechanism for fetching both the scalar and vector instructions from an instruction cache and controlling the operation of those instructions in both the scalar and vector processor. The instruction control mechanism is designed to enhance the performance of the scalar/vector processor by keeping a multiplicity of pipelines substantially filled with a minimum number of gaps.
摘要:
The present invention is an improved high performance scalar/vector processor. In the preferred embodiment, the scalar/vector processor is used in a multiprocessor system. The scalar/vector processor is comprised of a scalar processor for operating on scalar and logical instructions, including a plurality of independent functional units operably connected to the scalar processor, a vector processor for operating on vector instructions, including a plurality of independent functional units operably connected to the vector processor, and an instruction control mechanism for fetching both the scalar and vector instructions from an instruction cache and controlling the operation of those instructions in both the scalar and vector processor. The instruction control mechanism is designed to enhance the performance of the scalar/vector processor by keeping a multiplicity of pipelines substantially filled with a minimum number of gaps.
摘要:
The present invention is an improved high performance scalar/vector processor. In the preferred embodiment, the scalar/vector processor is used in a multiprocessor system. The scalar/vector processor is comprised of a scalar processor for operating on scalar and logical instructions, including a plurality of independent functional units operably connected to the scalar processor, a vector processor for operating on vector instructions, including a plurality of independent functional units operably connected to the vector processor, and an instruction control mechanism for fetching both the scalar and vector instructions from an instruction cache and controlling the operation of those instructions in both the scalar and vector processor. The instruction control mechanism is designed to enhance the performance of the scalar/vector processor by keeping a multiplicity of pipelines substantially filled with a minimum number of gaps.
摘要:
A delayed branch mechanism maintains the flow of an instruction pipeline in a scalar/vector processor having an instruction cache and including instruction fetch means, a program counter, and instruction decode/issue means coupled to the instruction cache by means of the instruction pipeline. Conditional branch instructions are rated as likely conditional branch instructions or unlikely conditional branch instructions based on a probability that their branch conditions will be met. A number of pipeline clock periods required for testing the branch conditions are determined. The likely conditional branch instructions are issued and executed including transferring a branch-to-address to the program counter during the number of pipeline clock periods irrespective of a successful meeting of the branch conditions. A number of useful instructions sufficient to issue within the number of pipeline clock periods are placed into the instruction stream following the likely conditional branch instructions. A conditional branch instruction is canceled and returned to an instruction which would have followed the conditional branch instruction if the branch is not taken. No gap occurs in the instruction stream if the corresponding branch is successfully taken.
摘要:
A sequence of conditional vector IF statements is processed by employing a mask register and a condition register. Each conditional vector IF statement is typically performed on two vector registers, each having vector elements. A first conditional vector IF statement in the sequence is processed for those vector elements corresponding to set bits in the mask register. Bits are set in the condition register to reflect those vector elements which correspond to the set bits in the mask register for which the conditional vector IF statement is satisfied. The contents of the condition register are then moved into the mask register. A next conditional vector IF statement in the sequence is then processed for those vector elements corresponding to the new set bits in the mask register. Bits are then set in the condition register to reflect those vector elements which correspond to the new set bits in the mask register for which the conditional vector IF statement is satisfied.
摘要:
A scalar/vector processor capable of concurrent scaler and vector operations includes scalar resources to process scalar instructions, and vector resources adapted to be operated concurrently with the scalar resources and with one another to process vector instructions. The scalar resources include scalar registers, and the vector resources include vector registers. Decoding means decodes each of a number of address fields. Each field represents a register address to access alternatively one of the scalar registers or one of the vector registers depending on a value of the register address being above or below a selected moveable address value within a range of addresses encompassed by the address field.