Abstract:
A shared memory parallel processing system interconnected by a multi-stage network (20) combines new system configuration techniques with special-purpose hardware to provide remote memory accesses across the network, while controlling cache coherency efficiently across the network. The system configuration techniques include a systematic method for partitioning and controlling the memory (54) in relation to local versus remote accesses and changeable versus unchangeable data. Most of the special-purpose hardware is implemented in the memory controller (210) and network adapter (10), which implements three send FIFOs (40, 41 and 42) and three receive FIFOs (44, 45 and 46) at each node (34) to segregate and handle efficiently invalidate functions, remote stores, and remote accesses requiring cache coherency. The segregation of these three functions into different send and receive FIFOs greatly facilitates the cache coherency function over the network. In addition, the network itself is tailored to provide the best efficiency for remote accesses.
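The FIFO segregation described in this abstract can be pictured with a minimal sketch. It assumes one send queue and one receive queue per traffic class; the names `MsgKind` and `NetworkAdapter` and their methods are hypothetical, and the abstract does not say which numbered FIFO (40 to 42, 44 to 46) carries which class, so no such mapping is modeled here.

```python
from collections import deque
from enum import Enum


class MsgKind(Enum):
    """The three traffic classes named in the abstract (labels are illustrative)."""
    INVALIDATE = "invalidate"
    REMOTE_STORE = "remote_store"
    COHERENT_ACCESS = "coherent_access"


class NetworkAdapter:
    """Hypothetical adapter model: one send and one receive FIFO per class."""

    def __init__(self):
        self.send_fifos = {kind: deque() for kind in MsgKind}
        self.recv_fifos = {kind: deque() for kind in MsgKind}

    def enqueue_send(self, kind, payload):
        # Each class queues independently, so traffic of one class
        # never waits behind traffic of another class.
        self.send_fifos[kind].append(payload)

    def deliver(self, kind, payload):
        # Called when a message of the given class arrives from the network.
        self.recv_fifos[kind].append(payload)

    def next_received(self, kind):
        # Pop the oldest received message of this class, or None if empty.
        queue = self.recv_fifos[kind]
        return queue.popleft() if queue else None
```

Keeping the classes in separate queues means that, in this model, an invalidate is never held up behind a backlog of remote stores, which is one way to read the claim that segregation facilitates coherency.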
Abstract:
Disclosed is a multi-media switching apparatus for performing digital, analog, and/or optical communications among multiple nodes over switching networks. The key aspect of the present invention is the fully parallel nature of the switching apparatus, which supports n simultaneous, low-latency connections, where n is the number of functional elements interconnected by the switching network. Any of the n simultaneous transmissions can be digital, analog, or optical in any proportion. In addition, the present invention can also serve as a high-speed distributed controller for the purpose of selecting analog or optical switches for information transfer between elements of the system.
Abstract:
A pipelined instruction execution system includes a microstore (20a) for storing sequences of microinstruction addresses associated with each macroinstruction, a nanostore (20b) for randomly storing unique microinstructions, and an execution unit (30) for executing the microinstructions. The system is provided with a no-op/prefetch apparatus (50) which, when the execution unit executes a conditional microbranch instruction, prevents a microinstruction address stored in the microstore from accessing the nanostore and forces a no-op address into the nanostore. During the execution of the no-op microinstruction in the execution unit, the no-op/prefetch apparatus permits either the next sequential microinstruction address following the conditional microbranch instruction or another, non-sequential microinstruction address to access the nanostore; which of the two is selected depends on the outcome of the execution of the conditional microbranch instruction by the execution unit.
Abstract:
Disclosed is a conversion apparatus that converts and adapts a standard processor bus protocol and architecture, such as the MicroChannel (IBM trademark) bus, to a more progressive switch interconnection protocol and architecture. The invention extends the existing bus-based architecture to perform parallel and clustering functions by enabling the interconnection of thousands of processors. A conversion apparatus is disclosed for controlling the transfer of data messages from one nodal element across a switch network to another nodal element by using direct memory access capabilities controlled by intelligent bus masters. This approach does not require interactive support from the processor at either nodal element during the message transmission, and frees up both processors to perform other tasks. In addition, the communication medium is switch-based and is fully parallel, supporting n transmissions simultaneously, where n is the number of nodes interconnected by the switching network.
Abstract:
Disclosed is a modularly expandable switch-based planar apparatus for inserting multiple bus-based processor cards and/or expansion cards and interconnecting said cards via a multi-stage switch network which resides on the invention planar. The switching network is built into the planar. The cards require no modification or change of any kind, since the connection to the planar is made as if the planar contained the standard MicroChannel interconnection. However, the disclosed planar implements bus converter units to convert the standard bus interface provided by the cards to the switch network interface, so that functions provided by the cards can communicate in parallel over the switch network.
Abstract:
A method and hardware apparatus provide a fault tolerant and flexible multi-stage network addressing scheme for transmitting a message with a header containing control bits for selecting from various destination checking functions to be performed. Upon arrival of the message at a node, destination checking is performed or not in response to the message's header. If destination checking is not performed, or if destination checking is performed and indicates that the node is the desired destination for the message, the message is accepted. If destination checking is performed and indicates that the node is not the desired destination for the message, the message is rejected. Destination checking is disabled during address assignment, broadcasting and multi-casting, and replaced with one's complement-based verification of the sending node.
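The accept/reject rule in this abstract reduces to a small decision function. The sketch below is illustrative only: the `Header` fields and the function name are hypothetical, a single control bit stands in for the "various destination checking functions", and the one's complement-based verification of the sending node is not modeled because the abstract gives no detail on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Header:
    """Hypothetical message header; field names are assumptions."""
    check_destination: bool  # control bit: perform destination checking?
    destination: int         # ID of the node the sender intended to reach


def accept_message(header: Header, node_id: int) -> bool:
    """Return True if the receiving node accepts the message, False if it rejects it."""
    if not header.check_destination:
        # Checking is disabled, e.g. during address assignment,
        # broadcasting, or multi-casting: the message is accepted.
        return True
    # Checking is enabled: accept only if this node is the desired destination.
    return header.destination == node_id
```

For example, `accept_message(Header(True, 7), node_id=3)` rejects the message, while the same header with `check_destination=False` is accepted by every node it reaches.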
Abstract:
Disclosed is a new torus switch with low latency performance. The present invention improves the torus network connection time by providing the capability to try multiple paths in a single high-speed operation. This multipath approach can be directed at establishing a connection between two specific nodes over various alternate routes simultaneously. The invention is such that if only one route is available, the multipath approach will find that path instantaneously and establish the desired connection with minimal latency. If several links are available, the multipath method establishes the desired connection over only one of the available links and leaves the other options free to be used by other connections. In addition, routing at intermediate torus network stages will be a vast improvement over the wormhole approach.
Abstract:
Disclosed is an implementation of a high priority path that is in addition to the normal low priority path through a multi-stage switching network. The high priority path is established at the quickest possible speed because the high priority command is stored at the switch stage involved and made on a priority basis as soon as the required output port becomes available. In addition, positive feedback is given to the node establishing the connection immediately upon the making of the connection so that it may proceed at the earliest possible moment. The high priority path is capable of processing multiple high priority pending requests, and resolving the high priority contention using a snapshot register which implements a rotating priority such that no one requesting device can ever be locked out or experience data starvation. A dual priority switching apparatus with input port connections to output port connections uses an asynchronous means to resolve contention under low priority and the absence of blockage conditions, and switches automatically to a priority driven synchronous means of resolving contention under the presence of blockage and high priority conditions. The disclosed improvement to the ALL-NODE (Asynchronous, Low Latency inter-NODE) Switch permits contention to be detected and resolved on chip in either a low or high priority mode, and yet the logic implementation is extremely simple and low in gate count, so the switch design is never gate limited. The protocol requires several parallel data lines plus four control lines so that the switching apparatus can be used for networks having a plurality of nodes, each node having a plurality of input and output ports, with a multiplexer control circuit for each output port for connecting any of I inputs to any of Z outputs, where I and Z can assume any unique value greater than or equal to two, and a different priority level is assigned to a function.
The switch has a single physical network path element over which either a low priority or high priority path can be established.
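One plausible reading of the snapshot-register arbitration is sketched below. The abstract does not give the mechanism in detail, so this is an assumption: the arbiter captures the set of pending requests, serves every request in that captured set, and only then captures a new set, so a late or repeated requester cannot overtake one that is already waiting. The class name `SnapshotArbiter` and its interface are hypothetical.

```python
class SnapshotArbiter:
    """Hypothetical starvation-free arbiter for one output port."""

    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        self.snapshot = [False] * num_inputs  # requests captured and not yet served

    def grant(self, requests):
        """Given the currently pending requests (list of bools), return the
        index of the input granted this round, or None if nothing is pending."""
        if len(requests) != self.num_inputs:
            raise ValueError("one request flag per input is required")
        # Forget captured requests that have since been withdrawn.
        self.snapshot = [s and r for s, r in zip(self.snapshot, requests)]
        # Take a new snapshot only once the previous one is fully served.
        if not any(self.snapshot):
            self.snapshot = list(requests)
        # Serve the lowest-numbered input remaining in the snapshot.
        for index, pending in enumerate(self.snapshot):
            if pending:
                self.snapshot[index] = False
                return index
        return None
```

In this model, if inputs 0 and 2 are captured together and input 0 immediately requests again after being served, input 2 is still granted next, because input 0's new request is not seen until the following snapshot.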