Abstract:
Crossbar switches having 2^n + 1 ports and computing clusters are arranged so that each crossbar switch is connected to 2^n processors. Auxiliary processors that perform parallel processing administrative functions and input/output functions are arranged at the remaining ports of the crossbar switches. Exchangers are provided to connect each processor to its crossbar switches. For speed, parallel processing may be executed by the 2^n processors independently of processing by the auxiliary processors. One mounting unit is formed of a crossbar switch of one dimension, the processor group connected to that crossbar switch, and all of the crossbar switches of a different dimension that are connected to one of the processors of that processor group. The parallel processor system is mounted simply by combining mounting units, with no need for special LSIs, frames, or the like on which to mount the crossbar switches, and without the interfaces that connect the processors and the network becoming concentrated in one place.
Abstract:
In order to determine a transfer path of a message to a receiving-end processor group, a processor includes a routing bit generation circuit, and an exchange switch includes partial broadcast path control circuits and a path control information alteration circuit. In order to define the range of a receiving-end processor group, a network includes transfer control circuits. A crossbar switch includes transfer control circuits associated with output ports and a boundary register group. When a partial broadcast message is transferred from an input port in the downstream direction of an output port, it is decided whether a processor belonging to the partial broadcast range associated with a processor connected to the particular input port is connected to the particular output port, whereby the particular partial broadcast message is transferred from the same output port.
Abstract:
In a parallel computer, in order to reduce the overhead of data transmissions between processes, a data transmission from the virtual space of a process in one cluster to the virtual space of a process in another cluster is executed without copying the data to a buffer provided within the operating system. A real communication area resident in real memory is provided in a part of the virtual space of the process, and an identifier unique within the cluster is given to the communication area. When the transmitting process issues a transmission instruction, the cluster address of the cluster in which the destination process exists and the identifier of the communication area are determined based on the name of the destination process. The data is then transmitted directly between the real communication areas of the originating process and the destination process. The overhead of data transmission between processes is reduced by avoiding a copy of the data between the user space and the buffer provided within the operating system.
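The name-based lookup and direct transfer described in this abstract can be sketched as follows. This is only an illustrative toy model, not the patented mechanism: the `Cluster` class, the `directory` mapping, and the `send` function are hypothetical names, and Python `bytearray` objects stand in for communication areas resident in real memory.

```python
# Toy model of direct transfer between communication areas.
# All names here are hypothetical and chosen for illustration only.

class Cluster:
    def __init__(self, address):
        self.address = address
        self.areas = {}  # area identifier (unique within this cluster) -> bytearray

    def register_area(self, area_id, size):
        """Create a communication area and return it."""
        self.areas[area_id] = bytearray(size)
        return self.areas[area_id]


def send(directory, clusters, src_area, dest_name, length):
    """Copy `length` bytes from the sender's area straight into the receiver's area."""
    # Resolve the destination process name to (cluster address, area identifier).
    cluster_addr, area_id = directory[dest_name]
    dest_area = clusters[cluster_addr].areas[area_id]
    # memoryview slices are views, so no intermediate bytes object is created.
    memoryview(dest_area)[:length] = memoryview(src_area)[:length]
    return cluster_addr, area_id
```

In this sketch the only copy is the one from the source area into the destination area; there is no staging buffer standing in for an operating-system buffer.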
Abstract:
In a method of transferring packets in a network for a parallel processor system handling a one-to-one transfer packet to be transferred from a processor to another processor and a broadcast packet to be transferred from a processor to a plurality of other processors, a transfer request of a broadcast packet is preferentially selected and a check is made to detect whether or not the plurality of processors specified as receivers are in a state in which the packet can be received. The broadcast packet is transferred to the processors found to be in the state in which the packet can be received. The packet transfer is delayed for the other processors, which are in a state in which the packet cannot be received. That is, the broadcast packet is transferred to those processors only once their state changes to one in which the packet can be received.
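The selection and deferred-delivery behavior described in this abstract might be modeled roughly as below. This is a minimal sketch under stated assumptions: the `BroadcastSwitch` class, its per-receiver `ready` flags, and the dictionary-based request format are all hypothetical and are not taken from the patent.

```python
from collections import deque


class BroadcastSwitch:
    """Toy model: broadcast requests win selection, and delivery to a
    receiver that cannot accept a packet is deferred until it can."""

    def __init__(self, num_processors):
        self.ready = [True] * num_processors           # can the receiver accept a packet?
        self.delivered = [[] for _ in range(num_processors)]
        self.pending = [deque() for _ in range(num_processors)]  # deferred broadcasts

    def select(self, requests):
        """Pick the next transfer request, preferring broadcast over one-to-one."""
        for req in requests:
            if req["kind"] == "broadcast":
                return req
        return requests[0] if requests else None

    def broadcast(self, payload, receivers):
        """Deliver to ready receivers now; hold the packet for the others."""
        for r in receivers:
            if self.ready[r]:
                self.delivered[r].append(payload)
            else:
                self.pending[r].append(payload)

    def set_ready(self, r, ready):
        """Change a receiver's state; flush deferred packets when it becomes ready."""
        self.ready[r] = ready
        if ready:
            while self.pending[r]:
                self.delivered[r].append(self.pending[r].popleft())
```

A receiver marked not ready before `broadcast` is called receives nothing at first, and gets the held packet as soon as `set_ready(r, True)` is called.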
Abstract:
In a parallel processor system comprising a plurality of processor elements constituting a network, a source processor element wishing to broadcast data to a plurality of destination processor elements sends a broadcast request message containing the target data to a broadcast exchanger. The broadcast exchanger converts the received message into a broadcast message and sends it over the network to the destinations. A plurality of broadcast request messages, if transmitted in parallel to the broadcast exchanger, are serialized by it so that only one broadcast message is transmitted over the network at a time. This prevents deadlock from occurring between different broadcast messages. The routes for transmitting broadcast request messages and those for transmitting broadcast messages are arranged so as not to overlap with one another. This suppresses deadlock between any broadcast request message and any broadcast message. Alternatively, the broadcast exchanger may be replaced by one of the partial networks. These schemes all apply where long messages are transmitted through wormhole routing.
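The serialization step performed by the broadcast exchanger can be illustrated with a small queue-based model. This is a sketch only: the `BroadcastExchanger` class, its FIFO ordering, and the `network_log` list are assumptions made for illustration, and the model ignores routing and wormhole flow control entirely.

```python
from collections import deque


class BroadcastExchanger:
    """Toy model: requests may arrive in any interleaving, but each one is
    turned into a broadcast and fully sent before the next one starts."""

    def __init__(self):
        self.requests = deque()  # broadcast request messages, in arrival order
        self.network_log = []    # (data, destination) pairs, in transmission order

    def receive_request(self, data, destinations):
        """Accept a broadcast request message from a source processor element."""
        self.requests.append((data, tuple(destinations)))

    def run(self):
        """Convert queued requests into broadcasts, one broadcast at a time."""
        while self.requests:
            data, destinations = self.requests.popleft()
            for dest in destinations:
                self.network_log.append((data, dest))
```

Because `run` finishes one broadcast before dequeuing the next request, entries belonging to different broadcasts never interleave in `network_log`.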
Abstract:
Parallel processors communicate with each other over a network by transmitting messages that include destination processor information. A message controller for each processor in the network receives the messages and checks for faults in the message, particularly in the destination processor number contained in a first word of the message. If a fault occurs in the destination processor number, then the faulty message is transmitted to an appropriate processor for handling the fault. In this way the network operation is not suspended because of the fault and the message is not left in the network as a result of the error occurring in the destination processor number. The processor to which the faulty message is directed is determined by a substitute destination processor number contained in the message or is predetermined and set in another way, such as by a service processor. To recover from the fault, the processor receiving the faulty message can request that the message be retransmitted or the error can be corrected using an ECC, for example. If the faulty message cannot be retransmitted, then the processor or the host processor can request that the job to which the faulty message pertains be canceled by all of the processors executing that job without affecting the simultaneous execution of other jobs by the same processors.
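The redirection of a faulty message can be sketched as follows. The abstract does not say how the fault in the destination processor number is detected, so this example assumes a single parity bit purely for illustration; the message fields `dest`, `dest_parity`, and `substitute`, and the `fallback_processor` argument, are likewise hypothetical.

```python
def parity(value):
    """Even-parity bit of a non-negative integer (assumed check, for illustration)."""
    return bin(value).count("1") % 2


def route(message, fallback_processor):
    """Return (target processor, is_faulty) for a message.

    A message whose destination number fails the check is not dropped or
    left in the network; it is sent to the substitute destination carried
    in the message, or to a preset fallback processor if none is given.
    """
    dest = message["dest"]
    if parity(dest) == message["dest_parity"]:
        return dest, False
    substitute = message.get("substitute")
    target = substitute if substitute is not None else fallback_processor
    return target, True
```

The processor that receives a message flagged as faulty would then decide how to recover, for example by requesting retransmission as the abstract describes.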
Abstract:
A computer system including a plurality of processing nodes, at least one resource provided for use by any of the processing nodes and a plurality of register sets. Each register set is provided in each processing node for storing in parallel use status information indicating whether the resource is in exclusive use status. The computer system includes a plurality of request issue circuits, each being provided in each processing node, for issuing requests for exclusive use of the resource, a message exchanging circuit for serializing requests issued by the request issue circuits into a serialized order and broadcasting the request to the processing nodes and a plurality of status control circuits. Each status control circuit is provided in each processing node to update a corresponding register set depending on use status information and each request received at a corresponding node.
Abstract:
In order to reduce load at a resource managing node for exclusive control of a shared resource, each node has a group of lock state registers each corresponding to one of the nodes. Before one node issues a lock request to a resource managing node, the node checks the register group to see if the resource managing node is unlocked. With the target node found to be accessible, the access requesting node sends to a broadcast message exchange circuit a broadcast request message including a lock request regarding the resource managing node. The broadcast message exchange circuit receives such broadcast request messages from access requesting nodes, and changes them serially into broadcast messages for broadcast to all nodes. Of these broadcast messages, the first message received by each node is processed by its lock control circuit so that the lock request in that message is allowed to lock the resource managing node. The lock control circuit writes the number of the access requesting node into the register corresponding to the resource managing node. The access requesting node checks the register contents to see if the lock request it issued has been successfully accepted.
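The lock protocol in this abstract relies on every node seeing the same serialized order of lock requests. A minimal sketch of that idea follows; the `Node` and `BroadcastExchange` classes and their method names are hypothetical, unlocking is not modeled because the abstract does not describe it, and `None` is assumed to mean "unlocked".

```python
from collections import deque


class Node:
    def __init__(self, node_id, num_nodes):
        self.node_id = node_id
        # One lock state register per node: None means unlocked,
        # otherwise the number of the node holding the lock.
        self.lock_regs = [None] * num_nodes

    def can_request(self, target):
        """Local check made before issuing a lock request."""
        return self.lock_regs[target] is None

    def on_broadcast(self, requester, target):
        """Lock control: only the first request seen for an unlocked target wins."""
        if self.lock_regs[target] is None:
            self.lock_regs[target] = requester

    def holds(self, target):
        """Check the local register to see whether this node's request succeeded."""
        return self.lock_regs[target] == self.node_id


class BroadcastExchange:
    def __init__(self, nodes):
        self.nodes = nodes
        self.queue = deque()

    def submit(self, requester, target):
        self.queue.append((requester, target))

    def run(self):
        """Broadcast requests one at a time, in the same order, to all nodes."""
        while self.queue:
            requester, target = self.queue.popleft()
            for node in self.nodes:
                node.on_broadcast(requester, target)
```

Since all nodes process the same sequence, their register groups stay identical, and each requester can learn the outcome from its own registers without querying the resource managing node.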
Abstract:
A computer system including a plurality of processing nodes, at least one resource provided for use by any of the processing nodes and a plurality of register sets. Each register set is provided in each of the processing nodes for storing in parallel use status information indicating whether the resource is in exclusive use status or not. The computer system can also include a plurality of request issue circuits, each being provided in each of the processing nodes, for issuing individually requests for exclusive use of the resource, a message exchanging circuit for serializing requests issued by the request issue circuits into a serialized order and broadcasting the request to all of the processing nodes in the serialized order and a plurality of status control circuits. Each status control circuit is provided in each of the processing nodes corresponding to each of the register sets to update individually a corresponding register set depending on use status information stored in the corresponding register set and each of the requests for exclusive use of the resource received at a corresponding node.
Abstract:
To reduce the interrupt overhead on a processor associated with packet send and receive control in a network, a packet send command chaining unit is provided. Based on the control field in each packet send command, a send node controls an interrupt request to the processor at the packet level and sends a packet carrying the control information to a receive node. Based on the control field in the received data packet, the receive node controls a receive circuit interrupt request, thereby reducing the number of interrupts raised on the instruction processor for packet send and receive operations.
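The effect of per-packet interrupt control in a chain of send commands can be shown with a short model. This is a sketch under assumptions: the `SendCommand` class and its two boolean flags are hypothetical stand-ins for the control field, whose actual layout the abstract does not give.

```python
class SendCommand:
    """One entry in a chain of packet send commands (hypothetical layout)."""

    def __init__(self, payload, send_interrupt, receive_interrupt):
        self.payload = payload
        self.send_interrupt = send_interrupt        # interrupt the sender after this packet?
        self.receive_interrupt = receive_interrupt  # interrupt the receiver on arrival?


def run_chain(commands):
    """Send every command in the chain; return the packets and the sender's interrupt count."""
    packets, send_interrupts = [], 0
    for cmd in commands:
        # The packet carries the receive-side control information with it.
        packets.append((cmd.payload, cmd.receive_interrupt))
        if cmd.send_interrupt:
            send_interrupts += 1
    return packets, send_interrupts


def receive(packets):
    """Count the interrupts raised at the receive node for a sequence of packets."""
    return sum(1 for _, flag in packets if flag)
```

With a three-command chain in which only the last command sets both flags, each side takes one interrupt instead of three.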