Abstract:
In accordance with the present invention, the system architecture and programming follow a bulk-synchronous parallel processing model. Data is distributed to memory elements through a hashing function performed in individual hardware modules associated with the computational elements. The router operates independently of the computational and memory elements and masks any substantial latency it may have by pipelining. A synchronizer provides for bulk synchronization in supersteps, each comprising multiple computational steps. The router bandwidth is balanced with that of the computational elements, and the program may be compiled to a number of virtual processors significantly greater than the number of actual processors in the system.
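As a rough illustration only (not part of the patent text), the following Python sketch suggests how hashing-based data distribution and superstep-style bulk synchronization might be simulated. All names and parameters (NUM_MODULES, SLACK, home_module, run_superstep) are hypothetical choices for the sketch, not terms from the abstract.

```python
import hashlib

NUM_MODULES = 8     # physical processor/memory modules (illustrative)
SLACK = 4           # virtual processors per physical module (parallel slackness)

def home_module(address: int) -> int:
    """Hash a global address to the memory module that holds it, spreading
    addresses evenly so no single module becomes a communication hot spot."""
    digest = hashlib.sha256(address.to_bytes(8, "little")).digest()
    return int.from_bytes(digest[:4], "little") % NUM_MODULES

def run_superstep(outgoing):
    """One superstep: each virtual processor computes and emits messages,
    the router delivers them all (latency hidden by pipelining in hardware),
    and a barrier synchronization ends the step before the next one begins."""
    inboxes = [[] for _ in range(NUM_MODULES)]
    for address, payload in outgoing:
        inboxes[home_module(address)].append(payload)
    return inboxes  # visible to every module only after the barrier

if __name__ == "__main__":
    # NUM_MODULES * SLACK virtual processors each issue one remote write
    outgoing = [(vp * 17 + 3, f"word-from-vp{vp}") for vp in range(NUM_MODULES * SLACK)]
    for module, inbox in enumerate(run_superstep(outgoing)):
        print(f"module {module}: {len(inbox)} messages delivered")
```

In this sketch the hash plays the role of the per-module hashing hardware, and compiling to more virtual processors than physical modules (SLACK greater than one) is what lets the routing traffic be spread evenly within each superstep.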
Abstract:
A multiprogrammed multiprocessor system comprises a plurality of processors and communications resources, such as networks, through which the processors communicate with one another. A plurality of tasks may be executed on the system, and the allocation of the communications resources among the tasks is globally controlled. The allocation of resources among the tasks running on the system can depend on the signatures of the tasks, where one component of a task signature is a measure of the communications resources the task requires. The scheduling of a task running on the system may also depend on the signature of that task. The allocation of communications resources can be globally controlled using a variety of techniques, including packet injection into the communications resources governed by periodic strobing or by global flow control, global implicit acknowledgments, destination scheduling, pacing, or prioritized communication scheduling. Error-recovery overheads can be amortized over a plurality of jobs running at one node. A user interface allows a plurality of service-level options to be specified by a user, where the system can guarantee that the specified service levels are achieved. Application users as well as system administrators can choose whichever options are appropriate. The user interface can allow the system administrator to run a scheduling mechanism that distributes communications resources among the tasks according to a market mechanism. The user interface can also allow a task to be guaranteed a fixed fraction of the resources, independent of the other tasks then running, or to be run as an interactive continuous job at one of a plurality of service levels. Finally, the user interface allows a system administrator to subdivide system resources into reserved and unreserved components, where the unreserved component is made available according to a market mechanism.
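The following Python sketch (again illustrative, not taken from the patent) shows one way signature-driven, globally controlled allocation might look: a per-strobe-interval packet-injection budget is divided among tasks, with reserved tasks guaranteed their demand and the unreserved remainder shared in proportion to the communication component of each task's signature. Names such as TaskSignature, comm_demand, and allocate_strobe_quotas are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class TaskSignature:
    """One component of a task signature: the communications resources the
    task needs, expressed here as packets per strobe interval (illustrative)."""
    name: str
    comm_demand: float       # requested packets per strobe interval
    reserved: bool = False   # True if guaranteed a fixed fraction of resources

def allocate_strobe_quotas(tasks, network_capacity):
    """Globally divide the per-interval packet-injection budget among tasks.

    Reserved tasks receive their full demand (a guaranteed fixed fraction);
    the remaining, unreserved capacity is shared among the other tasks in
    proportion to the communication component of their signatures.
    """
    reserved = [t for t in tasks if t.reserved]
    others = [t for t in tasks if not t.reserved]
    quotas = {t.name: min(t.comm_demand, network_capacity) for t in reserved}
    remaining = max(network_capacity - sum(quotas.values()), 0.0)
    total_demand = sum(t.comm_demand for t in others) or 1.0
    for t in others:
        quotas[t.name] = remaining * t.comm_demand / total_demand
    return quotas

if __name__ == "__main__":
    tasks = [TaskSignature("interactive-job", 200.0, reserved=True),
             TaskSignature("batch-A", 500.0),
             TaskSignature("batch-B", 300.0)]
    print(allocate_strobe_quotas(tasks, network_capacity=1000.0))
```

A proportional split is only one of the allocation policies the abstract enumerates; a market mechanism or prioritized scheduling could replace the proportional step while keeping the same per-interval, globally computed quota structure.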
Abstract:
Requests are routed between components in a parallel computing system using multiple-phase combining. In the first phase, the original requests are decomposed into groups of requests that share the same destination address, and the requests in each group are combined at an intermediate component into a single request. In subsequent phases, the combined requests are themselves grouped and combined in intermediate components. In the final phase, the combined requests are processed by the component containing the destination address. The addresses of the intermediate components are determined in part by hashing on the destination address and in part by a distributing function. The hashed portion of the intermediate component address tends to converge the combined requests toward the destination component during each phase. The distributing portion of the intermediate component address tends to distribute the workload evenly among the components.
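As a hypothetical illustration of this addressing scheme (not the patent's implementation), the Python sketch below forms an intermediate component address from a hash of the destination plus a distributing term, then combines all requests that share that intermediate component and destination into one request per phase. The particular distributing function and the use of summation as the combining operation are assumptions for the sketch.

```python
import hashlib
from collections import defaultdict

NUM_COMPONENTS = 16   # components in the parallel system (illustrative)

def hashed_part(destination: int) -> int:
    """Converging term: derived only from the destination address, so every
    request aimed at that destination contributes to it identically."""
    digest = hashlib.sha256(destination.to_bytes(8, "little")).digest()
    return int.from_bytes(digest[:4], "little")

def intermediate_component(destination: int, source: int, phase: int) -> int:
    """Intermediate address: a hash of the destination (converging portion)
    plus a distributing term that spreads early-phase load across components.
    The per-phase shift amount is an illustrative choice."""
    distributing = source >> (4 * phase)
    return (hashed_part(destination) + distributing) % NUM_COMPONENTS

def combine_phase(requests, phase):
    """Group requests sharing (intermediate component, destination) and emit
    one combined request per group, e.g. the sum of fetch-and-add operands."""
    groups = defaultdict(list)
    for dest, src, value in requests:
        groups[(intermediate_component(dest, src, phase), dest)].append(value)
    return [(dest, comp, sum(values)) for (comp, dest), values in groups.items()]

if __name__ == "__main__":
    # 64 requests, all targeting destination address 7, combined over two phases
    requests = [(7, src, 1) for src in range(64)]
    for phase in range(2):
        requests = combine_phase(requests, phase)
        print(f"after phase {phase}: {len(requests)} combined requests")
```

Running the demo, the 64 original requests collapse to 16 combined requests after the first phase and to a single request after the second, showing how the hashed portion draws traffic toward the destination while the distributing term keeps any one intermediate component from handling the entire fan-in.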