Abstract:
A system and method of accessing a memory location within a system having a processor and a plurality of memory locations separate from the processor. The system includes: a plurality of external registers connected to the processor over a data bus; address translation means, connected to the processor over the data bus and an address bus, for calculating, based on an index written to the data bus, an address associated with one of the memory locations; and transfer means, connected to the plurality of external registers, for transferring data between the addressed memory location and one of the external registers.
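The data path described in this abstract can be sketched as a toy model. The class name `ExternalRegisterFile`, the table-lookup translation scheme, and the dictionary standing in for the separate memory locations are all illustrative assumptions; the abstract does not say how the address is calculated from the index.

```python
class ExternalRegisterFile:
    """Toy model: an index is translated to an address, then data moves
    between that memory location and an external register."""

    def __init__(self, memory, translation_table, num_registers=4):
        self.memory = memory                  # address -> value (memory separate from the processor)
        self.table = translation_table        # index -> address (assumed lookup scheme)
        self.registers = [0] * num_registers  # external registers

    def translate(self, index):
        # Address translation: derive an address from the index
        # the processor wrote to the data bus.
        return self.table[index]

    def load(self, index, reg):
        # Transfer from the addressed memory location into an external register.
        self.registers[reg] = self.memory[self.translate(index)]

    def store(self, index, reg):
        # Transfer from an external register to the addressed memory location.
        self.memory[self.translate(index)] = self.registers[reg]


memory = {0x1000: 7, 0x2000: 0}
erf = ExternalRegisterFile(memory, translation_table=[0x1000, 0x2000])
erf.load(0, reg=1)   # register 1 receives the value at address 0x1000
erf.store(1, reg=1)  # address 0x2000 receives the value in register 1
```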
Abstract:
A messaging facility in a multiprocessor computer system includes assembly circuitry in a source processing element for assembling a message to be sent from the source processing element to a destination processing element based on information provided from a processor in the source processing element. A network router transmits the assembled message from the source processing element to the destination processing element via an interconnect network. A message queue in a local memory of the destination processing element stores the transmitted message. A control word stored in the local memory of the destination processing element includes a limit field designating a size of the message queue and a tail field designating an index into the corresponding message queue to indicate a location in the message queue where the transmitted message is to be stored. Shell circuitry in the destination processing element atomically reads and updates the tail field.
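As a rough software analogy for the control word described here, the sketch below keeps a `limit` and a `tail` and performs the read-and-update of `tail` under a lock. The class and method names are hypothetical, the lock merely stands in for the atomic hardware operation, and rejecting messages when the queue is full is an assumption not stated in the abstract.

```python
import threading


class MessageQueue:
    """Toy model of a message queue governed by a control word (limit, tail)."""

    def __init__(self, limit):
        self.limit = limit            # limit field: size of the message queue
        self.tail = 0                 # tail field: index of the next free slot
        self.slots = [None] * limit
        self._lock = threading.Lock() # stands in for the atomic hardware update

    def _fetch_and_increment_tail(self):
        # Atomically read the tail and advance it, so two arriving
        # messages can never be assigned the same slot.
        with self._lock:
            if self.tail >= self.limit:
                return None           # assumed policy: reject when the queue is full
            index = self.tail
            self.tail += 1
            return index

    def deliver(self, message):
        index = self._fetch_and_increment_tail()
        if index is None:
            return False
        self.slots[index] = message
        return True
```

Reading and updating the tail in one indivisible step is what lets several source processing elements send to the same destination without coordinating among themselves.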
Abstract:
Method and apparatus for a filtered stream buffer coupled to a memory and a processor, and operating to prefetch data from the memory. The filtered stream buffer includes a cache block storage area and a filter controller. The filter controller determines whether a pattern of references has a predetermined relationship, and if so, prefetches stream data into the cache block storage area. Such stream data prefetches are particularly useful in vector processing computers, where once the processor starts to fetch a vector, the addresses of future fetches can be predicted based on the pattern of past fetches. According to various aspects of the present invention, the filtered stream buffer further includes a history table and a validity indicator which is associated with the cache block storage area and indicates which cache blocks, if any, are valid. According to yet another aspect of the present invention, the filtered stream buffer controls random access memory (RAM) chips to stream the plurality of consecutive cache blocks from the RAM into the cache block storage area. According to yet another aspect of the present invention, the stream data includes data for a plurality of strided cache blocks, wherein each of these strided cache blocks corresponds to an address determined by adding to the first address an integer multiple of the difference between the second address and the first address. According to yet another aspect of the present invention, the processor generates three addresses of data words in the memory, and the filter controller determines whether a predetermined relationship exists among the three addresses, and if so, prefetches strided stream data into said cache block storage area.
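The strided case can be illustrated with a short sketch. It assumes the "predetermined relationship" among the three addresses is a constant, nonzero stride, which the abstract does not state explicitly; the function name and the `count` parameter are hypothetical.

```python
def strided_prefetch_addresses(a0, a1, a2, count):
    """Return `count` predicted addresses if a0, a1, a2 form a constant stride.

    Each predicted address is the first address plus an integer multiple
    of the difference between the second and first addresses.
    """
    stride = a1 - a0
    # Filter step: only prefetch when the three references share one nonzero stride.
    if stride == 0 or a2 - a1 != stride:
        return []
    # a2 equals a0 + 2 * stride, so prediction continues with multiples 3, 4, ...
    return [a0 + k * stride for k in range(3, 3 + count)]
```

For instance, references to addresses 100, 164, and 228 share a stride of 64, so the next three predicted addresses are 292, 356, and 420; an irregular pattern such as 100, 164, 200 produces no prefetch.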
Abstract:
A system and method of transferring information between a peripheral device and an MPP system having an interconnect network and a plurality of processing nodes. Each processing element includes a processor, local memory and a router circuit connected to the interconnect network, the processor and the local memory. Each router circuit includes means for transferring data between the processor and the interconnect network and means for transferring data between the local memory and the interconnect network. An I/O controller is connected to a plurality of the router circuits. Data is then read from the peripheral device and transferred through the I/O controller to local memory of one of the processing elements.
Abstract:
A multidimensional interconnection and routing apparatus for a parallel processing computer connects together processing elements in a three-dimensional structure. The interconnection and routing apparatus includes a plurality of processing element nodes. A communication connects at least one of the processing elements with a host system. An interconnection network connects together the processing element nodes in an X, Y, and Z dimension. The network includes communication paths connecting each of the plurality of processing elements to adjacent processing elements in the plus and minus directions of each of the X, Y, and Z dimensions.
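The adjacency described in this abstract can be sketched by enumerating the six neighbors of a node. The function name is hypothetical, and the wraparound at the edges (a torus) is an assumption; the abstract only says that each element connects to adjacent elements in the plus and minus directions of X, Y, and Z.

```python
def neighbors(node, dims):
    """Return the six neighbors of `node` in a 3-D network of size `dims`.

    Order: +X, -X, +Y, -Y, +Z, -Z. Edges wrap around (assumed torus).
    """
    result = []
    for axis in range(3):
        for step in (1, -1):
            coord = list(node)
            # Move one hop along this axis, wrapping at the boundary.
            coord[axis] = (coord[axis] + step) % dims[axis]
            result.append(tuple(coord))
    return result
```

In a 4 x 4 x 4 network under this assumption, node (0, 0, 0) is adjacent to (1, 0, 0), (3, 0, 0), (0, 1, 0), (0, 3, 0), (0, 0, 1), and (0, 0, 3).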
Abstract:
A multidimensional interconnection and routing apparatus for a parallel processing computer connects together processing elements in a three-dimensional structure. The interconnection and routing apparatus includes a plurality of processing element nodes. A communication connects at least one of the processing elements with a host system. An interconnection network connects together the processing element nodes in an X, Y, and Z dimension. The network includes communication paths connecting each of the plurality of processing elements to adjacent processing elements in the plus and minus directions of each of the X, Y, and Z dimensions.
Abstract:
Method and apparatus for facilitating barrier and eureka synchronization in a massively parallel processing system. The present barrier/eureka mechanism provides a partitionable, low-latency, immediately reusable, robust mechanism which can operate on a physical data-communications network and can be used to alert all processor entities (PEs) in a partition when all of the PEs in that partition have reached a designated barrier point in their individual program code, or when any one of the PEs in that partition has reached a designated eureka point in its individual program code, or when either the barrier or eureka requirements have been satisfied, whichever comes first. Multiple overlapping barrier/eureka synchronization partitions are available simultaneously through the use of a plurality of parallel barrier/eureka synchronization domains. The present barrier/eureka mechanism may be implemented on either a dedicated barrier network, or superimposed as a virtual barrier/eureka network operating on a physical data-communications network which is also used for data interchange, operating system functions, and other purposes.
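The difference between the two conditions can be shown in a small software model: a barrier is satisfied when every PE in the partition has arrived (a logical AND), while a eureka is satisfied as soon as any one PE signals (a logical OR). The class `SyncDomain` and its methods are hypothetical names, and the sketch models only the logic, not the network implementation.

```python
class SyncDomain:
    """Toy model of one barrier/eureka synchronization domain."""

    def __init__(self, partition):
        self.partition = set(partition)  # PEs that belong to this partition
        self.at_barrier = set()          # PEs that have reached the barrier point
        self.eureka = False              # set once any PE reaches its eureka point

    def reach_barrier(self, pe):
        if pe in self.partition:
            self.at_barrier.add(pe)

    def signal_eureka(self, pe):
        if pe in self.partition:
            self.eureka = True

    def barrier_done(self):
        # Barrier: all PEs in the partition have arrived (AND).
        return self.at_barrier == self.partition

    def alert(self):
        # Alert the partition when either condition holds, whichever comes first.
        return self.barrier_done() or self.eureka

    def reset(self):
        # Clear state so the domain can be reused for the next synchronization.
        self.at_barrier.clear()
        self.eureka = False
```

Several `SyncDomain` objects with overlapping partitions can exist side by side, loosely mirroring the parallel synchronization domains mentioned in the abstract.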
Abstract:
A multiple counter-rotating ring computer network system having a permission control scheme for client isolation. The peripheral channel allows two rings to be folded into one longer ring so that faulty nodes can be effectively removed from the network. Alternatively, any of the rings can be masked so that it is inoperative. The network system also allows several client isolation states ranging from complete isolation to master access. These types of isolation allow faulty client devices to be tested while maintaining a high level of network security by configuring the client to an appropriate isolation state.