摘要:
A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five dimensional torus network that optimally maximize the throughput of packet communications between nodes and minimize latency. The network implements collective network and a global asynchronous network that provides global barrier and notification functions. Integrated in the node design include a list-based prefetcher. The memory system implements transaction memory, thread level speculation, and multiversioning cache that improves soft error rate at the same time and supports DMA functionality allowing for parallel processing message-passing.
摘要:
Methods, apparatuses, and computer program products for servicing a globally broadcast interrupt signal in a multi-threaded computer comprising a plurality of processor threads. Embodiments include an interrupt controller indicating in a plurality of local interrupt status locations that a globally broadcast interrupt signal has been received by the interrupt controller. Embodiments also include a thread determining that a local interrupt status location corresponding to the thread indicates that the globally broadcast interrupt signal has been received by the interrupt controller. Embodiments also include the thread processing one or more entries in a global interrupt status bit queue based on whether global interrupt status bits associated with the globally broadcast interrupt signal are locked. Each entry in the global interrupt status bit queue corresponds to a queued global interrupt.
摘要:
Methods, systems and computer program products are disclosed for measuring a performance of a program running on a processing unit of a processing system. In one embodiment, the method comprises informing a logic unit of each instruction in the program that is executed by the processing unit, assigning a weight to each instruction, assigning the instructions to a plurality of groups, and analyzing the plurality of groups to measure one or more metrics. In one embodiment, each instruction includes an operating code portion, and the assigning includes assigning the instructions to the groups based on the operating code portions of the instructions. In an embodiment, each type of instruction is assigned to a respective one of the plurality of groups. These groups may be combined into a plurality of sets of the groups.
摘要:
Methods, systems and computer program products are disclosed for measuring a performance of a program running on a processing unit of a processing system. In one embodiment, the method comprises informing a logic unit of each instruction in the program that is executed by the processing unit, assigning a weight to each instruction, assigning the instructions to a plurality of groups, and analyzing the plurality of groups to measure one or more metrics. In one embodiment, each instruction includes an operating code portion, and the assigning includes assigning the instructions to the groups based on the operating code portions of the instructions. In an embodiment, each type of instruction is assigned to a respective one of the plurality of groups. These groups may be combined into a plurality of sets of the groups.
摘要:
Methods, systems and computer program products are disclosed for measuring a performance of a program running on a processing unit of a processing system. In one embodiment, the method comprises informing a logic unit of each instruction in the program that is executed by the processing unit, assigning a weight to each instruction, assigning the instructions to a plurality of groups, and analyzing the plurality of groups to measure one or more metrics. In one embodiment, each instruction includes an operating code portion, and the assigning includes assigning the instructions to the groups based on the operating code portions of the instructions. In an embodiment, each type of instruction is assigned to a respective one of the plurality of groups. These groups may be combined into a plurality of sets of the groups.
摘要:
Methods, systems and computer program products are disclosed for measuring a performance of a program running on a processing unit of a processing system. In one embodiment, the method comprises informing a logic unit of each instruction in the program that is executed by the processing unit, assigning a weight to each instruction, assigning the instructions to a plurality of groups, and analyzing the plurality of groups to measure one or more metrics. In one embodiment, each instruction includes an operating code portion, and the assigning includes assigning the instructions to the groups based on the operating code portions of the instructions. In an embodiment, each type of instruction is assigned to a respective one of the plurality of groups. These groups may be combined into a plurality of sets of the groups.
摘要:
Methods, apparatuses, and computer program products for servicing a globally broadcast interrupt signal in a multi-threaded computer comprising a plurality of processor threads. Embodiments include an interrupt controller indicating in a plurality of local interrupt status locations that a globally broadcast interrupt signal has been received by the interrupt controller. Embodiments also include a thread determining that a local interrupt status location corresponding to the thread indicates that the globally broadcast interrupt signal has been received by the interrupt controller. Embodiments also include the thread processing one or more entries in a global interrupt status bit queue based on whether global interrupt status bits associated with the globally broadcast interrupt signal are locked. Each entry in the global interrupt status bit queue corresponds to a queued global interrupt.
摘要:
Methods, systems and computer program products are disclosed for measuring a performance of a program running on a processing unit of a processing system. In one embodiment, the method comprises informing a logic unit of each instruction in the program that is executed by the processing unit, assigning a weight to each instruction, assigning the instructions to a plurality of groups, and analyzing the plurality of groups to measure one or more metrics. In one embodiment, each instruction includes an operating code portion, and the assigning includes assigning the instructions to the groups based on the operating code portions of the instructions. In an embodiment, each type of instruction is assigned to a respective one of the plurality of groups. These groups may be combined into a plurality of sets of the groups.
摘要:
Methods, systems and computer program products are disclosed for measuring a performance of a program running on a processing unit of a processing system. In one embodiment, the method comprises informing a logic unit of each instruction in the program that is executed by the processing unit, assigning a weight to each instruction, assigning the instructions to a plurality of groups, and analyzing the plurality of groups to measure one or more metrics. In one embodiment, each instruction includes an operating code portion, and the assigning includes assigning the instructions to the groups based on the operating code portions of the instructions. In an embodiment, each type of instruction is assigned to a respective one of the plurality of groups. These groups may be combined into a plurality of sets of the groups.
摘要:
Methods, apparatuses, and computer program products for servicing a globally broadcast interrupt signal in a multi-threaded computer comprising a plurality of processor threads. Embodiments include an interrupt controller indicating in a plurality of local interrupt status locations that a globally broadcast interrupt signal has been received by the interrupt controller. Embodiments also include a thread determining that a local interrupt status location corresponding to the thread indicates that the globally broadcast interrupt signal has been received by the interrupt controller. Embodiments also include the thread processing one or more entries in a global interrupt status bit queue based on whether global interrupt status bits associated with the globally broadcast interrupt signal are locked. Each entry in the global interrupt status bit queue corresponds to a queued global interrupt.