Pausing and activating thread state upon pin assertion by external logic monitoring polling loop exit time condition
    22.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US08447960B2

    Publication date: 2013-05-21

    Application No.: US12684860

    Filing date: 2010-01-08

    IPC classification: G06F9/48

    CPC classification: G06F9/30079 G06F9/3851

    Abstract: A system and method for enhancing the performance of a computer, in which a computer system includes a data storage device. The computer system includes a program stored in the data storage device, and steps of the program are executed by a processor. The processor processes instructions from the program. A wait state in the processor waits to receive specified data. A thread in the processor has a pause state wherein the processor waits for specified data. A pin in the processor initiates a return to an active state from the pause state for the thread. A logic circuit is external to the processor, and the logic circuit is configured to detect a specified condition. The pin initiates a return to the active state of the thread when the specified condition is detected using the logic circuit.
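
    The wake-up path described in this abstract can be sketched in software, purely as an illustration. The sketch assumes the pin behaves like a one-shot event and the external logic watches a single value; the names WakeupPin, ExternalLogic, observe and run_demo are hypothetical and do not come from the patent.

```python
import threading


class WakeupPin:
    """Hypothetical stand-in for the processor pin (a threading.Event)."""

    def __init__(self):
        self._asserted = threading.Event()

    def assert_pin(self):
        self._asserted.set()

    def wait_for_assertion(self, timeout=None):
        # Returns True if the pin was asserted, False on timeout.
        return self._asserted.wait(timeout)


class ExternalLogic:
    """Hypothetical monitor outside the processor; asserts the pin on a match."""

    def __init__(self, pin, expected_value):
        self.pin = pin
        self.expected_value = expected_value

    def observe(self, value):
        # The specified condition here is simply "value equals expected_value".
        if value == self.expected_value:
            self.pin.assert_pin()


def run_demo():
    pin = WakeupPin()
    logic = ExternalLogic(pin, expected_value=42)
    states = []

    def thread_body():
        states.append("paused")             # thread enters the pause state
        woke = pin.wait_for_assertion(5.0)  # blocks instead of polling
        states.append("active" if woke else "timed out")

    worker = threading.Thread(target=thread_body)
    worker.start()
    logic.observe(7)    # condition not met, pin is not asserted
    logic.observe(42)   # condition met, pin is asserted
    worker.join()
    return states
```

    The point of the design is that the paused thread does not spend cycles in a polling loop; the check for the condition is moved into logic outside the processor.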

    Method and apparatus for re-utilizing partially failed resources as network resources
    24.
    Type: Patent application
    Status: Lapsed

    Publication No.: US20070168695A1

    Publication date: 2007-07-19

    Application No.: US11335784

    Filing date: 2006-01-19

    IPC classification: G06F11/00

    CPC classification: G06F11/0793 G06F11/0724

    Abstract: A method and apparatus for re-utilizing partially failed compute resources in a massively parallel supercomputer system. In the preferred embodiments, the compute node comprises a number of clock domains that can be enabled separately. When an error in a compute node is detected, and the failure is not in the network communication blocks, a clock enable circuit enables only the clocks to the network communication blocks, allowing the partially failed compute node to be re-utilized as a network resource. The computer system can then continue to operate with only slightly diminished performance and thereby improve performance and perceived overall reliability.
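
    The clock-enable decision in this abstract can be written as a small function. This is a minimal sketch under stated assumptions: the clock-domain names are hypothetical, and the handling of a failure inside a network block (disable everything) is an assumption, since the abstract only covers the case where the network blocks are healthy.

```python
# Hypothetical clock-domain names; the abstract does not list them.
NETWORK_DOMAINS = frozenset({"network_send", "network_receive"})
COMPUTE_DOMAINS = frozenset({"cpu", "memory"})
ALL_DOMAINS = NETWORK_DOMAINS | COMPUTE_DOMAINS


def clock_enables(failed_domains):
    """Return a {domain: enabled} map for one compute node."""
    failed = set(failed_domains)
    unknown = failed - ALL_DOMAINS
    if unknown:
        raise ValueError(f"unknown clock domains: {sorted(unknown)}")
    if not failed:
        # Healthy node: every clock domain runs.
        return {domain: True for domain in ALL_DOMAINS}
    if failed & NETWORK_DOMAINS:
        # Assumption (not in the abstract): a network failure leaves
        # nothing to reuse, so the whole node is switched off.
        return {domain: False for domain in ALL_DOMAINS}
    # Failure outside the network blocks: clock only the network domains
    # so the node can still forward traffic.
    return {domain: domain in NETWORK_DOMAINS for domain in ALL_DOMAINS}
```

    For example, clock_enables(["cpu"]) keeps both network domains clocked and gates the cpu and memory domains.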

    Multidimensional switch network
    25.
    Type: Patent application
    Status: Lapsed

    Publication No.: US20050195808A1

    Publication date: 2005-09-08

    Application No.: US10793068

    Filing date: 2004-03-04

    IPC classification: H04L12/26

    CPC classification: H04L49/1576 H04L45/06

    Abstract: Multidimensional switch data networks are disclosed, such as those used by a distributed-memory parallel computer, applied, for example, to computations in the field of life sciences. A distributed-memory parallel computing system comprises a number of parallel compute nodes and a message-passing data network connecting the compute nodes together. The data network connecting the compute nodes comprises a multidimensional switch data network of compute nodes having N dimensions, and a number/array of compute nodes Ln in each of the N dimensions. Each compute node includes an N-port routing element having a port for each of the N dimensions. Each compute node of an array of Ln compute nodes in each of the N dimensions connects through a port of its routing element to an Ln-port crossbar switch having Ln ports. Several embodiments of a 4-dimensional computing system having 65,536 compute nodes are disclosed.
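
    The arithmetic behind the topology can be checked with a short sketch. It assumes one reading of the abstract: nodes that share every coordinate except the one in dimension n hang off the same Ln-port crossbar. The 16 x 16 x 16 x 16 shape is only one shape that yields 65,536 nodes; the abstract does not state the shape, and the function names are hypothetical.

```python
from math import prod


def node_count(dims):
    """Total compute nodes for per-dimension sizes (L1, ..., LN)."""
    return prod(dims)


def crossbars_per_dimension(dims):
    """Crossbars needed in each dimension: one per line of Ln nodes."""
    total = prod(dims)
    return [total // size for size in dims]


def crossbar_id(coord, dim):
    """Crossbar serving `coord` in dimension `dim`.

    Nodes that differ only in coordinate `dim` get the same identifier.
    """
    return (dim, coord[:dim] + coord[dim + 1:])


def crossbar_hops(src, dst):
    """Crossbar traversals if each differing dimension is fixed in turn."""
    return sum(1 for a, b in zip(src, dst) if a != b)
```

    Under these assumptions, node_count((16, 16, 16, 16)) is 65,536, each dimension needs 4,096 sixteen-port crossbars, and any two nodes are at most four crossbar traversals apart.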

    STORE-OPERATE-COHERENCE-ON-VALUE
    26.
    Type: Patent application
    Status: In force

    Publication No.: US20110179229A1

    Publication date: 2011-07-21

    Application No.: US12986652

    Filing date: 2011-01-07

    IPC classification: G06F12/08

    Abstract: A system, method and computer program product for performing various store-operate instructions in a parallel computing environment that includes a plurality of processors and at least one cache memory device. A queue in the system receives, from a processor, a store-operate instruction that specifies under which condition a cache coherence operation is to be invoked. A hardware unit in the system runs the received store-operate instruction. The hardware unit evaluates whether a result of running the received store-operate instruction satisfies the condition. The hardware unit invokes a cache coherence operation on a cache memory address associated with the received store-operate instruction if the result satisfies the condition. Otherwise, the hardware unit does not invoke the cache coherence operation on the cache memory device.
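
    The flow in this abstract (queue, run, evaluate, conditionally invoke coherence) can be modelled in a few lines. This is a software sketch only: StoreOperateUnit and its methods are hypothetical names, memory is a plain dictionary, and the coherence operation is represented by appending the address to a log.

```python
import operator
from collections import deque


class StoreOperateUnit:
    """Toy model of a unit that runs queued store-operate instructions."""

    def __init__(self):
        self.queue = deque()       # instructions received from processors
        self.memory = {}           # address -> value
        self.coherence_log = []    # addresses where coherence was invoked

    def submit(self, address, op, operand, condition):
        """Queue one instruction; `condition` is a predicate on the result."""
        self.queue.append((address, op, operand, condition))

    def run_next(self):
        """Run the oldest queued instruction and apply the condition."""
        address, op, operand, condition = self.queue.popleft()
        result = op(self.memory.get(address, 0), operand)
        self.memory[address] = result
        invoked = bool(condition(result))
        if invoked:
            # Stand-in for the cache coherence operation on this address.
            self.coherence_log.append(address)
        return result, invoked
```

    A typical use would be a shared counter decremented with store-add: if the condition is "result equals zero", coherence traffic is generated only for the final decrement rather than for every update.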

    PROCESSOR RESUME UNIT
    27.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20110173420A1

    Publication date: 2011-07-14

    Application No.: US12684852

    Filing date: 2010-01-08

    IPC classification: G06F9/30

    Abstract: A system for enhancing the performance of a computer includes a computer system having a data storage device. The computer system includes a program stored in the data storage device, and steps of the program are executed by a processor. An external unit is external to the processor for monitoring specified computer resources. The external unit is configured to detect a specified condition using the processor. The processor includes one or more threads. The thread resumes an active state from a pause state using the external unit when the specified condition is detected by the external unit.

    ATOMICITY: A MULTI-PRONGED APPROACH
    28.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20110219215A1

    Publication date: 2011-09-08

    Application No.: US13008546

    Filing date: 2011-01-18

    IPC classification: G06F9/30

    CPC classification: G06F9/524 G06F12/08

    Abstract: In a multiprocessor system with speculative execution, atomicity can be approached in several ways. One approach is to have atomic instructions that achieve multiple functions and are guaranteed to complete. Another approach is to have blocks of code that are grouped to succeed or fail together. A system can incorporate more than one such approach. In implementing more than one approach, the system may prioritize one over another. When conflict detection is done through a directory lookup in cache memory, atomic instructions and atomicity-related operations may be implemented in a cache data array access pipeline in that cache memory. This implementation may include feedback to the pipeline for implementing multiple functions within an atomic instruction and also for cascading atomic instructions.

    Re-utilizing partially failed resources as network resources
    29.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US07620841B2

    Publication date: 2009-11-17

    Application No.: US11335784

    Filing date: 2006-01-19

    IPC classification: G06F11/00

    CPC classification: G06F11/0793 G06F11/0724

    Abstract: A method and apparatus for re-utilizing partially failed compute resources in a massively parallel supercomputer system. In the preferred embodiments, the compute node comprises a number of clock domains that can be enabled separately. When an error in a compute node is detected, and the failure is not in the network communication blocks, a clock enable circuit enables only the clocks to the network communication blocks, allowing the partially failed compute node to be re-utilized as a network resource. The computer system can then continue to operate with only slightly diminished performance and thereby improve performance and perceived overall reliability.

    Methods and apparatus using commutative error detection values for fault isolation in multiple node computers
    30.
    Type: Patent application
    Status: Lapsed

    Publication No.: US20060248370A1

    Publication date: 2006-11-02

    Application No.: US11106069

    Filing date: 2005-04-14

    IPC classification: G06F11/00

    CPC classification: G06F11/1633

    Abstract: The present invention concerns methods and apparatus for performing fault isolation in multiple node computing systems using commutative error detection values (for example, checksums) to identify and to isolate faulty nodes. In the present invention, nodes forming the multiple node computing system are networked together and, during program execution, communicate with one another by transmitting information through the network. When information associated with a reproducible portion of a computer program is injected into the network by a node, a commutative error detection value is calculated and stored in commutative error detection apparatus associated with the node. At intervals, node fault detection apparatus associated with the multiple node computer system retrieves the commutative error detection values saved in the commutative error detection apparatus associated with the node and stores them in memory. When the computer program is executed again by the multiple node computer system, new commutative error detection values are created; the node fault detection apparatus retrieves them and stores them in memory. The node fault detection apparatus identifies faulty nodes by comparing commutative error detection values associated with reproducible portions of the application program generated by a particular node from different runs of the application program. Differences in commutative error detection values indicate that the node may be faulty.
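
    The comparison across runs can be illustrated with a toy checksum. The sketch assumes a byte sum modulo 2^32 as the commutative error detection value, which is a deliberately simple stand-in rather than the scheme used in the patent; the function names and the per-node dictionaries are hypothetical.

```python
def commutative_checksum(packets):
    """Order-independent checksum: byte sum of all packets modulo 2**32.

    Addition is commutative, so the order in which packets were injected
    into the network does not change the value.
    """
    total = 0
    for packet in packets:
        total = (total + sum(packet)) & 0xFFFFFFFF
    return total


def suspect_nodes(first_run, second_run):
    """Return node ids whose checksums differ between two runs.

    Each argument maps a node id to the list of packets (bytes objects)
    that the node injected during a reproducible portion of the program.
    """
    return sorted(
        node
        for node in first_run
        if commutative_checksum(first_run[node])
        != commutative_checksum(second_run.get(node, []))
    )
```

    A node that sends the same packets in a different order in the second run is not flagged, while a node whose payload changed is. A plain byte sum misses some corruptions (for example, two bytes swapped inside a packet), so a real system would likely use a stronger commutative function.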
