Message passing with a limited number of DMA byte counters
    92.
    发明授权
    Message passing with a limited number of DMA byte counters 失效
    消息传递有限数量的DMA字节计数器

    公开(公告)号:US08032892B2

    公开(公告)日:2011-10-04

    申请号:US11768813

    申请日:2007-06-26

    CPC分类号: G06F15/17356 G06F9/546

    摘要: A method for passing messages in a parallel computer system constructed as a plurality of compute nodes interconnected as a network where each compute node includes a DMA engine but includes only a limited number of byte counters for tracking a number of bytes that are sent or received by the DMA engine, where the byte counters may be used in shared counter or exclusive counter modes of operation. The method includes using rendezvous protocol, a source compute node deterministically sending a request to send (RTS) message with a single RTS descriptor using an exclusive injection counter to track both the RTS message and message data to be sent in association with the RTS message, to a destination compute node such that the RTS descriptor indicates to the destination compute node that the message data will be adaptively routed to the destination node. Using one DMA FIFO at the source compute node, the RTS descriptors are maintained for rendezvous messages destined for the destination compute node to ensure proper message data ordering thereat. Using a reception counter at a DMA engine, the destination compute node tracks reception of the RTS and associated message data and sends a clear to send (CTS) message to the source node in a rendezvous protocol form of a remote get to accept the RTS message and message data and processing the remote get (CTS) by the source compute node DMA engine to provide the message data to be sent.

    摘要翻译: 一种在并行计算机系统中传送消息的方法,该并行计算机系统被构造为作为网络互连的多个计算节点,其中每个计算节点包括DMA引擎,但是仅包括有限数量的字节计数器,用于跟踪由 DMA引擎,其中可以在共享计数器或专用计数器操作模式中使用字节计数器。 该方法包括使用会合协议,源计算节点使用专用注入计数器确定性地发送具有单个RTS描述符的请求(RTS)消息以跟踪要与RTS消息相关联地发送的RTS消息和消息数据, 到目的地计算节点,使得RTS描述符向目标计算节点指示消息数据将自适应地路由到目的地节点。 在源计算节点使用一个DMA FIFO,将为发往目的地计算节点的会合消息保留RTS描述符,以确保正确的消息数据顺序。 在DMA引擎上使用接收计数器,目的地计算节点跟踪RTS和相关联的消息数据的接收,并以远程获取的会合协议形式向源节点发送明确发送(CTS)消息以接受RTS消息 和消息数据,并由源计算节点DMA引擎处理远程获取(CTS)以提供要发送的消息数据。

    OPTIMIZED COLLECTIVES USING A DMA ON A PARALLEL COMPUTER
    93.
    发明申请
    OPTIMIZED COLLECTIVES USING A DMA ON A PARALLEL COMPUTER 有权
    使用并行计算机上的DMA的优化收集器

    公开(公告)号:US20090006662A1

    公开(公告)日:2009-01-01

    申请号:US11768645

    申请日:2007-06-26

    IPC分类号: G06F13/28

    CPC分类号: G06F13/28

    摘要: Optimizing collective operations using direct memory access controller on a parallel computer, in one aspect, may comprise establishing a byte counter associated with a direct memory access controller for each submessage in a message. The byte counter includes at least a base address of memory and a byte count associated with a submessage. A byte counter associated with a submessage is monitored to determine whether at least a block of data of the submessage has been received. The block of data has a predetermined size, for example, a number of bytes. The block is processed when the block has been fully received, for example, when the byte count indicates all bytes of the block have been received. The monitoring and processing may continue for all blocks in all submessages in the message.

    摘要翻译: 在一个方面,在并行计算机上优化使用直接存储器访问控制器的集合操作可以包括建立与消息中的每个子消息的直接存储器访问控制器相关联的字节计数器。 字节计数器至少包括存储器的基地址和与子消息相关联的字节计数。 监视与子消息相关联的字节计数器,以确定是否已经接收到子消息的至少一个数据块。 数据块具有预定的大小,例如,多个字节。 当块完全接收时,块被处理,例如,当字节计数指示已经接收到块的所有字节时。 消息中的所有子消息中的所有块的监视和处理可以继续。

    Direct Memory Access Transfer Completion Notification
    94.
    发明申请
    Direct Memory Access Transfer Completion Notification 失效
    直接内存访问传输完成通知

    公开(公告)号:US20080307121A1

    公开(公告)日:2008-12-11

    申请号:US11758167

    申请日:2007-06-05

    IPC分类号: G06F13/28

    CPC分类号: G06F13/28

    摘要: Methods, compute nodes, and computer program products are provided for direct memory access (‘DMA’) transfer completion notification. Embodiments include determining, by an origin DMA engine on an origin compute node, whether a data descriptor for an application message to be sent to a target compute node is currently in an injection first-in-first-out (‘FIFO’) buffer in dependence upon a sequence number previously associated with the data descriptor, the total number of descriptors currently in the injection FIFO buffer, and the current sequence number for the newest data descriptor stored in the injection FIFO buffer; and notifying a processor core on the origin DMA engine that the message has been sent if the data descriptor for the message is not currently in the injection FIFO buffer.

    摘要翻译: 提供方法,计算节点和计算机程序产品用于直接内存访问(“DMA”)传输完成通知。 实施例包括通过原始计算节点上的原始DMA引擎确定要发送到目标计算节点的应用消息的数据描述符当前是否处于先进先出先入先出(FIFO)缓冲器 依赖于先前与数据描述符相关联的序列号,当前在注入FIFO缓冲器中的描述符的总数以及存储在注入FIFO缓冲器中的最新数据描述符的当前序列号; 并且如果消息的数据描述符当前不在注入FIFO缓冲器中,则通知源DMA引擎上的处理器核心消息已被发送。

    Direct memory access transfer completion notification
    95.
    发明授权
    Direct memory access transfer completion notification 失效
    直接内存访问传输完成通知

    公开(公告)号:US07765337B2

    公开(公告)日:2010-07-27

    申请号:US11758167

    申请日:2007-06-05

    IPC分类号: G06F13/28 G06F3/00 G06F13/00

    CPC分类号: G06F13/28

    摘要: Methods, compute nodes, and computer program products are provided for direct memory access (‘DMA’) transfer completion notification. Embodiments include determining, by an origin DMA engine on an origin compute node, whether a data descriptor for an application message to be sent to a target compute node is currently in an injection first-in-first-out (‘FIFO’) buffer in dependence upon a sequence number previously associated with the data descriptor, the total number of descriptors currently in the injection FIFO buffer, and the current sequence number for the newest data descriptor stored in the injection FIFO buffer; and notifying a processor core on the origin DMA engine that the message has been sent if the data descriptor for the message is not currently in the injection FIFO buffer.

    摘要翻译: 提供方法,计算节点和计算机程序产品用于直接内存访问(“DMA”)传输完成通知。 实施例包括通过原始计算节点上的原始DMA引擎确定要发送到目标计算节点的应用消息的数据描述符当前是否处于先进先出先入先出(FIFO)缓冲器 依赖于先前与数据描述符相关联的序列号,当前在注入FIFO缓冲器中的描述符的总数以及存储在注入FIFO缓冲器中的最新数据描述符的当前序列号; 并且如果消息的数据描述符当前不在注入FIFO缓冲器中,则通知源DMA引擎上的处理器核心消息已被发送。

    Optimized collectives using a DMA on a parallel computer
    96.
    发明授权
    Optimized collectives using a DMA on a parallel computer 有权
    在并行计算机上使用DMA优化集合

    公开(公告)号:US07886084B2

    公开(公告)日:2011-02-08

    申请号:US11768645

    申请日:2007-06-26

    IPC分类号: G06F13/28 G06F13/00

    CPC分类号: G06F13/28

    摘要: Optimizing collective operations using direct memory access controller on a parallel computer, in one aspect, may comprise establishing a byte counter associated with a direct memory access controller for each submessage in a message. The byte counter includes at least a base address of memory and a byte count associated with a submessage. A byte counter associated with a submessage is monitored to determine whether at least a block of data of the submessage has been received. The block of data has a predetermined size, for example, a number of bytes. The block is processed when the block has been fully received, for example, when the byte count indicates all bytes of the block have been received. The monitoring and processing may continue for all blocks in all submessages in the message.

    摘要翻译: 在一个方面,在并行计算机上优化使用直接存储器访问控制器的集合操作可以包括建立与消息中的每个子消息的直接存储器访问控制器相关联的字节计数器。 字节计数器至少包括存储器的基地址和与子消息相关联的字节计数。 监视与子消息相关联的字节计数器,以确定是否已经接收到子消息的至少一个数据块。 数据块具有预定的大小,例如,多个字节。 当块完全接收时,块被处理,例如,当字节计数指示已经接收到块的所有字节时。 消息中的所有子消息中的所有块的监视和处理可以继续。

    MESSAGE PASSING WITH A LIMITED NUMBER OF DMA BYTE COUNTERS
    97.
    发明申请
    MESSAGE PASSING WITH A LIMITED NUMBER OF DMA BYTE COUNTERS 失效
    消息传递与有限数量的DMA字节计数器

    公开(公告)号:US20090007141A1

    公开(公告)日:2009-01-01

    申请号:US11768813

    申请日:2007-06-26

    IPC分类号: G06F9/44

    CPC分类号: G06F15/17356 G06F9/546

    摘要: A method for passing messages in a parallel computer system constructed as a plurality of compute nodes interconnected as a network where each compute node includes a DMA engine but includes only a limited number of byte counters for tracking a number of bytes that are sent or received by the DMA engine, where the byte counters may be used in shared counter or exclusive counter modes of operation. The method includes using rendezvous protocol, a source compute node deterministically sending a request to send (RTS) message with a single RTS descriptor using an exclusive injection counter to track both the RTS message and message data to be sent in association with the RTS message, to a destination compute node such that the RTS descriptor indicates to the destination compute node that the message data will be adaptively routed to the destination node. Using one DMA FIFO at the source compute node, the RTS descriptors are maintained for rendezvous messages destined for the destination compute node to ensure proper message data ordering thereat. Using a reception counter at a DMA engine, the destination compute node tracks reception of the RTS and associated message data and sends a clear to send (CTS) message to the source node in a rendezvous protocol form of a remote get to accept the RTS message and message data and processing the remote get (CTS) by the source compute node DMA engine to provide the message data to be sent.

    摘要翻译: 一种在并行计算机系统中传送消息的方法,该并行计算机系统被构造为作为网络互连的多个计算节点,其中每个计算节点包括DMA引擎,但是仅包括有限数量的字节计数器,用于跟踪由 DMA引擎,其中可以在共享计数器或专用计数器操作模式中使用字节计数器。 该方法包括使用会合协议,源计算节点使用专用注入计数器确定性地发送具有单个RTS描述符的请求(RTS)消息以跟踪要与RTS消息相关联地发送的RTS消息和消息数据, 到目的地计算节点,使得RTS描述符向目标计算节点指示消息数据将自适应地路由到目的地节点。 在源计算节点使用一个DMA FIFO,将为发往目的地计算节点的会合消息保留RTS描述符,以确保正确的消息数据顺序。 在DMA引擎上使用接收计数器,目的地计算节点跟踪RTS和相关联的消息数据的接收,并以远程获取的会合协议形式向源节点发送明确发送(CTS)消息以接受RTS消息 和消息数据,并由源计算节点DMA引擎处理远程获取(CTS)以提供要发送的消息数据。

    Multidimensional switch network
    98.
    发明申请
    Multidimensional switch network 失效
    多维交换机网络

    公开(公告)号:US20050195808A1

    公开(公告)日:2005-09-08

    申请号:US10793068

    申请日:2004-03-04

    IPC分类号: H04L12/26

    CPC分类号: H04L49/1576 H04L45/06

    摘要: Multidimensional switch data networks are disclosed, such as are used by a distributed-memory parallel computer, as applied for example to computations in the field of life sciences. A distributed memory parallel computing system comprises a number of parallel compute nodes and a message passing data network connecting the compute nodes together. The data network connecting the compute nodes comprises a multidimensional switch data network of compute nodes having N dimensions, and a number/array of compute nodes Ln in each of the N dimensions. Each compute node includes an N port routing element having a port for each of the N dimensions. Each compute node of an array of Ln compute nodes in each of the N dimensions connects through a port of its routing element to an Ln port crossbar switch having Ln ports. Several embodiments are disclosed of a 4 dimensional computing system having 65,536 compute nodes.

    摘要翻译: 公开了多维交换机数据网络,例如由分布式存储器并行计算机使用的,例如应用于生命科学领域的计算。 分布式存储器并行计算系统包括多个并行计算节点和将计算节点连接在一起的消息传递数据网络。 连接计算节点的数据网络包括具有N维的计算节点的多维交换机数据网络和N个维度中的每一个中的计算节点Ln的数量/数组。 每个计算节点包括具有用于N个维度中的每一个的端口的N端口路由元件。 每个N维中的Ln计算节点阵列的每个计算节点通过其路由元素的端口连接到具有Ln端口的Ln端口交叉开关。 公开了具有65,536个计算节点的四维计算系统的几个实施例。

    Methods and apparatus using commutative error detection values for fault isolation in multiple node computers
    99.
    发明申请
    Methods and apparatus using commutative error detection values for fault isolation in multiple node computers 失效
    使用多节点计算机故障隔离交换误差检测值的方法和装置

    公开(公告)号:US20060248370A1

    公开(公告)日:2006-11-02

    申请号:US11106069

    申请日:2005-04-14

    IPC分类号: G06F11/00

    CPC分类号: G06F11/1633

    摘要: The present invention concerns methods and apparatus for performing fault isolation in multiple node computing systems using commutative error detection values—for example, checksums—to identify and to isolate faulty nodes. In the present invention nodes forming the multiple node computing system are networked together and during program execution communicate with one another by transmitting information through the network. When information associated with a reproducible portion of a computer program is injected into the network by a node, a commutative error detection value is calculated and stored in commutative error detection apparatus associated with the node. At intervals, node fault detection apparatus associated with the multiple node computer system retrieve commutative error detection values saved in the commutative error detection apparatus associated with the node and stores them in memory. When the computer program is executed again by the multiple node computer system, new commutative error detection values are created; the node fault detection apparatus retrieves them and stores them in memory. The node fault detection apparatus identifies faulty nodes by comparing commutative error detection values associated with reproducible portions of the application program generated by a particular node from different runs of the application program. Differences in commutative error detection values indicate that the node may be faulty.

    摘要翻译: 本发明涉及在多节点计算系统中使用交换性错误检测值(例如校验和)识别和隔离故障节点来执行故障隔离的方法和装置。 在本发明中,形成多节点计算系统的节点被联网在一起,并且在程序执行期间通过网络传送信息彼此通信。 当与计算机程序的可再现部分相关联的信息被节点注入到网络中时,计算交换性错误检测值并将其存储在与节点相关联的交换错误检测装置中。 间歇地,与多节点计算机系统相关联的节点故障检测装置检索保存在与节点相关联的交换性错误检测装置中的交换性错误检测值,并将其存储在存储器中。 当多节点计算机系统再次执行计算机程序时,创建新的交换错误检测值; 节点故障检测装置检索它们并将其存储在存储器中。 节点故障检测装置通过比较与来自应用程序的不同运行的特定节点生成的应用程序的可再现部分相关联的交换错误检测值来识别故障节点。 交换性错误检测值的差异表明节点可能有故障。

    Multiprocessor system with multiple concurrent modes of execution
    100.
    发明授权
    Multiprocessor system with multiple concurrent modes of execution 有权
    具有多个并发执行模式的多处理器系统

    公开(公告)号:US08621478B2

    公开(公告)日:2013-12-31

    申请号:US13008502

    申请日:2011-01-18

    IPC分类号: G06F9/46

    CPC分类号: G06F9/524 G06F12/08

    摘要: A multiprocessor system supports multiple concurrent modes of speculative execution. Speculation identification numbers (IDs) are allocated to speculative threads from a pool of available numbers. The pool is divided into domains, with each domain being assigned to a mode of speculation. Modes of speculation include TM, TLS, and rollback. Allocation of the IDs is carried out with respect to a central state table and using hardware pointers. The IDs are used for writing different versions of speculative results in different ways of a set in a cache memory.

    摘要翻译: 多处理器系统支持多种并发模式的推测执行。 投机标识号(ID)从可用数字池中分配给投机线程。 池被分为域,每个域被分配到一种投机模式。 投机模式包括TM,TLS和回滚。 对于中央状态表并使用硬件指针执行ID的分配。 ID用于以高速缓冲存储器中的集合的不同方式写入不同版本的推测结果。