Message passing with a limited number of DMA byte counters
    23.
    Granted patent
    Message passing with a limited number of DMA byte counters (lapsed)

    Publication No.: US08032892B2

    Publication date: 2011-10-04

    Application No.: US11768813

    Filing date: 2007-06-26

    CPC classification: G06F15/17356 G06F9/546

    Abstract: A method for passing messages in a parallel computer system constructed as a plurality of compute nodes interconnected as a network, where each compute node includes a DMA engine but only a limited number of byte counters for tracking the number of bytes sent or received by the DMA engine. The byte counters may be used in shared-counter or exclusive-counter modes of operation. Under a rendezvous protocol, a source compute node deterministically sends a request-to-send (RTS) message to a destination compute node with a single RTS descriptor, using an exclusive injection counter to track both the RTS message and the message data to be sent in association with it. The RTS descriptor indicates to the destination compute node that the message data will be adaptively routed to it. RTS descriptors for rendezvous messages destined for the same destination compute node are kept in one DMA FIFO at the source compute node to ensure proper message data ordering at the destination. Using a reception counter at its DMA engine, the destination compute node tracks reception of the RTS and associated message data, and accepts them by sending a clear-to-send (CTS) message to the source node in the form of a remote get. The source compute node's DMA engine processes the remote get (CTS) to provide the message data to be sent.
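
The counter bookkeeping in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: `ByteCounter`, `rendezvous`, `rts_size` and `packet_size` are hypothetical names, the reception counter is simplified to count payload bytes only, and out-of-order delivery under adaptive routing is modelled by reversing the packet list.

```python
from dataclasses import dataclass


@dataclass
class ByteCounter:
    """Toy byte counter: loaded with an expected count, decremented as bytes move."""
    remaining: int = 0

    def load(self, nbytes: int) -> None:
        self.remaining += nbytes

    def consume(self, nbytes: int) -> None:
        if nbytes > self.remaining:
            raise ValueError("more bytes moved than the counter was loaded with")
        self.remaining -= nbytes

    @property
    def done(self) -> bool:
        return self.remaining == 0


def rendezvous(payload: bytes, rts_size: int = 32, packet_size: int = 256) -> bytes:
    # Source: one exclusive injection counter covers the RTS and the payload.
    injection = ByteCounter()
    injection.load(rts_size + len(payload))

    # Source sends the RTS; it announces the payload length and adaptive routing.
    injection.consume(rts_size)
    rts = {"length": len(payload), "adaptive": True}

    # Destination: load a reception counter, then answer with a CTS (remote get).
    reception = ByteCounter()
    reception.load(rts["length"])
    cts = {"remote_get": True, "length": rts["length"]}

    # Source DMA engine services the remote get by injecting the payload packets.
    packets = [(off, payload[off:off + packet_size])
               for off in range(0, cts["length"], packet_size)]

    # Adaptive routing may deliver packets out of order; reversing models that.
    received = bytearray(cts["length"])
    for off, chunk in reversed(packets):
        injection.consume(len(chunk))
        received[off:off + len(chunk)] = chunk
        reception.consume(len(chunk))

    # Both sides detect completion purely from their counters reaching zero.
    assert injection.done and reception.done
    return bytes(received)
```

Because each packet carries its own offset, the destination does not depend on arrival order; it only needs the reception counter to reach zero to know the message is complete.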

    MESSAGE PASSING WITH A LIMITED NUMBER OF DMA BYTE COUNTERS
    25.
    Patent application
    MESSAGE PASSING WITH A LIMITED NUMBER OF DMA BYTE COUNTERS (lapsed)

    Publication No.: US20090007141A1

    Publication date: 2009-01-01

    Application No.: US11768813

    Filing date: 2007-06-26

    IPC classification: G06F9/44

    CPC classification: G06F15/17356 G06F9/546

    Abstract: A method for passing messages in a parallel computer system constructed as a plurality of compute nodes interconnected as a network, where each compute node includes a DMA engine but only a limited number of byte counters for tracking the number of bytes sent or received by the DMA engine. The byte counters may be used in shared-counter or exclusive-counter modes of operation. Under a rendezvous protocol, a source compute node deterministically sends a request-to-send (RTS) message to a destination compute node with a single RTS descriptor, using an exclusive injection counter to track both the RTS message and the message data to be sent in association with it. The RTS descriptor indicates to the destination compute node that the message data will be adaptively routed to it. RTS descriptors for rendezvous messages destined for the same destination compute node are kept in one DMA FIFO at the source compute node to ensure proper message data ordering at the destination. Using a reception counter at its DMA engine, the destination compute node tracks reception of the RTS and associated message data, and accepts them by sending a clear-to-send (CTS) message to the source node in the form of a remote get. The source compute node's DMA engine processes the remote get (CTS) to provide the message data to be sent.

    METHOD AND APPARATUS FOR EFFICIENTLY TRACKING QUEUE ENTRIES RELATIVE TO A TIMESTAMP
    26.
    Patent application
    METHOD AND APPARATUS FOR EFFICIENTLY TRACKING QUEUE ENTRIES RELATIVE TO A TIMESTAMP (lapsed)

    Publication No.: US20090006672A1

    Publication date: 2009-01-01

    Application No.: US11768800

    Filing date: 2007-06-26

    IPC classification: G06F3/00 G06F1/04

    CPC classification: G06F12/0835 G06F12/0831

    Abstract: An apparatus and method for tracking coherence event signals transmitted in a multiprocessor system. The apparatus comprises a coherence logic unit having a plurality of queue structures, each queue structure associated with a respective sender of event signals transmitted in the system. A timing circuit associated with a queue structure controls enqueuing and dequeuing of received coherence event signals, and a counter tracks the number of coherence event signals remaining enqueued in the queue structure and the number dequeued since receipt of a timestamp signal. A counter mechanism generates an output signal indicating that all of the coherence event signals present in the queue structure at the time the timestamp signal was received have been dequeued. In one embodiment, the timestamp signal is asserted at the start of a memory synchronization operation, and the output signal indicates that all coherence events present when the timestamp signal was asserted have completed. This signal can then be used as part of the completion condition for the memory synchronization operation.
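
One way to picture the counter mechanism is the software analogue below. It is a sketch under assumptions rather than the circuit the patent claims: `TimestampedQueue` and its methods are hypothetical names, and the queue is assumed to be strictly first-in, first-out, so entries present at the timestamp always leave before later arrivals.

```python
from collections import deque


class TimestampedQueue:
    """FIFO that can report when everything present at a timestamp has left."""

    def __init__(self) -> None:
        self._queue = deque()
        # Entries that were present at the last timestamp and are still queued.
        self._pending = 0

    def enqueue(self, event) -> None:
        self._queue.append(event)

    def dequeue(self):
        event = self._queue.popleft()
        # Only entries that predate the timestamp count toward completion.
        if self._pending > 0:
            self._pending -= 1
        return event

    def timestamp(self) -> None:
        # Snapshot the current occupancy; later arrivals are not counted.
        self._pending = len(self._queue)

    @property
    def drained(self) -> bool:
        # Output signal: all entries present at the timestamp have been dequeued.
        return self._pending == 0
```

A single counter is enough because FIFO order guarantees that the next `_pending` dequeues are exactly the entries that were present at the timestamp; no per-entry tag is needed. Before any timestamp is taken, `drained` is trivially true.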

    DMA ENGINE FOR REPEATING COMMUNICATION PATTERNS
    27.
    Patent application
    DMA ENGINE FOR REPEATING COMMUNICATION PATTERNS (lapsed)

    Publication No.: US20090006296A1

    Publication date: 2009-01-01

    Application No.: US11768795

    Filing date: 2007-06-26

    IPC classification: G06F15/18

    CPC classification: G06F15/163

    Abstract: A parallel computer system is constructed as a network of interconnected compute nodes to operate a global message-passing application for performing communications across the network. Each compute node includes one or more individual processors with memories, which run a local instance of the global message-passing application to carry out local processing operations independent of processing operations carried out at other compute nodes. Each compute node also includes a DMA engine constructed to interact with the application via Injection FIFO Metadata describing multiple Injection FIFOs, where each Injection FIFO may contain an arbitrary number of message descriptors, in order to process messages with a fixed processing overhead irrespective of the number of message descriptors included in the Injection FIFO.
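
The fixed-overhead idea can be sketched as follows. The abstract does not say what the Injection FIFO Metadata contains, so the `head` and `tail` fields, and the names `InjectionFifoMetadata`, `DmaEngine` and `replay`, are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InjectionFifoMetadata:
    """Hypothetical metadata describing one pre-built injection FIFO."""
    descriptors: List[str]  # stand-in for the FIFO's memory region
    head: int = 0           # next descriptor the engine will process
    tail: int = 0           # one past the last descriptor to process


@dataclass
class DmaEngine:
    sent: List[str] = field(default_factory=list)

    def run(self, meta: InjectionFifoMetadata) -> None:
        # The engine works purely from the metadata: process head up to tail.
        while meta.head < meta.tail:
            self.sent.append(meta.descriptors[meta.head])
            meta.head += 1


def replay(meta: InjectionFifoMetadata) -> None:
    """Re-arm a FIFO by rewriting two metadata fields, however long it is."""
    meta.head = 0
    meta.tail = len(meta.descriptors)
```

In this model the application's cost to repeat a communication pattern is the two assignments in `replay`, independent of the number of descriptors, because the descriptors themselves stay in place and are not re-injected.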

    Method and apparatus for efficiently tracking queue entries relative to a timestamp
    28.
    Granted patent
    Method and apparatus for efficiently tracking queue entries relative to a timestamp (lapsed)

    Publication No.: US08756350B2

    Publication date: 2014-06-17

    Application No.: US11768800

    Filing date: 2007-06-26

    IPC classification: G06F3/00 G06F5/00

    CPC classification: G06F12/0835 G06F12/0831

    Abstract: An apparatus and method for tracking coherence event signals transmitted in a multiprocessor system. The apparatus comprises a coherence logic unit having a plurality of queue structures, each queue structure associated with a respective sender of event signals transmitted in the system. A timing circuit associated with a queue structure controls enqueuing and dequeuing of received coherence event signals, and a counter tracks the number of coherence event signals remaining enqueued in the queue structure and the number dequeued since receipt of a timestamp signal. A counter mechanism generates an output signal indicating that all of the coherence event signals present in the queue structure at the time the timestamp signal was received have been dequeued. In one embodiment, the timestamp signal is asserted at the start of a memory synchronization operation, and the output signal indicates that all coherence events present when the timestamp signal was asserted have completed. This signal can then be used as part of the completion condition for the memory synchronization operation.

    MULTIPLE NODE REMOTE MESSAGING
    29.
    Patent application
    MULTIPLE NODE REMOTE MESSAGING (in force)

    Publication No.: US20090006546A1

    Publication date: 2009-01-01

    Application No.: US11768784

    Filing date: 2007-06-26

    IPC classification: G06F15/16

    CPC classification: G06F15/16

    Abstract: A method for passing remote messages in a parallel computer system formed as a network of interconnected compute nodes, in which a first compute node (A) sends a single remote message to a remote second compute node (B) in order to control the remote second compute node (B) to send at least one remote message. The method includes controlling a DMA engine at the first compute node (A) to prepare the single remote message to include a first message descriptor and at least one remote message descriptor for controlling the remote second compute node (B) to send at least one remote message. This includes putting the first message descriptor into an injection FIFO at the first compute node (A) and sending the single remote message and the at least one remote message descriptor to the second compute node (B).
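
The flow from node A through node B can be modelled in a few lines. This is a toy model under assumptions, not the claimed method: `Descriptor`, `Node`, `inject`, `run_dma` and `receive` are hypothetical names, the network is a plain dictionary, and node C is added only to give B's message somewhere to go.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Descriptor:
    dest: str
    payload: str
    # Descriptors the receiving node should inject into its own FIFO.
    remote: Tuple["Descriptor", ...] = ()


class Node:
    def __init__(self, name: str, network: Dict[str, "Node"]) -> None:
        self.name = name
        self.network = network
        self.injection_fifo: deque = deque()
        self.inbox: List[Tuple[str, str]] = []
        network[name] = self

    def inject(self, descriptor: Descriptor) -> None:
        self.injection_fifo.append(descriptor)

    def run_dma(self) -> None:
        # Drain the injection FIFO, delivering each message to its destination.
        while self.injection_fifo:
            descriptor = self.injection_fifo.popleft()
            self.network[descriptor.dest].receive(self.name, descriptor)

    def receive(self, source: str, descriptor: Descriptor) -> None:
        self.inbox.append((source, descriptor.payload))
        # Remote descriptors carried by the message are queued for sending here.
        for remote_descriptor in descriptor.remote:
            self.inject(remote_descriptor)
```

For example, A can inject `Descriptor("B", "control", remote=(Descriptor("C", "data from B"),))`; after A's and then B's DMA engines run, C's inbox holds a message whose sender is B, even though only A's software prepared it.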

    DMA engine for repeating communication patterns
    30.
    Granted patent
    DMA engine for repeating communication patterns (lapsed)

    Publication No.: US07802025B2

    Publication date: 2010-09-21

    Application No.: US11768795

    Filing date: 2007-06-26

    IPC classification: G06F13/28

    CPC classification: G06F15/163

    Abstract: A parallel computer system is constructed as a network of interconnected compute nodes to operate a global message-passing application for performing communications across the network. Each compute node includes one or more individual processors with memories, which run a local instance of the global message-passing application to carry out local processing operations independent of processing operations carried out at other compute nodes. Each compute node also includes a DMA engine constructed to interact with the application via Injection FIFO Metadata describing multiple Injection FIFOs, where each Injection FIFO may contain an arbitrary number of message descriptors, in order to process messages with a fixed processing overhead irrespective of the number of message descriptors included in the Injection FIFO.
