Hardware packet pacing using a DMA in a parallel computer
    1.
    Granted patent
    Legal status: in force

    Publication number: US08509255B2

    Publication date: 2013-08-13

    Application number: US11768682

    Filing date: 2007-06-26

    IPC classification: H04L12/28

    CPC classification: G06F13/128

    Abstract: A method and system for hardware packet pacing using a direct memory access (DMA) controller in a parallel computer. In one aspect, a hardware token counter keeps track of the total number of bytes put on the network as a result of a remote get operation.

    BAD DATA PACKET CAPTURE DEVICE
    2.
    Patent application
    Legal status: lapsed

    Publication number: US20090003228A1

    Publication date: 2009-01-01

    Application number: US11768572

    Filing date: 2007-06-26

    IPC classification: H04L12/56

    CPC classification: H04L43/0847

    Abstract: An apparatus and method for capturing data packets for analysis on a network computing system includes a sending node and a receiving node connected by a bi-directional communication link. The sending node sends a data transmission to the receiving node over the link. The receiving node receives and verifies the data transmission to distinguish valid data from invalid data, and verifies that retransmissions of invalid data arrive as the corresponding valid data. A memory device communicates with the receiving node and stores the invalid data and the corresponding valid data. A computing node communicates with the memory device, receives the invalid data and the corresponding valid data from it, and analyzes them.

    HARDWARE PACKET PACING USING A DMA IN A PARALLEL COMPUTER
    3.
    Patent application
    Legal status: in force

    Publication number: US20090003203A1

    Publication date: 2009-01-01

    Application number: US11768682

    Filing date: 2007-06-26

    IPC classification: H04L1/00

    CPC classification: G06F13/128

    Abstract: A method and system for hardware packet pacing using a direct memory access (DMA) controller in a parallel computer. In one aspect, a hardware token counter keeps track of the total number of bytes put on the network as a result of a remote get operation. A remote get message is sent as a plurality of sub remote get packets. Each sub remote get packet is sent only if the total number of bytes put on the network does not exceed a predetermined number.
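
    The pacing rule in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the names TokenPacer, try_send, on_data_arrived and split_remote_get are hypothetical, and the idea that tokens are returned when the requested data arrives is an assumption the abstract does not state.

```python
class TokenPacer:
    """Hypothetical software model of a byte-token counter for pacing."""

    def __init__(self, max_bytes_in_flight):
        # The "predetermined number" of bytes allowed on the network.
        self.max_bytes_in_flight = max_bytes_in_flight
        # The token counter: bytes currently attributed to remote gets.
        self.bytes_in_flight = 0

    def try_send(self, nbytes):
        """Admit a sub remote get only if the byte budget is not exceeded."""
        if self.bytes_in_flight + nbytes > self.max_bytes_in_flight:
            return False  # caller must retry later
        self.bytes_in_flight += nbytes
        return True

    def on_data_arrived(self, nbytes):
        """Assumption: tokens are returned once the requested data arrives."""
        self.bytes_in_flight -= nbytes


def split_remote_get(total_bytes, chunk_bytes):
    """Split one remote get into the sizes of its sub remote get packets."""
    sizes = []
    remaining = total_bytes
    while remaining > 0:
        size = min(chunk_bytes, remaining)
        sizes.append(size)
        remaining -= size
    return sizes
```

    With a budget of 1024 bytes and 512-byte sub requests, two sub remote gets are admitted and the third is held back until some of the outstanding data has arrived.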

    Extended write combining using a write continuation hint flag
    4.
    Granted patent
    Legal status: lapsed

    Publication number: US08458282B2

    Publication date: 2013-06-04

    Application number: US11768593

    Filing date: 2007-06-26

    Abstract: A computing apparatus for reducing the amount of processing in a network computing system includes a network system device of a receiving node for receiving electronic messages comprising data. The electronic messages are transmitted from a sending node. The network system device determines when more data of a specific electronic message is being transmitted. A memory device stores the electronic message data and communicates with the network system device. A memory subsystem communicates with the memory device. The memory subsystem stores a portion of the electronic message when more data of the specific message will be received, and the buffer combines the portion with later received data and moves the data to the memory device for accessible storage.
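
    The buffering behaviour described here can be pictured with a short model. This is a minimal sketch, assuming each arriving piece of a message carries a boolean "more data coming" hint; the class WriteCombiner, its receive method and the list standing in for the memory device are all hypothetical and are not taken from the patent.

```python
class WriteCombiner:
    """Hypothetical model: hold partial message data while more is expected."""

    def __init__(self):
        self.pending = bytearray()  # buffer for the partially received message
        self.memory = []            # stands in for the memory device

    def receive(self, payload, more_coming):
        """Combine payload with buffered data; commit when the hint is clear."""
        self.pending += payload
        if not more_coming:
            # One combined write instead of one write per received piece.
            self.memory.append(bytes(self.pending))
            self.pending.clear()
```

    Calling receive(b"ab", True) followed by receive(b"cd", False) results in a single combined write of b"abcd" to the modelled memory device.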

    Direct Memory Access Transfer Completion Notification
    5.
    Patent application
    Legal status: lapsed

    Publication number: US20080307121A1

    Publication date: 2008-12-11

    Application number: US11758167

    Filing date: 2007-06-05

    IPC classification: G06F13/28

    CPC classification: G06F13/28

    Abstract: Methods, compute nodes, and computer program products are provided for direct memory access (‘DMA’) transfer completion notification. Embodiments include determining, by an origin DMA engine on an origin compute node, whether a data descriptor for an application message to be sent to a target compute node is currently in an injection first-in-first-out (‘FIFO’) buffer in dependence upon a sequence number previously associated with the data descriptor, the total number of descriptors currently in the injection FIFO buffer, and the current sequence number for the newest data descriptor stored in the injection FIFO buffer; and notifying a processor core on the origin DMA engine that the message has been sent if the data descriptor for the message is not currently in the injection FIFO buffer.
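
    The three quantities named in this abstract are enough to decide whether a descriptor is still queued. The following is a sketch under assumptions the abstract does not spell out: sequence numbers increase by one per injected descriptor, never wrap, and the queried sequence number has already been issued. The function names are hypothetical.

```python
def descriptor_in_fifo(seq, descriptors_in_fifo, newest_seq):
    """Return True if the descriptor with sequence number seq is still queued.

    Assumes sequence numbers increase by one per descriptor and do not wrap,
    so the FIFO holds exactly the most recent descriptors_in_fifo numbers.
    """
    oldest_pending = newest_seq - descriptors_in_fifo + 1
    return oldest_pending <= seq <= newest_seq


def message_sent(seq, descriptors_in_fifo, newest_seq):
    """The message counts as sent once its descriptor has left the FIFO."""
    return not descriptor_in_fifo(seq, descriptors_in_fifo, newest_seq)
```

    For example, if the newest descriptor has sequence number 10 and three descriptors are queued, numbers 8 to 10 are still in the FIFO, so a message with sequence number 7 can be reported as sent.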

    METHOD AND APPARATUS FOR EFFICIENTLY TRACKING QUEUE ENTRIES RELATIVE TO A TIMESTAMP
    6.
    Patent application
    Legal status: lapsed

    Publication number: US20090006672A1

    Publication date: 2009-01-01

    Application number: US11768800

    Filing date: 2007-06-26

    IPC classification: G06F3/00 G06F1/04

    CPC classification: G06F12/0835 G06F12/0831

    Abstract: An apparatus and method for tracking coherence event signals transmitted in a multiprocessor system. The apparatus comprises a coherence logic unit, each unit having a plurality of queue structures with each queue structure associated with a respective sender of event signals transmitted in the system. A timing circuit associated with a queue structure controls enqueuing and dequeuing of received coherence event signals, and a counter tracks the number of coherence event signals remaining enqueued in the queue structure and dequeued since receipt of a timestamp signal. A counter mechanism generates an output signal indicating that all of the coherence event signals present in the queue structure at the time of receipt of the timestamp signal have been dequeued. In one embodiment, the timestamp signal is asserted at the start of a memory synchronization operation, and the output signal indicates that all coherence events present when the timestamp signal was asserted have completed. This signal can then be used as part of the completion condition for the memory synchronization operation.
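
    The counting idea can be sketched in software for a single queue. This is an illustrative model only, assuming strict first-in-first-out order so that the entries present at the timestamp are exactly the next ones to be dequeued; the class TimestampedQueue and its method names are hypothetical.

```python
from collections import deque


class TimestampedQueue:
    """Hypothetical model of one queue with a timestamp-relative counter."""

    def __init__(self):
        self.entries = deque()
        self.pending_before_timestamp = 0  # entries still owed since timestamp
        self.timestamp_seen = False

    def enqueue(self, event):
        self.entries.append(event)

    def timestamp(self):
        """Snapshot how many entries are present when the timestamp arrives."""
        self.pending_before_timestamp = len(self.entries)
        self.timestamp_seen = True

    def dequeue(self):
        event = self.entries.popleft()
        # FIFO order: the oldest entries are the ones that predate the timestamp.
        if self.pending_before_timestamp > 0:
            self.pending_before_timestamp -= 1
        return event

    def drained(self):
        """Output signal: everything present at the timestamp has been dequeued."""
        return self.timestamp_seen and self.pending_before_timestamp == 0
```

    Entries enqueued after the timestamp do not delay the drained signal, which is what lets a memory synchronization operation wait only for the events that were already pending when it started.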

    DMA SHARED BYTE COUNTERS IN A PARALLEL COMPUTER
    7.
    Patent application
    Legal status: lapsed

    Publication number: US20090006666A1

    Publication date: 2009-01-01

    Application number: US11768781

    Filing date: 2007-06-26

    IPC classification: G06F13/28

    CPC classification: G06F13/28 Y02D10/14

    Abstract: A parallel computer system is constructed as a network of interconnected compute nodes. Each of the compute nodes includes at least one processor, a memory and a DMA engine. The DMA engine includes a processor interface for interfacing with the at least one processor, DMA logic, a memory interface for interfacing with the memory, a DMA network interface for interfacing with the network, injection and reception byte counters, injection and reception FIFO metadata, and status registers and control registers. The injection FIFO metadata maintains the memory locations of the injection FIFOs, including each FIFO's current head and tail, and the reception FIFO metadata maintains the memory locations of the reception FIFOs, including each FIFO's current head and tail. The injection byte counters and reception byte counters may be shared between messages.
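
    One way to picture a byte counter shared between messages is sketched below. The abstract only states that counters may be shared; the accounting shown here (add each message's length when it is posted, subtract bytes as packets are transferred, treat zero as "all sharing messages complete") is an assumption made for illustration, and the class and method names are hypothetical.

```python
class SharedByteCounter:
    """Hypothetical model of one byte counter shared by several messages."""

    def __init__(self):
        self.remaining = 0  # bytes still to be transferred across all messages

    def add_message(self, nbytes):
        """Register a message of nbytes against the shared counter."""
        self.remaining += nbytes

    def on_packet(self, nbytes):
        """Account for nbytes of any registered message being transferred."""
        self.remaining -= nbytes

    def all_done(self):
        """True once every registered byte has been transferred."""
        return self.remaining == 0
```

    In this model the completion check is only meaningful after all messages have been registered, because the counter can pass through zero between registrations.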

    DMA ENGINE FOR REPEATING COMMUNICATION PATTERNS
    8.
    Patent application
    Legal status: lapsed

    Publication number: US20090006296A1

    Publication date: 2009-01-01

    Application number: US11768795

    Filing date: 2007-06-26

    IPC classification: G06F15/18

    CPC classification: G06F15/163

    Abstract: A parallel computer system is constructed as a network of interconnected compute nodes to operate a global message-passing application for performing communications across the network. Each of the compute nodes includes one or more individual processors with memories which run local instances of the global message-passing application operating at each compute node to carry out local processing operations independent of processing operations carried out at other compute nodes. Each compute node also includes a DMA engine constructed to interact with the application via Injection FIFO Metadata describing multiple Injection FIFOs, where each Injection FIFO may contain an arbitrary number of message descriptors, in order to process messages with a fixed processing overhead irrespective of the number of message descriptors included in the Injection FIFO.

    LOW LATENCY MEMORY ACCESS AND SYNCHRONIZATION
    10.
    Patent application
    Legal status: lapsed

    Publication number: US20070204112A1

    Publication date: 2007-08-30

    Application number: US11617276

    Filing date: 2006-12-28

    IPC classification: G06F12/14

    Abstract: A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device, rather than the processor, performs the subsequent write operation. A simple prefetching scheme for non-contiguous data structures is also disclosed. A memory line is redefined so that, in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers, rather than some other predictive algorithm, are used to determine which memory line to prefetch. This enables hardware to effectively prefetch memory access patterns that are non-contiguous but repetitive.
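
    The single-load lock acquisition can be modelled as a read that has a side effect inside the locking device. This is a sketch under assumptions: the class LockDevice, the return convention of load, and the release operation are hypothetical, since the abstract does not describe how a lock is given back.

```python
class LockDevice:
    """Hypothetical model of a lock device where acquisition is a single read."""

    def __init__(self, num_locks):
        self.owner = [None] * num_locks  # None means the lock is free

    def load(self, lock_id, cpu):
        """Model of the processor's one load; True means cpu now owns the lock."""
        if self.owner[lock_id] is None:
            # The device, not the processor, performs the follow-up write.
            self.owner[lock_id] = cpu
            return True
        return False  # lock is held; the processor must retry

    def release(self, lock_id, cpu):
        """Assumed release path: only the current owner can free the lock."""
        if self.owner[lock_id] == cpu:
            self.owner[lock_id] = None
```

    The processor never issues a separate store to take the lock, which is the contrast the abstract draws with a traditional atomic load followed by store.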
