Failover mechanisms in RDMA operations
    2.
    Patent application
    Status: lapsed

    Publication No.: US20060045005A1

    Publication date: 2006-03-02

    Application No.: US11017574

    Filing date: 2004-12-20

    IPC classification: H04J1/16

    Abstract: In remote direct memory access transfers in a multinode data processing system in which the nodes communicate with one another through communication adapters coupled to a switch or network, failures in the nodes or in the communication adapters can produce the phenomenon known as trickle traffic: stale data received from the switch or from the network that may nevertheless have all the signatures of valid packet data. The present invention addresses the trickle traffic problem in two situations: node failure and adapter failure. In the node failure situation, randomly generated keys are used to reestablish connections to the adapter while providing a mechanism for recognizing stale packets. In the adapter failure situation, a round robin context allocation approach is used, with adapter state contexts carrying state information that helps to identify stale packets. In another approach to the adapter failure situation, counts are assigned that give the node an adapter failure number which will not match the corresponding number in a context field in the adapter, thus enabling the identification of stale packets.

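The two stale-packet checks described in this abstract (a random key regenerated after node failure, and an adapter failure count compared against a context field) can be illustrated with a minimal sketch. The names `Packet`, `Context`, `stamp` and `accept` are hypothetical and do not come from the patent; the 64-bit key width is also an assumption.

```python
import secrets
from dataclasses import dataclass


@dataclass
class Packet:
    key: int            # connection key stamped by the sender (assumed field)
    failure_count: int  # adapter failure number known to the sender (assumed field)
    payload: bytes


class Context:
    """Hypothetical adapter-side context for one connection."""

    def __init__(self) -> None:
        self.key = secrets.randbits(64)
        self.failure_count = 0

    def reestablish_after_node_failure(self) -> None:
        # Draw a fresh random key; loop so the new key always differs from the old one.
        old = self.key
        while self.key == old:
            self.key = secrets.randbits(64)

    def record_adapter_failure(self) -> None:
        # Bump the failure number so packets stamped earlier no longer match.
        self.failure_count += 1

    def stamp(self, payload: bytes) -> Packet:
        # Tag an outgoing packet with the current key and failure number.
        return Packet(self.key, self.failure_count, payload)

    def accept(self, pkt: Packet) -> bool:
        # A packet is treated as stale if either field disagrees with the context.
        return pkt.key == self.key and pkt.failure_count == self.failure_count
```

In this sketch, a packet stamped before a failure event is rejected afterwards even though its payload and format are otherwise valid, which is the property the abstract attributes to both mechanisms.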

    RDMA server (OSI) global TCE tables
    5.
    Patent application
    Status: in force (granted)

    Publication No.: US20060047771A1

    Publication date: 2006-03-02

    Application No.: US11017456

    Filing date: 2004-12-20

    IPC classification: G06F15/16

    Abstract: In remote direct memory access (RDMA) transfers in a multinode data processing system in which the nodes communicate with one another through communication adapters coupled to a switch or network, there is a need for the system to ensure efficient memory protection mechanisms across jobs. A method is thus desired for addressing virtual memory on local and remote servers that is independent of the process ID on the local and/or remote node. This problem is solved by using global Translation Control Entry (TCE) tables that are accessed/owned by RDMA jobs and managed by a device driver, in conjunction with a Protocol Virtual Offset (PVO) address format.

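The abstract does not spell out the PVO layout, so the following is only a sketch under stated assumptions: a PVO is taken to be a byte offset into a single driver-managed table of page entries, and ownership is checked per job rather than per process ID. `GlobalTceTable`, `register`, `translate` and the 4096-byte page size are all hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size, not taken from the patent


class GlobalTceTable:
    """Hypothetical driver-managed table; slot i maps to one real page address."""

    def __init__(self) -> None:
        self._entries = []  # real page address per slot
        self._owner = []    # owning job id per slot

    def register(self, job_id, real_pages):
        """Map a job's pages into the table and return the PVO of the first byte."""
        base_slot = len(self._entries)
        self._entries.extend(real_pages)
        self._owner.extend([job_id] * len(real_pages))
        return base_slot * PAGE_SIZE

    def translate(self, job_id, pvo):
        """Resolve a PVO to a real address, rejecting slots the job does not own."""
        if pvo < 0:
            raise PermissionError("PVO not mapped for this job")
        slot, offset = divmod(pvo, PAGE_SIZE)
        if slot >= len(self._entries) or self._owner[slot] != job_id:
            raise PermissionError("PVO not mapped for this job")
        return self._entries[slot] + offset
```

Because the lookup is keyed by job and offset only, either node can name a buffer without knowing the process ID on the other side, which is the independence the abstract asks for.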

    Third party, broadcast, multicast and conditional RDMA operations
    6.
    Patent application
    Status: in force (granted)

    Publication No.: US20060045099A1

    Publication date: 2006-03-02

    Application No.: US11017355

    Filing date: 2004-12-20

    IPC classification: H04L12/56 H04L12/28

    CPC classification: H04L69/16 H04L69/166

    Abstract: In a multinode data processing system in which nodes exchange information over a network or through a switch, the mechanism which enables out-of-order data transfer via Remote Direct Memory Access (RDMA) also provides a corresponding ability to carry out broadcast operations, multicast operations, third party operations and conditional RDMA operations. In a broadcast operation, a source node transfers data packets in RDMA fashion to a plurality of destination nodes. Multicast operation works similarly, except that distribution is selective. In third party operations, a single central node in a cluster or network manages the transfer of data in RDMA fashion between other nodes or creates a mechanism for allowing a directed distribution of data between nodes. In conditional operation mode, the transfer of data is conditioned upon one or more events occurring in either the source node or the destination node.


    Early interrupt notification in RDMA and in DMA operations
    7.
    Patent application
    Status: under examination (published)

    Publication No.: US20060045109A1

    Publication date: 2006-03-02

    Application No.: US11017573

    Filing date: 2004-12-20

    IPC classification: H04L12/28

    CPC classification: H04L67/1097 H04L69/32

    Abstract: In a multinode data processing system in which data is transferred, via direct memory access (DMA) or remote direct memory access (RDMA), from a source node to at least one destination node through communication adapters coupling each node to a network or switch, a method is provided in which interrupt handling is overlapped with data transfer, so that interrupt processing overhead runs at the destination node in parallel with the movement of data, providing performance benefits. The method is also applicable to situations involving multiple interrupt levels corresponding to multithreaded handling capabilities.


    Speculative method and system for rapid data communications
    8.
    Patent application
    Status: in force (granted)

    Publication No.: US20050091390A1

    Publication date: 2005-04-28

    Application No.: US10692496

    Filing date: 2003-10-24

    IPC classification: G06F15/16 G06F15/17

    CPC classification: G06F15/17

    Abstract: A system and method utilize a dedicated transmission queue to enable expedited transmission of data messages to adaptive “nearest neighbor” nodes within a cluster. Packet descriptors are pre-fetched by the communications adapter hardware during the transmission of the preceding data element, and setup for the next transmission is performed in parallel with the transmission of the preceding data element. Data elements of a fixed length equal to the cache line size of the communication hardware can optionally be used to provide optimized transfer between computer memory and communications hardware. The data receiving processing can also be optimized to recognize and handle cache line size data elements.


    Communication resource reservation system for improved messaging performance
    9.
    Patent application
    Status: under examination (published)

    Publication No.: US20060034167A1

    Publication date: 2006-02-16

    Application No.: US10903322

    Filing date: 2004-07-30

    IPC classification: H04L12/26

    CPC classification: G06F15/17375

    Abstract: A system and method are provided for facilitating zero-copy communications between computing systems of a group of computing systems. The method includes allocating, in a first computing system of the group, a pool of privileged communication resources from a privileged resource controller to a communications controller. The communications controller designates privileged communication resources from the pool for use in handling individual zero-copy communications, thereby avoiding the need to obtain individual privileged resources from their owner at setup time for each zero-copy communication.

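The reservation idea in this abstract (request a pool once, then hand out resources locally for each transfer) can be sketched as follows. `PrivilegedController`, `CommunicationsController`, `grant`, `begin_transfer` and `end_transfer` are hypothetical names chosen for illustration, and integer ids stand in for whatever the privileged resources actually are.

```python
from collections import deque


class PrivilegedController:
    """Hypothetical owner of the privileged resources; counts how often it is asked."""

    def __init__(self) -> None:
        self.requests = 0
        self._next_id = 0

    def grant(self, n: int) -> list:
        # One (expensive) privileged request hands over n resources at once.
        self.requests += 1
        ids = list(range(self._next_id, self._next_id + n))
        self._next_id += n
        return ids


class CommunicationsController:
    """Holds a pre-allocated pool so per-transfer setup never calls the owner."""

    def __init__(self, owner: PrivilegedController, pool_size: int) -> None:
        self._pool = deque(owner.grant(pool_size))

    def begin_transfer(self) -> int:
        # Designate a resource from the local pool for one zero-copy transfer.
        if not self._pool:
            raise RuntimeError("resource pool exhausted")
        return self._pool.popleft()

    def end_transfer(self, resource: int) -> None:
        # Return the resource to the pool for reuse by a later transfer.
        self._pool.append(resource)
```

However many transfers run, the owner's `requests` counter stays at one in this sketch, which mirrors the setup-time saving the abstract describes.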

    Lazy deregistration of user virtual machine to adapter protocol virtual offsets

    Publication No.: US20060059242A1

    Publication date: 2006-03-16

    Application No.: US11017570

    Filing date: 2004-12-20

    IPC classification: G06F15/16

    CPC classification: G06F12/1081

    Abstract: A method is provided for operating a communications adapter employed in a multinode data processing system in a fashion which enhances the performance of remote direct memory access data transfers. The system is provided with pointers and a table which are used to determine whether an address supplied for the transfer has already been mapped to a real address at the source or destination node. The table is also preferably provided with counters which can be incremented or decremented to enable the use of least recently used mechanisms at the upper level protocol layers to more efficiently control the setting and resetting of table entries.
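A minimal sketch of the lazy deregistration idea, assuming the table is keyed by virtual address and each entry carries an in-use counter: releasing a buffer only decrements the counter, and the mapping is dropped later, least recently used first, when space is needed. `MappingCache`, `acquire`, `release` and the `map_fn` callback are hypothetical and not taken from the patent.

```python
from collections import OrderedDict
from typing import Callable


class MappingCache:
    """Hypothetical table of virtual-to-real mappings with in-use counters."""

    def __init__(self, capacity: int, map_fn: Callable[[int], int]) -> None:
        self.capacity = capacity
        self.map_fn = map_fn          # stands in for the costly registration step
        self._table = OrderedDict()   # vaddr -> [real_addr, in_use_count]

    def acquire(self, vaddr: int) -> int:
        entry = self._table.get(vaddr)
        if entry is None:
            # Not mapped yet: make room if needed, then register the address.
            self._evict_if_full()
            entry = [self.map_fn(vaddr), 0]
            self._table[vaddr] = entry
        entry[1] += 1
        self._table.move_to_end(vaddr)  # mark as most recently used
        return entry[0]

    def release(self, vaddr: int) -> None:
        # Lazy deregistration: only decrement; the mapping stays cached.
        entry = self._table[vaddr]
        if entry[1] == 0:
            raise ValueError("release without matching acquire")
        entry[1] -= 1

    def _evict_if_full(self) -> None:
        if len(self._table) < self.capacity:
            return
        # Pick the least recently used entry that no transfer is using.
        victim = next((v for v, e in self._table.items() if e[1] == 0), None)
        if victim is None:
            raise RuntimeError("all cached mappings are in use")
        del self._table[victim]
```

A repeated transfer from the same buffer hits the table and skips `map_fn` entirely, which is where the performance gain described in the abstract would come from in this sketch.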