Host Fabric Interface (HFI) to Perform Global Shared Memory (GSM) Operations
    61.
    Patent application
    Status: Lapsed

    Publication No.: US20090198918A1

    Publication date: 2009-08-06

    Application No.: US12024397

    Filing date: 2008-02-01

    IPC classification: G06F12/02

    CPC classification: G06F12/109, G06F9/544

    Abstract: A data processing system enables global shared memory (GSM) operations across multiple nodes with a distributed EA-to-RA mapping of physical memory. Each node has a host fabric interface (HFI), which includes HFI windows that are assigned to at most one locally-executing task of a parallel job. The tasks perform parallel job execution, but map only a portion of the effective addresses (EAs) of the global address space to the local, real memory of the task's respective node. The HFI window tags all outgoing GSM operations (of the local task) with the job ID, and embeds the target node and HFI window IDs of the node at which the EA is memory mapped. The HFI window also enables processing of received GSM operations with valid EAs that are homed to the local real memory of the receiving node, while preventing processing of other received operations without a valid EA-to-RA local mapping.
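
The window behavior described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names `GsmOp` and `HfiWindow`, the dictionary-based mappings, and the job-ID check on receipt are assumptions made for illustration. The abstract itself only states that outgoing operations are tagged with the job ID and the target node and window IDs, and that received operations without a valid local EA-to-RA mapping are not processed.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class GsmOp:
    """Hypothetical wire format of one GSM operation."""
    job_id: int
    target_node: int
    target_window: int
    ea: int
    payload: bytes


@dataclass
class HfiWindow:
    """Hypothetical model of one HFI window owned by a single local task."""
    node_id: int
    window_id: int
    job_id: int
    # EA -> RA, only for the part of the global address space homed here
    local_ea_to_ra: Dict[int, int] = field(default_factory=dict)
    # EA -> (home node, home window) for the whole global address space
    ea_home: Dict[int, Tuple[int, int]] = field(default_factory=dict)

    def issue(self, ea: int, payload: bytes) -> GsmOp:
        """Tag an outgoing operation with the job ID and the EA's home."""
        node, window = self.ea_home[ea]
        return GsmOp(self.job_id, node, window, ea, payload)

    def receive(self, op: GsmOp) -> Optional[int]:
        """Return the local RA if the operation may be processed, else None."""
        if op.job_id != self.job_id:  # assumed check, not stated in the abstract
            return None
        if (op.target_node, op.target_window) != (self.node_id, self.window_id):
            return None
        # No valid local EA-to-RA mapping means the operation is rejected.
        return self.local_ea_to_ra.get(op.ea)


homes = {0x1000: (0, 1), 0x2000: (1, 3)}
w0 = HfiWindow(0, 1, job_id=7, local_ea_to_ra={0x1000: 0x9000}, ea_home=homes)
w1 = HfiWindow(1, 3, job_id=7, local_ea_to_ra={0x2000: 0xA000}, ea_home=homes)
op = w0.issue(0x2000, b"data")
```

Here `w1.receive(op)` yields the real address on the home node, while `w0.receive(op)` yields `None` because the EA is not homed on node 0.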

    Issuing Global Shared Memory Operations Via Direct Cache Injection to a Host Fabric Interface
    62.
    Patent application
    Status: In force

    Publication No.: US20090198891A1

    Publication date: 2009-08-06

    Application No.: US12024437

    Filing date: 2008-02-01

    IPC classification: G06F12/00, G06F12/08

    Abstract: A data processing system enables global shared memory (GSM) operations across multiple nodes with a distributed EA-to-RA mapping of physical memory. Each node has a host fabric interface (HFI), which includes HFI windows that are assigned to at most one locally-executing task of a parallel job. The tasks perform parallel job execution, but map only a portion of the effective addresses (EAs) of the global address space to the local, real memory of the task's respective node. The HFI window tags all outgoing GSM operations (of the local task) with the job ID, and embeds the target node and HFI window IDs of the node at which the EA is memory mapped. The HFI window also enables processing of received GSM operations with valid EAs that are homed to the local real memory of the receiving node, while preventing processing of other received operations without a valid EA-to-RA local mapping.

    System and program product to recover from node failure/recovery incidents in distributed systems in which notification does not occur
    63.
    Granted patent
    Status: Lapsed

    Publication No.: US08116210B2

    Publication date: 2012-02-14

    Application No.: US12126371

    Filing date: 2008-05-23

    IPC classification: G01R31/08, G06F11/00

    CPC classification: B60R25/1003

    Abstract: Epoch numbers are maintained in a pair wise fashion at a plurality of communication endpoints to provide communication consistency and recovery from a range of failure conditions including total or partial node failure and subsequent recovery. Once an epoch state inconsistency is recognized, negotiation procedures provide an effective mechanism to reestablish valid communication links without the need to employ global variables which inherently possess greater transmission and overhead requirements needed to maintain communications. Renegotiation of recognizably valid epoch numbers occurs on a pair wise basis.
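
The pairwise epoch idea can be sketched in a few lines of Python. The abstract does not describe the actual negotiation protocol, so everything here is an assumption for illustration: the `Endpoint` class, the rule that a restart clears all epoch state, and the `renegotiate` helper that picks one more than the larger of the two remembered values.

```python
from typing import Dict, Optional, Tuple

Message = Tuple[str, int, str]  # (sender name, sender's epoch for this pair, data)


class Endpoint:
    """Hypothetical endpoint that keeps one epoch number per peer."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.epochs: Dict[str, int] = {}  # peer name -> epoch agreed with that peer

    def restart(self) -> None:
        """Simulate a node failure and recovery: all epoch state is lost."""
        self.epochs.clear()

    def send(self, peer: "Endpoint", data: str) -> Message:
        return (self.name, self.epochs.get(peer.name, 0), data)

    def receive(self, msg: Message) -> Optional[str]:
        """Accept the message only if its epoch matches the pair's local epoch."""
        sender, epoch, data = msg
        if epoch != self.epochs.get(sender, 0):
            return None  # epoch inconsistency: the pair must renegotiate
        return data


def renegotiate(a: Endpoint, b: Endpoint) -> int:
    """Assumed rule: both sides adopt max(known epochs) + 1 for this pair only."""
    new_epoch = max(a.epochs.get(b.name, 0), b.epochs.get(a.name, 0)) + 1
    a.epochs[b.name] = new_epoch
    b.epochs[a.name] = new_epoch
    return new_epoch


a, b, c = Endpoint("a"), Endpoint("b"), Endpoint("c")
renegotiate(a, b)
renegotiate(a, c)
b.restart()                                # b fails and recovers silently
stale = b.receive(a.send(b, "hello"))      # rejected: a still uses the old epoch
renegotiate(a, b)                          # only the (a, b) pair is touched
fresh = b.receive(a.send(b, "hello"))
```

The point of the sketch is that recovering the `(a, b)` link never reads or writes the epoch that `a` shares with `c`, so no global variable is needed.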

    Locally providing globally consistent information to communications layers
    64.
    Granted patent
    Status: Lapsed

    Publication No.: US08091092B2

    Publication date: 2012-01-03

    Application No.: US11848050

    Filing date: 2007-08-30

    IPC classification: G06F9/44

    Abstract: Globally consistent information is locally provided to communications layers. Globally consistent information is stored in a Network Availability Matrix, which is locally accessible by a communications layer. If an event is detected, the communications layer is automatically notified by the Network Availability Matrix, and is able to use the information in the Network Availability Matrix to quickly take action.
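
A rough Python sketch of the notification pattern in this abstract follows. The abstract does not say what the Network Availability Matrix stores or how notification is delivered, so the per-(node, adapter) up/down flags, the callback-based `subscribe` method, and the `usable_adapters` query are all assumptions chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

Listener = Callable[[int, int, bool], None]


class NetworkAvailabilityMatrix:
    """Hypothetical local copy of availability state, keyed by (node, adapter)."""

    def __init__(self) -> None:
        self._up: Dict[Tuple[int, int], bool] = {}
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        """Register a communications layer to be notified of changes."""
        self._listeners.append(listener)

    def update(self, node: int, adapter: int, up: bool) -> None:
        """Record an event and notify subscribers only if the state changed."""
        key = (node, adapter)
        if self._up.get(key) == up:
            return
        self._up[key] = up
        for listener in self._listeners:
            listener(node, adapter, up)

    def usable_adapters(self, node: int) -> List[int]:
        """Local lookup a communications layer can use to pick a route."""
        return sorted(a for (n, a), up in self._up.items() if n == node and up)


nam = NetworkAvailabilityMatrix()
events: List[Tuple[int, int, bool]] = []
nam.subscribe(lambda node, adapter, up: events.append((node, adapter, up)))
nam.update(1, 0, True)
nam.update(1, 1, True)
nam.update(1, 0, True)    # no state change, so no notification
nam.update(1, 0, False)   # adapter 0 on node 1 goes down
```

After the last update, a subscriber can call `nam.usable_adapters(1)` and switch traffic to the remaining adapter without querying any remote service.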

    Multi-threaded polling in a processing environment
    65.
    Patent application
    Status: Under examination (published)

    Publication No.: US20070150904A1

    Publication date: 2007-06-28

    Application No.: US11273733

    Filing date: 2005-11-15

    IPC classification: G06F9/46

    CPC classification: G06F9/4843

    Abstract: Processing within a multi-threaded processing environment is facilitated. A plurality of threads are employed to perform polling on a plurality of entities. The polling enables the concurrent driving of progress on the plurality of entities, as well as the detection of occurrence of a specified event across the plurality of entities and the termination of continued polling at the occurrence of this event.
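
One way to picture this polling scheme is the following Python sketch, which uses the standard `threading` module. The `Entity` class, the static assignment of entities to threads, and the choice of "total progress reaches a target" as the specified event are assumptions for illustration; the abstract does not define any of them.

```python
import threading
from typing import List


class Entity:
    """Hypothetical pollable entity; each poll() makes one unit of progress."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.progress = 0

    def poll(self) -> None:
        self.progress += 1  # written only by the thread that owns this entity


def poll_until(entities: List[Entity], target: int, num_threads: int = 2) -> int:
    """Poll all entities from several threads until total progress >= target."""
    stop = threading.Event()

    def worker(mine: List[Entity]) -> None:
        while not stop.is_set():
            for entity in mine:
                entity.poll()
                # The specified event is evaluated across all entities.
                if sum(e.progress for e in entities) >= target:
                    stop.set()  # tells every polling thread to terminate
                    return

    threads = []
    for i in range(num_threads):
        share = entities[i::num_threads]  # each entity has exactly one owner
        if share:
            threads.append(threading.Thread(target=worker, args=(share,)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(e.progress for e in entities)


entities = [Entity(f"e{i}") for i in range(4)]
total = poll_until(entities, target=50, num_threads=3)
```

The shared `threading.Event` is what ends continued polling: once any thread observes the event, the others leave their loops at the next check. The exact final total varies from run to run because threads may complete a few extra polls before they see the stop flag.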

    Method to recover from node failure/recovery incidents in distributed systems in which notification does not occur
    66.
    Granted patent
    Status: Lapsed

    Publication No.: US07379444B2

    Publication date: 2008-05-27

    Application No.: US10351823

    Filing date: 2003-01-27

    IPC classification: H04L1/12, H04L12/26

    Abstract: Epoch numbers are maintained in a pair wise fashion at a plurality of communication endpoints to provide communication consistency and recovery from a range of failure conditions including total or partial node failure and subsequent recovery. Once an epoch state inconsistency is recognized, negotiation procedures provide an effective mechanism to reestablish valid communication links without the need to employ global variables which inherently possess greater transmission and overhead requirements needed to maintain communications. Renegotiation of recognizably valid epoch numbers occurs on a pair wise basis.

    LOCALLY PROVIDING GLOBALLY CONSISTENT INFORMATION TO COMMUNICATIONS LAYERS
    67.
    Patent application
    Status: In force

    Publication No.: US20080086737A1

    Publication date: 2008-04-10

    Application No.: US11942993

    Filing date: 2007-11-20

    IPC classification: G06F13/10

    Abstract: Globally consistent information is locally provided to communications layers. Globally consistent information is stored in a Network Availability Matrix, which is locally accessible by a communications layer. If an event is detected, the communications layer is automatically notified by the Network Availability Matrix, and is able to use the information in the Network Availability Matrix to quickly take action.

    Locally providing globally consistent information to communications layers
    68.
    Granted patent
    Status: In force

    Publication No.: US07996851B2

    Publication date: 2011-08-09

    Application No.: US11942993

    Filing date: 2007-11-20

    IPC classification: G06F9/44

    Abstract: Globally consistent information is locally provided to communications layers. Globally consistent information is stored in a Network Availability Matrix, which is locally accessible by a communications layer. If an event is detected, the communications layer is automatically notified by the Network Availability Matrix, and is able to use the information in the Network Availability Matrix to quickly take action.

    Method and apparatus for striping message payload data over a network
    69.
    Granted patent
    Status: Lapsed

    Publication No.: US07835359B2

    Publication date: 2010-11-16

    Application No.: US11298322

    Filing date: 2005-12-08

    IPC classification: H04L12/28, H04L12/54

    Abstract: A method, an apparatus and a recording medium are provided for communicating message payload data, especially noncontiguous message data, from a first node of a network to a second node of the network in response to a request to transmit a message. Such method includes dividing the length of a data payload to be transmitted into a plurality of submessage payload lengths, i.e., into at least a first submessage payload length and a second submessage payload length. Then, a first ordered submessage is transmitted from the first node for delivery to the second node, the first ordered submessage having the first submessage payload length. A first state of an environment is then determined in the first node as if the step of transmitting the first ordered submessage were already completed. Without having to complete the step of transmitting the first ordered submessage, a second ordered submessage is then transmitted from the first node for delivery to the second node, the second submessage having the second submessage payload length, the second submessage being transmitted in a way that takes into account the first state of the environment in the first node.
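
The striping step can be illustrated with a small Python sketch. It interprets the "state of an environment" as a cursor (segment index, offset) into a noncontiguous payload, which is an assumption and not something the abstract states. The helper names `split_lengths`, `advance`, `plan_submessages` and `pack` are likewise hypothetical. The idea shown is that the cursor for the second submessage is computed from lengths alone, as if the first submessage had already been sent, so the submessages can then be packed and transmitted independently.

```python
from typing import List, Tuple

Cursor = Tuple[int, int]  # (segment index, byte offset inside that segment)


def split_lengths(total: int, parts: int) -> List[int]:
    """Divide a payload length into near-equal submessage payload lengths."""
    base, extra = divmod(total, parts)
    return [base + (1 if i < extra else 0) for i in range(parts)]


def advance(segments: List[bytes], cursor: Cursor, nbytes: int) -> Cursor:
    """Compute the cursor after nbytes are consumed, without copying any data."""
    seg, off = cursor
    while nbytes > 0:
        step = min(len(segments[seg]) - off, nbytes)
        off += step
        nbytes -= step
        if off == len(segments[seg]):
            seg, off = seg + 1, 0
    return (seg, off)


def plan_submessages(segments: List[bytes], parts: int) -> List[Tuple[Cursor, int]]:
    """Return (start cursor, length) for each ordered submessage."""
    cursor: Cursor = (0, 0)
    plan = []
    for length in split_lengths(sum(len(s) for s in segments), parts):
        plan.append((cursor, length))
        cursor = advance(segments, cursor, length)  # state "as if" already sent
    return plan


def pack(segments: List[bytes], cursor: Cursor, nbytes: int) -> bytes:
    """Gather one submessage payload starting at its precomputed cursor."""
    seg, off = cursor
    out = bytearray()
    while nbytes > 0:
        chunk = segments[seg][off:off + nbytes]
        out += chunk
        nbytes -= len(chunk)
        seg, off = seg + 1, 0
    return bytes(out)


segments = [b"abc", b"de", b"fghij"]  # a noncontiguous 10-byte payload
plan = plan_submessages(segments, 3)
# Pack in reverse order to show that no submessage waits for an earlier one.
payloads = {i: pack(segments, c, n) for i, (c, n) in reversed(list(enumerate(plan)))}
```

Because each entry in `plan` carries its own start cursor, the later submessages can be handed to other network paths while the first is still in flight, and the receiver reassembles them by order index.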

    SYSTEM AND PROGRAM PRODUCT TO RECOVER FROM NODE FAILURE/RECOVERY INCIDENTS IN DISTRIBUTED SYSTEMS IN WHICH NOTIFICATION DOES NOT OCCUR
    70.
    Patent application
    Status: Lapsed

    Publication No.: US20080225702A1

    Publication date: 2008-09-18

    Application No.: US12126371

    Filing date: 2008-05-23

    IPC classification: G06F11/00

    CPC classification: B60R25/1003

    Abstract: Epoch numbers are maintained in a pair wise fashion at a plurality of communication endpoints to provide communication consistency and recovery from a range of failure conditions including total or partial node failure and subsequent recovery. Once an epoch state inconsistency is recognized, negotiation procedures provide an effective mechanism to reestablish valid communication links without the need to employ global variables which inherently possess greater transmission and overhead requirements needed to maintain communications. Renegotiation of recognizably valid epoch numbers occurs on a pair wise basis.
