Pacing network traffic among a plurality of compute nodes connected using a data communications network
    71.
    Granted patent
    Status: In force

    Publication No.: US08140704B2

    Publication date: 2012-03-20

    Application No.: US12166748

    Filing date: 2008-07-02

    IPC classification: G06F15/16 H04L1/00

    CPC classification: H04L47/10 H04L47/283

    Abstract: Methods, apparatus, and products are disclosed for pacing network traffic among a plurality of compute nodes connected using a data communications network. The network has a plurality of network regions, and the compute nodes are distributed among these regions. Pacing network traffic includes: identifying, by a compute node for each region of the network, a round-trip time delay for communicating with at least one of the compute nodes in that region; determining, by the compute node for each region, a pacing algorithm for that region in dependence upon the round-trip time delay for that region; and transmitting, by the compute node, network packets to at least one of the compute nodes in at least one of the network regions in dependence upon the pacing algorithm for that region.
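
The per-region pacing described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: the `Pacing` record, the `choose_pacing` policy (keep about one bandwidth-delay product in flight), and the default link rate are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Pacing:
    """Hypothetical pacing parameters for one network region."""
    window_packets: int   # packets allowed in flight at once
    gap_seconds: float    # delay inserted between consecutive sends


def choose_pacing(rtt_seconds: float,
                  link_packets_per_second: float = 1000.0) -> Pacing:
    """Derive pacing parameters from a region's measured round-trip time.

    Assumed policy: size the window to roughly one bandwidth-delay
    product and spread the sends evenly over one round trip.
    """
    window = max(1, round(rtt_seconds * link_packets_per_second))
    return Pacing(window_packets=window, gap_seconds=rtt_seconds / window)


def pacing_table(region_rtts: Dict[str, float]) -> Dict[str, Pacing]:
    """Build one pacing entry per region from per-region round-trip times."""
    return {region: choose_pacing(rtt) for region, rtt in region_rtts.items()}
```

A sender would look up the destination's region in the table before transmitting, so that distant regions get a larger window than nearby ones.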

    Opportunistic queueing injection strategy for network load balancing
    73.
    Granted patent
    Status: In force

    Publication No.: US07944842B2

    Publication date: 2011-05-17

    Application No.: US11738034

    Filing date: 2007-04-20

    IPC classification: H04L12/26

    Abstract: Embodiments of the invention include a method, system, and article of manufacture that provide an opportunistic queuing injection strategy for data communication between nodes of a parallel computer system. A message may be encapsulated into a set of data packets. When the packets are sent, an opportunistic injection queue may be configured to transmit them to multiple hardware injection ports. This approach allows for complete network link saturation. In a parallel system with network links in multiple dimensions, sending message packets using more than one dimension may substantially increase network throughput.
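
As a rough illustration of the idea in this abstract, the sketch below drains one shared queue into whichever injection FIFO currently has room. The function name `inject`, the fixed FIFO capacity, and the round-robin scan are assumptions made for the example, not details taken from the patent.

```python
from collections import deque
from typing import Iterable, List, Tuple


def inject(packets: Iterable[int], fifo_capacity: int,
           num_fifos: int) -> Tuple[List[List[int]], List[int]]:
    """Spread packets from a shared queue over several injection FIFOs.

    Each pass offers one packet to every FIFO that still has space, so no
    port sits idle while packets are waiting. Returns the filled FIFOs and
    any packets left over once every FIFO is full.
    """
    pending = deque(packets)
    fifos: List[List[int]] = [[] for _ in range(num_fifos)]
    while pending:
        progressed = False
        for fifo in fifos:
            if pending and len(fifo) < fifo_capacity:
                fifo.append(pending.popleft())
                progressed = True
        if not progressed:
            break  # every FIFO is full; real hardware would drain them first
    return fifos, list(pending)
```

With three FIFOs of capacity two, seven packets fill all three ports and leave one packet waiting.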

    Locating hardware faults in a data communications network of a parallel computer
    74.
    Granted patent
    Status: Lapsed

    Publication No.: US07646721B2

    Publication date: 2010-01-12

    Application No.: US11279586

    Filing date: 2006-04-13

    IPC classification: H04L12/26

    CPC classification: H04L12/66

    Abstract: Locating hardware faults in a data communications network of a parallel computer. Such a parallel computer includes a plurality of compute nodes and a data communications network that couples the compute nodes for data communications and organizes the compute nodes as a tree. Locating hardware faults includes: identifying a next compute node as a parent node and a root of a parent test tree; identifying, for each child compute node of the parent node, a child test tree having the child compute node as root; running the same test suite on the parent test tree and each child test tree; and identifying the parent compute node as having a defective link connected from the parent compute node to a child compute node if the test suite fails on the parent test tree and succeeds on all the child test trees.
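
The test-tree comparison in this abstract can be sketched as below. The tree representation, the `suite_passes` callback, and the choice to descend into failing child trees when picking the "next" parent are assumptions for illustration; the abstract states only the decision rule itself.

```python
from typing import Callable, Dict, List, Optional

Tree = Dict[str, List[str]]  # maps each node to its child nodes


def locate_defective_parent(
    tree: Tree, root: str, suite_passes: Callable[[str], bool]
) -> Optional[str]:
    """Return a node whose link to one of its children looks defective.

    suite_passes(r) is a hypothetical callback that runs the test suite on
    the test tree rooted at r and reports whether it succeeded.
    """
    candidates = [root]
    while candidates:
        parent = candidates.pop()
        children = tree.get(parent, [])
        if not children or suite_passes(parent):
            continue  # leaf, or nothing wrong anywhere below this node
        failing = [child for child in children if not suite_passes(child)]
        if not failing:
            # Parent tree fails, every child tree succeeds: the fault must
            # be on a link between the parent and one of its children.
            return parent
        # Assumed search order: descend into the child trees that failed.
        candidates.extend(failing)
    return None
```

The function returns `None` when the suite passes on the root's tree, meaning no faulty link was found.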

    Profiling An Application For Power Consumption During Execution On A Compute Node
    75.
    Patent application
    Status: In force

    Publication No.: US20100005326A1

    Publication date: 2010-01-07

    Application No.: US12167302

    Filing date: 2008-07-03

    IPC classification: G06F1/32

    Abstract: Methods, apparatus, and products are disclosed for profiling an application for power consumption during execution on a compute node, including: receiving an application for execution on a compute node; identifying a hardware power consumption profile for the compute node, the hardware power consumption profile specifying power consumption for compute node hardware during performance of various processing operations; determining a power consumption profile for the application in dependence upon the application and the hardware power consumption profile for the compute node; and reporting the power consumption profile for the application.
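
One simple way to combine an application with a hardware power profile is sketched below. It assumes the application has already been reduced to operation counts and that the hardware profile gives an energy cost per operation; both representations, and the name `profile_application`, are hypothetical.

```python
from typing import Dict


def profile_application(op_counts: Dict[str, int],
                        hardware_profile: Dict[str, float]) -> Dict[str, object]:
    """Estimate an application's energy use from a hardware profile.

    op_counts maps an operation type to how often the application performs
    it; hardware_profile maps the same operation types to an assumed energy
    cost per operation (for example, in joules).
    """
    per_operation = {
        op: count * hardware_profile[op] for op, count in op_counts.items()
    }
    return {"per_operation": per_operation,
            "total": sum(per_operation.values())}
```

Reporting then amounts to printing or storing the returned dictionary for the application.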

    Effecting a Broadcast with an Allreduce Operation on a Parallel Computer
    76.
    Patent application
    Status: Lapsed

    Publication No.: US20090037511A1

    Publication date: 2009-02-05

    Application No.: US11832918

    Filing date: 2007-08-02

    IPC classification: G06F15/16

    CPC classification: G06F9/542 G06F2209/543

    Abstract: Methods, parallel computers, and computer program products are disclosed for effecting a broadcast with an allreduce operation on a parallel computer, the parallel computer comprising a plurality of compute nodes, the compute nodes organized into at least one operational group of compute nodes for collective parallel operations of the parallel computer, each compute node in the operational group assigned a unique rank, the compute nodes of the operational group coupled for data communications through a global combining network, and one compute node assigned to be a logical root. Embodiments include configuring, by the logical root node, a send buffer having a contribution to be broadcast to each ranked node in the operational group; configuring, by all ranked nodes other than the logical root, a receive buffer for receiving the contribution from the logical root; and repeatedly, for each element of the contribution of the logical root in the send buffer: contributing, by the logical root, the element of the contribution in the send buffer; injecting, by all ranked nodes other than the logical root, one or more zeros corresponding to a size of the element; performing, by all the compute nodes of the operational group, an allreduce operation with a bitwise OR using the element and the injected zeros, yielding a result for the allreduce operation; and storing in each receive buffer, by all ranked nodes other than the logical root, the result of the allreduce.
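
The scheme works because x OR 0 = x: if every non-root rank contributes zeros, the bitwise-OR allreduce returns the root's element to every rank. The sketch below simulates this in a single process; the function names are hypothetical, elements are assumed to be non-negative integers, and the list-based `allreduce_or` stands in for the global combining network.

```python
from functools import reduce
from operator import or_
from typing import List


def allreduce_or(contributions: List[int]) -> List[int]:
    """Simulate an allreduce with bitwise OR: every rank gets the same result."""
    result = reduce(or_, contributions, 0)
    return [result] * len(contributions)


def broadcast_via_allreduce(send_buffer: List[int], num_ranks: int,
                            root: int) -> List[List[int]]:
    """Broadcast the root's send buffer using only OR-allreduce operations.

    Returns one receive buffer per rank; the root's own buffer stays empty
    because only the non-root ranks store the result.
    """
    receive_buffers: List[List[int]] = [[] for _ in range(num_ranks)]
    for element in send_buffer:
        # The root contributes the element; every other rank injects zero.
        contributions = [element if rank == root else 0
                         for rank in range(num_ranks)]
        results = allreduce_or(contributions)
        for rank in range(num_ranks):
            if rank != root:
                receive_buffers[rank].append(results[rank])
    return receive_buffers
```

For example, `broadcast_via_allreduce([5, 0, 255], 4, 2)` leaves `[5, 0, 255]` in the receive buffer of ranks 0, 1, and 3.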

    Controlling Data Transfers from an Origin Compute Node to a Target Compute Node
    77.
    Patent application
    Status: Lapsed

    Publication No.: US20080301704A1

    Publication date: 2008-12-04

    Application No.: US11754765

    Filing date: 2007-05-29

    IPC classification: G06F13/14

    CPC classification: G06F13/387

    Abstract: Methods, apparatus, and products are disclosed for controlling data transfers from an origin compute node to a target compute node, including: receiving, by an application messaging module on the target compute node, an indication of a data transfer from an origin compute node to the target compute node; and administering, by the application messaging module on the target compute node, the data transfer using one or more messaging primitives of a system messaging module in dependence upon the indication.

    LATENCY HIDING MESSAGE PASSING PROTOCOL
    78.
    Patent application
    Status: Lapsed

    Publication No.: US20080222303A1

    Publication date: 2008-09-11

    Application No.: US11682057

    Filing date: 2007-03-05

    IPC classification: G06F15/16

    CPC classification: G06F9/546

    Abstract: A method, system, and article of manufacture that provide latency-hiding, high-bandwidth message passing protocols for data communication between nodes of a parallel computer system are disclosed. A source node transmits a request-to-send message to a receiving node. Prior to receiving a clear-to-send message, the sending node continues to send deterministically routed (or fully described) data packets to the receiving node, thereby hiding the latency inherent in the request-to-send/clear-to-send message exchange. Once the sending node receives the clear-to-send message, any remaining portion of the message may be sent using partially described packets, which may be routed dynamically, thereby maximizing bandwidth.
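
The switch between routing modes can be sketched as a small simulation. It assumes the arrival of the clear-to-send message can be modelled as a packet count (`cts_arrives_after`); that parameter, the function name, and the mode labels are hypothetical.

```python
from typing import List, Tuple


def send_message(packets: List[str],
                 cts_arrives_after: int) -> List[Tuple[str, str]]:
    """Simulate which routing mode each packet of a message is sent with.

    Packets sent before the clear-to-send arrives are fully described and
    routed deterministically; packets sent afterwards are only partially
    described and may be routed dynamically.
    """
    log = []
    for index, packet in enumerate(packets):
        mode = "deterministic" if index < cts_arrives_after else "dynamic"
        log.append((packet, mode))
    return log
```

If the clear-to-send never arrives before the message ends, every packet is simply sent deterministically.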

    Distributed hardware device simulation
    79.
    Granted patent
    Status: In force

    Publication No.: US09317637B2

    Publication date: 2016-04-19

    Application No.: US13006696

    Filing date: 2011-01-14

    Abstract: Distributed hardware device simulation, including: identifying a plurality of hardware components of the hardware device; providing software components simulating the functionality of each hardware component, wherein the software components are installed on compute nodes of a distributed processing system; receiving, in at least one of the software components, one or more messages representing an input to the hardware component; simulating the operation of the hardware component with the software component, thereby generating an output of the software component representing the output of the hardware component; and sending, from the software component to at least one other software component, one or more messages representing the output of the hardware component.
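
The message flow in this abstract can be sketched in a single process as below. The `Component` class, the `run` loop, and the convention that messages addressed to an unknown name count as device outputs are all assumptions; a real system would place the components on separate compute nodes and use a messaging layer.

```python
from collections import deque
from typing import Any, Callable, Dict, Iterable, List, Tuple

Message = Tuple[str, Any]  # (destination component name, value)


class Component:
    """Software stand-in for one hardware component."""

    def __init__(self, func: Callable[[Any], Any], downstream: List[str]):
        self.func = func              # simulates the hardware's behaviour
        self.downstream = downstream  # components that receive the output

    def receive(self, value: Any) -> List[Message]:
        """Simulate the component on one input and emit output messages."""
        output = self.func(value)
        return [(name, output) for name in self.downstream]


def run(components: Dict[str, Component],
        initial: Iterable[Message]) -> Dict[str, Any]:
    """Deliver messages until none remain; return the device's outputs."""
    queue = deque(initial)
    outputs: Dict[str, Any] = {}
    while queue:
        target, value = queue.popleft()
        component = components.get(target)
        if component is None:
            outputs[target] = value  # assumed: unknown target = device output
        else:
            queue.extend(component.receive(value))
    return outputs
```

Wiring an AND gate into an inverter this way simulates a NAND gate as two cooperating components.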

    Endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface of a parallel computer
    80.
    Granted patent
    Status: In force

    Publication No.: US08892850B2

    Publication date: 2014-11-18

    Application No.: US13007848

    Filing date: 2011-01-17

    IPC classification: G06F9/46 G06F9/54

    CPC classification: G06F9/54

    Abstract: Methods, apparatuses, and computer program products for endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface ('PAMI') of a parallel computer are provided. Embodiments include establishing, by a parallel application, a data communications geometry, the geometry specifying a set of endpoints that are used in collective operations of the PAMI, including associating with the geometry a list of collective algorithms valid for use with the endpoints of the geometry. Embodiments also include registering in each endpoint in the geometry a dispatch callback function for a collective operation and executing without blocking, through a single one of the endpoints in the geometry, an instruction for the collective operation.
