Collective operation protocol selection in a parallel computer
    91.
    Type: Granted patent
    Status: Active

    Publication No.: US08893083B2

    Publication date: 2014-11-18

    Application No.: US13206116

    Filing date: 2011-08-09

    Abstract: Collective operation protocol selection in a parallel computer that includes compute nodes may be carried out by calling a collective operation with operating parameters; selecting a protocol for executing the operation and executing the operation with the selected protocol. Selecting a protocol includes: iteratively, until a prospective protocol meets predetermined performance criteria: providing, to a protocol performance function for the prospective protocol, the operating parameters; determining whether the prospective protocol meets predefined performance criteria by evaluating a predefined performance fit equation, calculating a measure of performance of the protocol for the operating parameters; determining that the prospective protocol meets predetermined performance criteria and selecting the protocol for executing the operation only if the calculated measure of performance is greater than a predefined minimum performance threshold.

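    The selection loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function `select_protocol`, the parameter name `message_bytes`, and the two linear "fit equations" are hypothetical stand-ins for the protocol performance functions the abstract mentions.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

# Hypothetical operating parameters for one collective call.
Params = Dict[str, float]
# A protocol is a (name, performance function) pair; the function stands in
# for the "performance fit equation" and returns a performance measure.
Protocol = Tuple[str, Callable[[Params], float]]


def select_protocol(protocols: Sequence[Protocol], params: Params,
                    min_performance: float) -> Optional[str]:
    """Return the first protocol whose measure exceeds the threshold."""
    for name, performance_fn in protocols:
        measure = performance_fn(params)
        # Select only if the measure is strictly greater than the minimum.
        if measure > min_performance:
            return name
    return None  # no prospective protocol met the criteria


# Made-up fit equations: estimated performance as a function of message size.
protocols = [
    ("tree", lambda p: 100.0 - 0.01 * p["message_bytes"]),
    ("ring", lambda p: 20.0 + 0.005 * p["message_bytes"]),
]

print(select_protocol(protocols, {"message_bytes": 8000.0}, 50.0))
```

    Candidates are tried in list order and the loop stops at the first one that clears the threshold, so the ordering of `protocols` acts as a preference order in this sketch.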

    Scheduling synchronization in association with collective operations in a parallel computer
    92.
    Type: Granted patent
    Status: Active

    Publication No.: US08869168B2

    Publication date: 2014-10-21

    Application No.: US13470932

    Filing date: 2012-05-14

    IPC classification: G06F9/44 G06F15/167

    CPC classification: G06F15/17325

    Abstract: Methods, apparatuses, and computer program products for scheduling synchronization in association with collective operations in a parallel computer that includes a shared memory and a plurality of compute nodes that execute a parallel application utilizing the shared memory are provided. Embodiments include acquiring an available channel of the shared memory; posting to the acquired channel of the shared memory one or more collective operations and a synchronization point; determining that processing within the acquired channel has reached the synchronization point; and posting to the acquired channel, in response to determining that processing within the acquired channel has reached the synchronization point, a background synchronization operation corresponding to the one or more collective operations.

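    The ordering described in this abstract (collectives first, then a synchronization point, then a background synchronization posted only once the point is reached) can be modeled with a simple queue. The `Channel` class, the `schedule` helper, and the entry kinds below are hypothetical names chosen for illustration; real shared-memory channels are not modeled.

```python
from collections import deque
from typing import Deque, List, Tuple

Entry = Tuple[str, Tuple[str, ...]]  # (kind, names of the collectives involved)


class Channel:
    """Toy stand-in for one acquired channel of the shared memory."""

    def __init__(self) -> None:
        self.pending: Deque[Entry] = deque()
        self.processed: List[str] = []

    def post(self, kind: str, ops: Tuple[str, ...]) -> None:
        self.pending.append((kind, ops))

    def process(self) -> None:
        """Drain the channel in posting order."""
        while self.pending:
            kind, ops = self.pending.popleft()
            self.processed.append(kind)
            if kind == "sync_point":
                # Only now is the background synchronization posted.
                self.post("background_sync", ops)


def schedule(channel: Channel, collectives: List[str]) -> None:
    """Post the collectives followed by a single synchronization point."""
    for name in collectives:
        channel.post("collective", (name,))
    channel.post("sync_point", tuple(collectives))
```

    After `schedule(ch, ["allreduce", "broadcast"])` and `ch.process()`, the `processed` list shows the background synchronization running after the synchronization point, which is the ordering the abstract describes.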

    Performing a vector collective operation on a parallel computer having a plurality of compute nodes
    93.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US08549259B2

    Publication date: 2013-10-01

    Application No.: US12882295

    Filing date: 2010-09-15

    IPC classification: G06F15/76

    CPC classification: G06F15/8092

    Abstract: Systems, methods and articles of manufacture are disclosed for performing a vector collective operation on a parallel computing system that includes multiple compute nodes and a network connecting the compute nodes that includes an ALU. A collective operation may be performed to determine displacements for the vector collective operation. Descriptors for the vector collective operation may be generated based on the displacements. The vector collective operation may then be performed using the descriptors.

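    The three steps in this abstract (compute displacements, build descriptors, run the vector operation) resemble a gather with per-node element counts. The sketch below assumes the displacements are an exclusive prefix sum of those counts and computes them locally; in the patent they come from a collective operation on the network. `Descriptor`, `build_descriptors`, and `gatherv` are hypothetical names.

```python
from itertools import accumulate
from typing import List, NamedTuple


class Descriptor(NamedTuple):
    node: int    # contributing compute node (hypothetical field)
    offset: int  # displacement into the receive buffer
    length: int  # number of elements contributed


def displacements(counts: List[int]) -> List[int]:
    """Exclusive prefix sum of the per-node element counts."""
    return list(accumulate(counts, initial=0))[:-1]


def build_descriptors(counts: List[int]) -> List[Descriptor]:
    offsets = displacements(counts)
    return [Descriptor(node, offsets[node], counts[node])
            for node in range(len(counts))]


def gatherv(chunks: List[List[int]]) -> List[int]:
    """Place each node's chunk at its displacement in one output buffer."""
    counts = [len(chunk) for chunk in chunks]
    out = [0] * sum(counts)
    for d in build_descriptors(counts):
        out[d.offset:d.offset + d.length] = chunks[d.node]
    return out
```

    For per-node counts `[2, 0, 3]` the displacements are `[0, 2, 2]`, so a node that contributes nothing still gets a valid, zero-length descriptor.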

    Internode data communications in a parallel computer
    95.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US08528004B2

    Publication date: 2013-09-03

    Application No.: US13290642

    Filing date: 2011-11-07

    CPC classification: G06F9/544

    Abstract: Internode data communications in a parallel computer that includes compute nodes that each include main memory and a messaging unit, the messaging unit including computer memory and coupling compute nodes for data communications, in which, for each compute node at compute node boot time: a messaging unit allocates, in the messaging unit's computer memory, a predefined number of message buffers, each message buffer associated with a process to be initialized on the compute node; receives, prior to initialization of a particular process on the compute node, a data communications message intended for the particular process; and stores the data communications message in the message buffer associated with the particular process. Upon initialization of the particular process, the process establishes a messaging buffer in main memory of the compute node and copies the data communications message from the message buffer of the messaging unit into the message buffer of main memory.

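    The buffering scheme in this abstract can be modeled in a few lines. This is a toy sketch under stated assumptions: `MessagingUnit`, `Process`, and `drain` are hypothetical names, buffers are unbounded Python deques, and the copy into main memory also empties the messaging unit's buffer, which the abstract does not specify.

```python
from collections import deque
from typing import Deque, Dict, List


class MessagingUnit:
    """Toy model of a messaging unit with per-process message buffers."""

    def __init__(self, process_ids: List[int]) -> None:
        # At "boot time", allocate one buffer per process that will later
        # be initialized on this node.
        self.buffers: Dict[int, Deque[bytes]] = {
            pid: deque() for pid in process_ids
        }

    def receive(self, pid: int, message: bytes) -> None:
        # Messages may arrive before the target process exists.
        self.buffers[pid].append(message)

    def drain(self, pid: int) -> List[bytes]:
        # Copy out the stored messages and empty the buffer (assumption).
        messages = list(self.buffers[pid])
        self.buffers[pid].clear()
        return messages


class Process:
    def __init__(self, pid: int, mu: MessagingUnit) -> None:
        self.pid = pid
        # On initialization, establish a buffer in main memory and copy
        # over anything that arrived early.
        self.main_memory_buffer: List[bytes] = mu.drain(pid)
```

    A message sent to process 1 before `Process(1, mu)` is constructed ends up in that process's `main_memory_buffer` once it initializes.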

    Managing Internode Data Communications For An Uninitialized Process In A Parallel Computer
    96.
    Type: Patent application
    Status: Active

    Publication No.: US20130117403A1

    Publication date: 2013-05-09

    Application No.: US13292293

    Filing date: 2011-11-09

    IPC classification: G06F15/167

    Abstract: A parallel computer includes nodes, each having main memory and a messaging unit (MU). Each MU includes computer memory, which in turn includes MU message buffers. Each MU message buffer is associated with an uninitialized process on the compute node. In the parallel computer, managing internode data communications for an uninitialized process includes: receiving, by an MU of a compute node, one or more data communications messages in an MU message buffer associated with an uninitialized process on the compute node; determining, by an application agent, that the MU message buffer associated with the uninitialized process is full prior to initialization of the uninitialized process; establishing, by the application agent, a temporary message buffer for the uninitialized process in main computer memory; and moving, by the application agent, data communications messages from the MU message buffer associated with the uninitialized process to the temporary message buffer in main computer memory.

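    The spill-to-main-memory behavior in this abstract can be sketched as below. The capacity of two messages, the `Node` class, and the polling-style `agent_poll` method are assumptions made for illustration; the abstract does not say how the application agent detects a full buffer.

```python
from typing import List

MU_BUFFER_CAPACITY = 2  # hypothetical capacity, in messages


class Node:
    """Toy compute node with one uninitialized process."""

    def __init__(self) -> None:
        self.mu_buffer: List[str] = []    # stands in for MU memory
        self.temp_buffer: List[str] = []  # stands in for main memory

    def mu_receive(self, msg: str) -> None:
        if len(self.mu_buffer) >= MU_BUFFER_CAPACITY:
            raise BufferError("MU message buffer is full")
        self.mu_buffer.append(msg)

    def agent_poll(self) -> None:
        """Application agent: move a full MU buffer into main memory."""
        if len(self.mu_buffer) >= MU_BUFFER_CAPACITY:
            self.temp_buffer.extend(self.mu_buffer)
            self.mu_buffer.clear()
```

    Calling `agent_poll` after each receive keeps the MU buffer from overflowing while preserving message order in `temp_buffer`.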

    Providing A User With A Graphics Based IDE For Developing Software For Distributed Computing Systems
    97.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20130086551A1

    Publication date: 2013-04-04

    Application No.: US12788475

    Filing date: 2010-05-27

    IPC classification: G06F9/44

    CPC classification: G06F8/34

    Abstract: Graphics based IDE for distributed computing systems software development including providing a graphical representation of a topology of a distributed computing system for which the user is to develop a software application; receiving an identification of a system component upon which a portion of the application is to execute; providing a text editor for receiving from the user computer program instructions forming the portion of the application; inserting, without user intervention as part of the portion of the application, predetermined computer program instructions configured to support the identified system component; receiving, through the text editor, the portion of the application including the predetermined computer program instructions configured to support the identified system component; and storing the computer program instructions forming the portion of the application at a user specified location within the application.

    Endpoint-Based Parallel Data Processing In A Parallel Active Messaging Interface Of A Parallel Computer
    98.
    Type: Patent application
    Status: Lapsed

    Publication No.: US20120254344A1

    Publication date: 2012-10-04

    Application No.: US12963671

    Filing date: 2010-12-09

    IPC classification: G06F15/16

    CPC classification: G06F9/541

    Abstract: Endpoint-based parallel data processing in a parallel active messaging interface (‘PAMI’) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.

    Data Communications In A Parallel Active Messaging Interface Of A Parallel Computer
    99.
    Type: Patent application
    Status: Active

    Publication No.: US20120137294A1

    Publication date: 2012-05-31

    Application No.: US12956903

    Filing date: 2010-11-30

    IPC classification: G06F9/46

    CPC classification: G06F9/546

    Abstract: Data communications in a parallel active messaging interface (‘PAMI’) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI and through data communications resources, including receiving in an origin endpoint of the PAMI a SEND instruction, the SEND instruction specifying a transmission of transfer data from the origin endpoint to a first target endpoint; transmitting from the origin endpoint to the first target endpoint a Request-To-Send (‘RTS’) message advising the first target endpoint of the location and size of the transfer data; assigning by the first target endpoint to each of a plurality of target endpoints separate portions of the transfer data; and receiving by the plurality of target endpoints the transfer data.

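    The step in which the first target endpoint assigns separate portions of the transfer data can be illustrated with a simple partitioning routine. The abstract does not say how portions are chosen; the sketch assumes contiguous, nearly equal slices, and `TransferData` and `assign_portions` are hypothetical names.

```python
from typing import List, NamedTuple


class TransferData(NamedTuple):
    """Location and size as advertised in an RTS message (toy model)."""
    offset: int
    size: int


def assign_portions(rts: TransferData, num_targets: int) -> List[TransferData]:
    """Split the advertised range into one contiguous slice per target."""
    if num_targets < 1:
        raise ValueError("need at least one target endpoint")
    base, extra = divmod(rts.size, num_targets)
    portions = []
    start = rts.offset
    for i in range(num_targets):
        # The first `extra` targets take one additional byte each.
        length = base + (1 if i < extra else 0)
        portions.append(TransferData(start, length))
        start += length
    return portions
```

    Splitting 10 bytes at offset 100 across three targets yields slices of 4, 3, and 3 bytes that cover the range without gaps or overlap.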

    Administering Truncated Receive Functions In A Parallel Messaging Interface
    100.
    Type: Patent application
    Status: Active

    Publication No.: US20120079035A1

    Publication date: 2012-03-29

    Application No.: US12892153

    Filing date: 2010-09-28

    IPC classification: G06F15/16

    CPC classification: G06F15/16

    Abstract: Administering truncated receive functions in a parallel messaging interface (‘PMI’) of a parallel computer comprising a plurality of compute nodes coupled for data communications through the PMI and through a data communications network, including: sending, through the PMI on a source compute node, a quantity of data from the source compute node to a destination compute node; specifying, by an application on the destination compute node, a portion of the quantity of data to be received by the application on the destination compute node and a portion of the quantity of data to be discarded; receiving, by the PMI on the destination compute node, all of the quantity of data; providing, by the PMI on the destination compute node to the application on the destination compute node, only the portion of the quantity of data to be received by the application; and discarding, by the PMI on the destination compute node, the portion of the quantity of data to be discarded.

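    The receive-all, deliver-part behavior in this abstract can be sketched as a single function. It assumes the portion to keep is a leading prefix of the data, which the abstract does not state; `truncated_receive` and its `keep` parameter are hypothetical.

```python
from typing import Tuple


def truncated_receive(data: bytes, keep: int) -> Tuple[bytes, int]:
    """Return (bytes delivered to the application, bytes discarded)."""
    if keep < 0:
        raise ValueError("keep must be non-negative")
    received = bytes(data)       # the messaging layer takes in everything
    delivered = received[:keep]  # only this part reaches the application
    discarded = len(received) - len(delivered)
    return delivered, discarded
```

    The sender needs no knowledge of the truncation in this model: it sends the full quantity, and the split happens entirely on the destination side.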