Systems and Methods for Creating and Using a Data Structure for Parallel Programming

    Publication number: US20170124020A1

    Publication date: 2017-05-04

    Application number: US15293413

    Filing date: 2016-10-14

    Abstract: System and method embodiments are provided for creating data structures for parallel programming. A method for creating data structures for parallel programming includes forming, by one or more processors, one or more data structures, each data structure comprising one or more global containers and a plurality of local containers. Each of the global containers is accessible by all of a plurality of threads in a multi-thread parallel processing environment. Each of the plurality of local containers is accessible only by a corresponding one of the plurality of threads. A global container is split into a second plurality of local containers when items are going to be processed in parallel, and two or more local containers are merged into a single global container when a parallel process reaches a synchronization point.
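
The split-and-merge pattern in this abstract can be illustrated with a minimal Python sketch. The function names (`split`, `merge`, `parallel_square`) and the round-robin split policy are assumptions made for illustration; the abstract does not specify them.

```python
from concurrent.futures import ThreadPoolExecutor


def split(global_container, n_threads):
    """Split one global container into n_threads local containers (round-robin, an assumed policy)."""
    return [global_container[i::n_threads] for i in range(n_threads)]


def merge(local_containers):
    """Merge the local containers back into a single global container."""
    merged = []
    for local in local_containers:
        merged.extend(local)
    return merged


def process_local(local):
    # Each worker reads only its own local container, so no locking is needed here.
    return [x * x for x in local]


def parallel_square(items, n_threads=4):
    local_containers = split(items, n_threads)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # list(pool.map(...)) returns only after every worker has finished,
        # which plays the role of the synchronization point.
        results = list(pool.map(process_local, local_containers))
    return merge(results)
```

Because the split is round-robin, the merged container holds the same items as a sequential run but not necessarily in the original order.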

    34. Collective operation management in a parallel computer
    Type: Granted patent
    Status: In force

    Publication number: US09571329B2

    Publication date: 2017-02-14

    Application number: US13793895

    Filing date: 2013-03-11

    Abstract: Methods, apparatuses, and computer program products for collective operation management in a parallel computer are provided. Embodiments include a parallel computer having a first compute node operatively coupled for data communications over a tree data communications network with a plurality of child compute nodes. Embodiments also include each child compute node performing a first collective operation. The first compute node, for each child compute node, receives from the child compute node a result of the first collective operation performed by the child compute node. For each result received from a child compute node, the first compute node stores a timestamp indicating a time that the child compute node completed the first collective operation. The first compute node also manages, based on the stored timestamps, execution of a second collective operation over the tree data communications network.
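
As a rough illustration of the timestamp bookkeeping described above, the sketch below has a parent node record one completion time per child and derive an ordering for a second operation. The class name `ParentNode` and the "earliest finisher first" policy are hypothetical; the abstract only says the second operation is managed based on the stored timestamps, not how.

```python
import time


class ParentNode:
    """Hypothetical stand-in for the first compute node in the tree."""

    def __init__(self, child_ids):
        self.child_ids = list(child_ids)
        self.results = {}
        self.completed_at = {}

    def receive(self, child_id, result, timestamp=None):
        # Store the child's result together with its completion time.
        # If the child supplies no timestamp, the receipt time is used as an approximation.
        self.results[child_id] = result
        self.completed_at[child_id] = time.monotonic() if timestamp is None else timestamp

    def plan_second_operation(self):
        # Assumed policy: handle children in order of completion, earliest first.
        return sorted(self.completed_at, key=self.completed_at.get)
```

A different policy, for example starting with the slowest child, would only change the sort key.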

    35. SYSTEM, METHOD, AND STORAGE MEDIUM
    Type: Patent application
    Status: In force

    Publication number: US20160335083A1

    Publication date: 2016-11-17

    Application number: US15083672

    Filing date: 2016-03-29

    Inventor: Masahiro Miwa

    CPC classification number: G06F9/3001 G06F15/17318 G06F15/17331

    Abstract: A system includes a plurality of arithmetic devices configured to execute arithmetic processes in parallel. Each of the plurality of arithmetic devices is configured to: determine whether a time period from the start of collective communication to reception from another arithmetic device involved in the collective communication is equal to or shorter than a predetermined threshold, determine a target arithmetic device that is among the plurality of arithmetic devices and for which a waiting scheme involved in the collective communication is to be changed when the time period is determined to be equal to or shorter than the predetermined threshold, and transmit, to the target arithmetic device, an instruction to change the waiting scheme involved in the collective communication.
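
The threshold test in this abstract can be sketched as follows. Everything beyond the comparison itself is an assumption: the `Device` class, the scheme names `"busy_wait"` and `"sleep_wait"`, and the rule that the target is the next rank after the sender are all hypothetical, and a direct attribute write stands in for transmitting the instruction.

```python
from dataclasses import dataclass


@dataclass
class Device:
    rank: int
    waiting_scheme: str = "busy_wait"  # hypothetical scheme name


def choose_target(devices, sender_rank):
    # Assumed rule: the target is the device that follows the sender in rank order.
    return devices[(sender_rank + 1) % len(devices)]


def check_and_instruct(devices, sender_rank, started_at, received_at, threshold, new_scheme):
    """Return the rank instructed to change its waiting scheme, or None if no change is made."""
    elapsed = received_at - started_at
    if elapsed > threshold:
        return None
    target = choose_target(devices, sender_rank)
    target.waiting_scheme = new_scheme  # stands in for sending the instruction
    return target.rank
```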

    36. Collective communications apparatus and method for parallel systems
    Type: Granted patent
    Status: In force

    Publication number: US09477628B2

    Publication date: 2016-10-25

    Application number: US14040676

    Filing date: 2013-09-28

    CPC classification number: G06F13/4068 G06F9/52 G06F15/17318 G06F15/17325

    Abstract: A collective communication apparatus and method for parallel computing systems. For example, one embodiment of an apparatus comprises a plurality of processor elements (PEs); collective interconnect logic to dynamically form a virtual collective interconnect (VCI) between the PEs at runtime without global communication among all of the PEs, the VCI defining a logical topology between the PEs in which each PE is directly communicatively coupled to only a subset of the remaining PEs; and execution logic to execute collective operations across the PEs, wherein one or more of the PEs receive first results from a first portion of the subset of the remaining PEs, perform a portion of the collective operations, and provide second results to a second portion of the subset of the remaining PEs.
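
One way to picture a logical topology in which each PE is coupled to only a subset of the others is a binary tree, sketched below. The tree shape and the names `neighbours` and `tree_reduce` are assumptions; the abstract does not name a topology. Each PE can derive its neighbours from its own index alone, and the reduction is simulated sequentially rather than run on real hardware.

```python
import operator


def neighbours(pe, n):
    """Return (children, parent) of PE `pe` in an assumed binary tree over n PEs."""
    children = [c for c in (2 * pe + 1, 2 * pe + 2) if c < n]
    parent = (pe - 1) // 2 if pe > 0 else None
    return children, parent


def tree_reduce(values, op):
    """Simulate a reduction in which each PE combines its children's results and passes one value up."""
    n = len(values)
    partial = list(values)
    # Visit PEs from the highest index down, so every child is folded in
    # before its parent forwards its own partial result.
    for pe in range(n - 1, 0, -1):
        _, parent = neighbours(pe, n)
        partial[parent] = op(partial[parent], partial[pe])
    return partial[0]
```

For example, `tree_reduce([1, 2, 3, 4, 5], operator.add)` sums the five values at the root PE.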

    37. Data communications for a collective operation in a parallel active messaging interface of a parallel computer
    Type: Granted patent
    Status: In force

    Publication number: US09189447B2

    Publication date: 2015-11-17

    Application number: US13659458

    Filing date: 2012-10-24

    Inventor: Daniel A. Faraj

    CPC classification number: G06F15/17318 G06F9/54

    Abstract: Algorithm selection for data communications in a parallel active messaging interface (‘PAMI’) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI, including associating in the PAMI data communications algorithms and bit masks; receiving in an origin endpoint of the PAMI a collective instruction, the instruction specifying transmission of a data communications message from the origin endpoint to a target endpoint; constructing a bit mask for the received collective instruction; selecting, from among the associated algorithms and bit masks, a data communications algorithm in dependence upon the constructed bit mask; and executing the collective instruction, transmitting, according to the selected data communications algorithm from the origin endpoint to the target endpoint, the data communications message.
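
The bit-mask selection step can be sketched as a table lookup. This is not the PAMI API: the flag names, the algorithm names, the 4096-byte cutoff, and the "first entry whose required bits are all set" matching rule are assumptions chosen only to show the idea of associating algorithms with masks and selecting by a constructed mask.

```python
# Hypothetical instruction properties, one bit each.
LARGE_MESSAGE = 0b001
CONTIGUOUS = 0b010
SAME_NODE = 0b100

# Association of required bit masks with (hypothetical) algorithm names, in priority order.
ALGORITHMS = [
    (LARGE_MESSAGE | CONTIGUOUS, "rendezvous"),
    (SAME_NODE, "shared_memory"),
    (0, "eager"),  # fallback: a mask of 0 matches every instruction
]


def build_mask(nbytes, contiguous, same_node, large_threshold=4096):
    """Construct the bit mask for one collective instruction."""
    mask = 0
    if nbytes >= large_threshold:
        mask |= LARGE_MESSAGE
    if contiguous:
        mask |= CONTIGUOUS
    if same_node:
        mask |= SAME_NODE
    return mask


def select_algorithm(mask):
    """Pick the first algorithm whose required bits are all present in the mask."""
    for required, name in ALGORITHMS:
        if mask & required == required:
            return name
```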

    38. PARALLEL INFORMATION SYSTEM UTILIZING FLOW CONTROL AND VIRTUAL CHANNELS
    Type: Patent application
    Status: In force

    Publication number: US20150188987A1

    Publication date: 2015-07-02

    Application number: US13297201

    Filing date: 2011-11-15

    Abstract: Embodiments of a data handling apparatus can include a network interface controller configured to interface a processing node to a network. The network interface controller can include a network interface, a register interface, a processing node interface, and logic. The network interface can include lines coupled to the network for communicating data on the network. The register interface can include lines coupled to multiple registers. The processing node interface can include at least one line coupled to the processing node for communicating data with a local processor local to the processing node, wherein the local processor can read data from and write data to the registers. The logic can receive packets including a header and a payload from the network and can insert the packets into the registers as indicated by the header.
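
The header-directed register insertion can be modelled in a few lines. The class below is a software toy, not the claimed hardware: the `(header, payload)` tuple layout and the `"register"` header field are assumptions made for illustration.

```python
class NetworkInterfaceController:
    """Toy model of a controller whose registers are filled as packet headers direct."""

    def __init__(self, n_registers):
        self.registers = [None] * n_registers

    def receive(self, packet):
        # A packet is assumed to be a (header, payload) pair whose header
        # names the destination register by index.
        header, payload = packet
        self.registers[header["register"]] = payload

    def read(self, index):
        # Local-processor read of one register.
        return self.registers[index]

    def write(self, index, value):
        # Local-processor write of one register.
        self.registers[index] = value
```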

    39. COLLECTIVE COMMUNICATIONS APPARATUS AND METHOD FOR PARALLEL SYSTEMS
    Type: Patent application
    Status: In force

    Publication number: US20150095542A1

    Publication date: 2015-04-02

    Application number: US14040676

    Filing date: 2013-09-28

    CPC classification number: G06F13/4068 G06F9/52 G06F15/17318 G06F15/17325

    Abstract: A collective communication apparatus and method for parallel computing systems. For example, one embodiment of an apparatus comprises a plurality of processor elements (PEs); collective interconnect logic to dynamically form a virtual collective interconnect (VCI) between the PEs at runtime without global communication among all of the PEs, the VCI defining a logical topology between the PEs in which each PE is directly communicatively coupled to only a subset of the remaining PEs; and execution logic to execute collective operations across the PEs, wherein one or more of the PEs receive first results from a first portion of the subset of the remaining PEs, perform a portion of the collective operations, and provide second results to a second portion of the subset of the remaining PEs.

    40. Distributed data scalable adaptive map-reduce framework
    Type: Granted patent
    Status: In force

    Publication number: US08959138B2

    Publication date: 2015-02-17

    Application number: US13563990

    Filing date: 2012-08-01

    CPC classification number: G06F15/17318 G06F9/5066

    Abstract: A method for generating a distributed data scalable adaptive map-reduce framework for at least one multi-core cluster. The method includes partitioning a cluster into at least one computational group, determining at least one key-group leader within each computational group, performing a local combine operation at each computational group, performing a global combine operation at each of the at least one key-group leader within each computational group based on a result from the local combine operation, and performing a global map-reduce operation across the at least one key-group leader within each computational group.
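
The local-combine, leader-combine, and global stages can be sketched with a word count. The sketch runs sequentially and makes several assumptions not stated in the abstract: one key-group leader per computational group, a byte-sum rule (`leader_for`) for assigning keys to leaders, and word counting as the workload.

```python
from collections import Counter


def local_combine(group):
    """Combine the per-core word lists of one computational group into local counts."""
    combined = Counter()
    for core_words in group:
        combined.update(core_words)
    return combined


def leader_for(key, n_leaders):
    # Assumed deterministic rule mapping a key to the leader responsible for it.
    return sum(key.encode("utf-8")) % n_leaders


def global_combine(local_results):
    """Route each locally combined count to the key-group leader that owns the key."""
    n_leaders = len(local_results)
    leaders = [Counter() for _ in range(n_leaders)]
    for local in local_results:
        for key, count in local.items():
            leaders[leader_for(key, n_leaders)][key] += count
    return leaders


def word_count(groups):
    local_results = [local_combine(group) for group in groups]
    leaders = global_combine(local_results)
    # Final stage across the leaders: their key sets are disjoint, so this only gathers them.
    total = Counter()
    for leader in leaders:
        total.update(leader)
    return total
```

Combining locally first means each leader receives at most one count per key from each group rather than one record per occurrence.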
