Performing an allreduce operation using shared memory
    1.
    Granted patent
    Performing an allreduce operation using shared memory (Lapsed)

    Publication No.: US08752051B2

    Publication date: 2014-06-10

    Application No.: US13427057

    Filing date: 2012-03-22

    IPC classification: G06F9/46 G06F9/48 G06F9/52

    CPC classification: G06F9/4843 G06F9/52 G06F9/546

    Abstract: Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.
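The flow in this abstract (a job status object listing work units, with any available core claiming and performing the next one) can be illustrated with a minimal Python sketch. It is not the patented implementation: threads stand in for processing cores, a plain list stands in for shared memory, and it assumes each work unit reduces one chunk of the buffers by summation. The names `JobStatus`, `claim_next`, and `shared_memory_allreduce` are hypothetical.

```python
import threading


class JobStatus:
    """Hypothetical job status object: tracks the next unclaimed work unit."""

    def __init__(self, num_units):
        self.num_units = num_units
        self._next = 0
        self._lock = threading.Lock()

    def claim_next(self):
        """Return the index of the next unclaimed work unit, or None if none remain."""
        with self._lock:
            if self._next >= self.num_units:
                return None
            unit = self._next
            self._next += 1
            return unit


def shared_memory_allreduce(contributions, chunk_size=2, num_cores=4):
    """Element-wise sum of equal-length buffers, split into chunk-sized work units."""
    length = len(contributions[0])
    result = [0] * length  # stands in for a result buffer in shared memory
    num_units = (length + chunk_size - 1) // chunk_size
    job = JobStatus(num_units)  # established by the core that received the instruction

    def core_loop():
        # Each available core repeatedly claims and performs the next work unit.
        while True:
            unit = job.claim_next()
            if unit is None:
                return
            start = unit * chunk_size
            stop = min(start + chunk_size, length)
            for i in range(start, stop):
                result[i] = sum(buf[i] for buf in contributions)

    cores = [threading.Thread(target=core_loop) for _ in range(num_cores)]
    for core in cores:
        core.start()
    for core in cores:
        core.join()
    return result
```

Because each work unit writes a disjoint slice of the result, only the claim step needs a lock in this sketch.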

    Performing An Allreduce Operation Using Shared Memory
    2.
    Patent application
    Performing An Allreduce Operation Using Shared Memory (Lapsed)

    Publication No.: US20120179881A1

    Publication date: 2012-07-12

    Application No.: US13427057

    Filing date: 2012-03-22

    IPC classification: G06F12/02

    CPC classification: G06F9/4843 G06F9/52 G06F9/546

    Abstract: Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.

    Performing an allreduce operation using shared memory
    3.
    Granted patent
    Performing an allreduce operation using shared memory (In force)

    Publication No.: US08161480B2

    Publication date: 2012-04-17

    Application No.: US11754782

    Filing date: 2007-05-29

    IPC classification: G06F9/46 G06F13/00 G06F7/38

    CPC classification: G06F9/4843 G06F9/52 G06F9/546

    Abstract: Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.

    Performing an Allreduce Operation Using Shared Memory
    4.
    Patent application
    Performing an Allreduce Operation Using Shared Memory (In force)

    Publication No.: US20080301683A1

    Publication date: 2008-12-04

    Application No.: US11754782

    Filing date: 2007-05-29

    IPC classification: G06F9/46

    CPC classification: G06F9/4843 G06F9/52 G06F9/546

    Abstract: Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.

    Mechanism to support generic collective communication across a variety of programming models
    5.
    Granted patent
    Mechanism to support generic collective communication across a variety of programming models (Lapsed)

    Publication No.: US07984448B2

    Publication date: 2011-07-19

    Application No.: US11768669

    Filing date: 2007-06-26

    IPC classification: G06F9/44 G06F9/46 G06F15/76

    CPC classification: G06F9/54

    Abstract: A system and method for supporting collective communications on a plurality of processors that use different parallel programming paradigms, in one aspect, may comprise a schedule defining one or more tasks in a collective operation, an executor that executes the task, a multisend module to perform one or more data transfer functions associated with the tasks, and a connection manager that controls one or more connections and identifies an available connection. The multisend module uses the available connection in performing the one or more data transfer functions. A plurality of processors that use different parallel programming paradigms can use a common implementation of the schedule module, the executor module, the connection manager and the multisend module via a language adaptor specific to a parallel programming paradigm implemented on a processor.
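The division of labor described in this abstract (schedule, executor, multisend module, connection manager, language adaptor) might be pictured with the following Python sketch. It is only an illustration under stated assumptions: a flat broadcast is used as the example collective, in-memory mailboxes stand in for the network, and every class and method name here is hypothetical rather than taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """One point-to-point transfer within a collective (hypothetical shape)."""
    src: int
    dst: int
    payload: object


class ConnectionManager:
    """Controls a pool of connections and hands out an available one."""

    def __init__(self, num_connections):
        self._free = list(range(num_connections))

    def acquire(self):
        if not self._free:
            raise RuntimeError("no connection available")
        return self._free.pop()

    def release(self, conn):
        self._free.append(conn)


class Multisend:
    """Performs the data transfer for a task over an available connection."""

    def __init__(self, connections, mailboxes):
        self._connections = connections
        self._mailboxes = mailboxes  # dict: rank -> list of delivered messages

    def send(self, task):
        conn = self._connections.acquire()
        try:
            # The mailbox append stands in for a real transfer on `conn`.
            self._mailboxes[task.dst].append((task.src, task.payload))
        finally:
            self._connections.release(conn)


class BroadcastSchedule:
    """Defines the tasks of a flat broadcast from `root` to every other rank."""

    def __init__(self, root, size, payload):
        self.root, self.size, self.payload = root, size, payload

    def tasks(self):
        return [Task(self.root, rank, self.payload)
                for rank in range(self.size) if rank != self.root]


class Executor:
    """Executes each task of a schedule through the multisend module."""

    def __init__(self, multisend):
        self._multisend = multisend

    def run(self, schedule):
        for task in schedule.tasks():
            self._multisend.send(task)


class BroadcastAdaptor:
    """Hypothetical language adaptor: maps a paradigm-specific call to the shared modules."""

    def __init__(self, executor, size):
        self._executor = executor
        self._size = size

    def bcast(self, payload, root=0):
        self._executor.run(BroadcastSchedule(root, self._size, payload))
```

A second adaptor for a different programming paradigm could reuse the same `Executor`, `Multisend`, and `ConnectionManager` objects, which is the sharing the abstract describes.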

    Shared address collectives using counter mechanisms
    6.
    Granted patent
    Shared address collectives using counter mechanisms (Lapsed)

    Publication No.: US08655962B2

    Publication date: 2014-02-18

    Application No.: US12568115

    Filing date: 2009-09-28

    IPC classification: G06F15/16 G06F15/167

    CPC classification: G06F9/544

    Abstract: A shared address space on a compute node stores data received from a network and data to transmit to the network. The shared address space includes an application buffer that can be directly operated upon by a plurality of processes, for instance, running on different cores on the compute node. A shared counter is used for one or more of signaling arrival of the data across the plurality of processes running on the compute node, signaling completion of an operation performed by one or more of the plurality of processes, obtaining reservation slots by one or more of the plurality of processes, or combinations thereof.
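Two of the counter uses named in this abstract, obtaining reservation slots and signaling completion, can be sketched in Python as follows. This is a simplified illustration, not the patented mechanism: threads stand in for processes on different cores, a list stands in for the shared application buffer, and a lock-protected integer stands in for what would typically be a hardware atomic counter. `SharedCounter`, `fetch_and_add`, `wait_until`, and `run_demo` are hypothetical names.

```python
import threading


class SharedCounter:
    """Hypothetical shared counter with fetch-and-add and a blocking wait."""

    def __init__(self, value=0):
        self._value = value
        self._cond = threading.Condition()

    def fetch_and_add(self, delta=1):
        """Atomically add `delta` and return the value held before the add."""
        with self._cond:
            old = self._value
            self._value += delta
            self._cond.notify_all()
            return old

    def wait_until(self, target):
        """Block until the counter reaches at least `target`."""
        with self._cond:
            self._cond.wait_for(lambda: self._value >= target)


def run_demo(num_procs=4):
    buffer = [None] * num_procs  # stands in for the shared application buffer
    slots = SharedCounter()      # hands out reservation slots in the buffer
    done = SharedCounter()       # counts workers that have finished writing

    def worker(rank):
        slot = slots.fetch_and_add()  # reserve a unique slot
        buffer[slot] = rank * rank    # operate directly on the shared buffer
        done.fetch_and_add()          # signal completion

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(num_procs)]
    for t in threads:
        t.start()
    done.wait_until(num_procs)  # proceed once every worker has signaled
    for t in threads:
        t.join()
    return buffer
```

The slot each worker receives depends on scheduling, so only the set of values written to the buffer is deterministic, not their order.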

    SHARED ADDRESS COLLECTIVES USING COUNTER MECHANISMS
    7.
    Patent application
    SHARED ADDRESS COLLECTIVES USING COUNTER MECHANISMS (Lapsed)

    Publication No.: US20110078249A1

    Publication date: 2011-03-31

    Application No.: US12568115

    Filing date: 2009-09-28

    IPC classification: G06F15/16

    CPC classification: G06F9/544

    Abstract: A shared address space on a compute node stores data received from a network and data to transmit to the network. The shared address space includes an application buffer that can be directly operated upon by a plurality of processes, for instance, running on different cores on the compute node. A shared counter is used for one or more of signaling arrival of the data across the plurality of processes running on the compute node, signaling completion of an operation performed by one or more of the plurality of processes, obtaining reservation slots by one or more of the plurality of processes, or combinations thereof.

    MECHANISM TO SUPPORT GENERIC COLLECTIVE COMMUNICATION ACROSS A VARIETY OF PROGRAMMING MODELS
    8.
    Patent application
    MECHANISM TO SUPPORT GENERIC COLLECTIVE COMMUNICATION ACROSS A VARIETY OF PROGRAMMING MODELS (Lapsed)

    Publication No.: US20090006810A1

    Publication date: 2009-01-01

    Application No.: US11768669

    Filing date: 2007-06-26

    IPC classification: G06F15/00

    CPC classification: G06F9/54

    Abstract: A system and method for supporting collective communications on a plurality of processors that use different parallel programming paradigms, in one aspect, may comprise a schedule defining one or more tasks in a collective operation, an executor that executes the task, a multisend module to perform one or more data transfer functions associated with the tasks, and a connection manager that controls one or more connections and identifies an available connection. The multisend module uses the available connection in performing the one or more data transfer functions. A plurality of processors that use different parallel programming paradigms can use a common implementation of the schedule module, the executor module, the connection manager and the multisend module via a language adaptor specific to a parallel programming paradigm implemented on a processor.
