Executing Multiple Instructions Multiple Data (‘MIMD’) programs on a Single Instruction Multiple Data (‘SIMD’) machine
    1.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US07831802B2

    Publication date: 2010-11-09

    Application No.: US11780072

    Filing date: 2007-07-19

    IPC classification: G06F15/76

    CPC classification: G06F15/161

    Abstract: Executing Multiple Instructions Multiple Data (‘MIMD’) programs on a Single Instruction Multiple Data (‘SIMD’) machine, the SIMD machine including a plurality of compute nodes, each compute node capable of executing only a single thread of execution, the compute nodes initially configured exclusively for SIMD operations, the SIMD machine further comprising a data communications network, the network comprising synchronous data communications links among the compute nodes, including establishing a SIMD partition comprising a plurality of the compute nodes; booting the SIMD partition in MIMD mode; executing by launcher programs a plurality of MIMD programs on compute nodes in the SIMD partition; and re-executing a launcher program by an operating system on a compute node in the SIMD partition upon termination of the MIMD program executed by the launcher program.
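
    The re-execution step in the abstract above can be pictured with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: the names `launcher`, `run_node`, `work_queue` and `programs` are hypothetical, and an ordinary Python loop stands in for the compute node's operating system.

```python
from collections import deque


def launcher(work_queue):
    """Hypothetical launcher: return the next MIMD program name, or None if idle."""
    return work_queue.popleft() if work_queue else None


def run_node(node_id, work_queue, programs, max_launches=10):
    """Simulate one single-threaded compute node booted in MIMD mode.

    The loop plays the role of the node's operating system: each time the
    launched program terminates, control returns here and the launcher is
    executed again.
    """
    log = []
    for _ in range(max_launches):
        name = launcher(work_queue)        # OS (re-)executes the launcher
        if name is None:                   # no work left in this simulation
            break
        result = programs[name](node_id)   # launcher hands the node to the program
        log.append((name, result))         # program terminated; loop re-runs launcher
    return log


# Two toy "MIMD programs"; a real node would run arbitrary executables.
programs = {
    "square": lambda node_id: node_id * node_id,
    "negate": lambda node_id: -node_id,
}
queue = deque(["square", "negate"])
print(run_node(3, queue, programs))  # [('square', 9), ('negate', -3)]
```

    Because each node can run only one thread, the sketch never runs the launcher and a program at the same time; the launcher only regains the node after the program returns.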

    Executing Multiple Instructions Multiple Data ('MIMD') Programs on a Single Instruction Multiple Data ('SIMD') Machine
    2.
    Type: Patent application
    Status: Lapsed

    Publication No.: US20090024830A1

    Publication date: 2009-01-22

    Application No.: US11780072

    Filing date: 2007-07-19

    IPC classification: G06F15/00

    CPC classification: G06F15/161

    Abstract: Executing Multiple Instructions Multiple Data (‘MIMD’) programs on a Single Instruction Multiple Data (‘SIMD’) machine, the SIMD machine including a plurality of compute nodes, each compute node capable of executing only a single thread of execution, the compute nodes initially configured exclusively for SIMD operations, the SIMD machine further comprising a data communications network, the network comprising synchronous data communications links among the compute nodes, including establishing a SIMD partition comprising a plurality of the compute nodes; booting the SIMD partition in MIMD mode; executing by launcher programs a plurality of MIMD programs on compute nodes in the SIMD partition; and re-executing a launcher program by an operating system on a compute node in the SIMD partition upon termination of the MIMD program executed by the launcher program.

    Re-executing launcher program upon termination of launched programs in MIMD mode booted SIMD partitions
    3.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US07979674B2

    Publication date: 2011-07-12

    Application No.: US11749397

    Filing date: 2007-05-16

    IPC classification: G06F9/46

    CPC classification: G06F9/5061

    Abstract: Executing MIMD programs on a SIMD machine, the SIMD machine including a plurality of compute nodes, each compute node capable of executing only a single thread of execution, the compute nodes initially configured exclusively for SIMD operations, the SIMD machine further comprising a data communications network, the network comprising synchronous data communications links among the compute nodes, including establishing one or more SIMD partitions, booting one or more SIMD partitions in MIMD mode; establishing a MIMD partition; executing by launcher programs a plurality of MIMD programs on two or more of the compute nodes of the MIMD partition; and re-executing a launcher program by an operating system on a compute node in the MIMD partition upon termination of the MIMD program executed by the launcher program.

    Scaling and managing work requests on a massively parallel machine
    4.
    Type: Granted patent
    Status: In force

    Publication No.: US08918624B2

    Publication date: 2014-12-23

    Application No.: US12121262

    Filing date: 2008-05-15

    IPC classification: G06F9/30 G06F9/48 G06F9/38

    Abstract: A method, computer program product and computer system for scaling and managing requests on a massively parallel machine, such as one running in MIMD mode on a SIMD machine. A submit mux (multiplexer) is used to federate work requests and to forward the requests to the management node. A resource arbiter receives and manages these work requests. A MIMD job controller works with the resource arbiter to manage the work requests on the SIMD partition. The SIMD partition may utilize a mux of its own to federate the work requests and the computer nodes. Instructions are also provided to control and monitor the work requests.
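
    The request path described in this abstract (clients, then a submit mux, then a resource arbiter on the management node) might be modelled roughly as follows. This is an illustrative sketch only: the classes `SubmitMux` and `ResourceArbiter`, their methods, and the first-come-first-served assignment policy are assumptions made for the example and are not taken from the patent.

```python
class ResourceArbiter:
    """Hypothetical arbiter on the management node: queues requests, hands out nodes."""

    def __init__(self, free_nodes):
        self.free_nodes = list(free_nodes)
        self.pending = []

    def receive(self, request):
        """Accept one work request forwarded by the mux."""
        self.pending.append(request)

    def dispatch(self):
        """Assign pending requests to free nodes, first come first served (assumed policy)."""
        assignments = {}
        while self.pending and self.free_nodes:
            assignments[self.pending.pop(0)] = self.free_nodes.pop(0)
        return assignments


class SubmitMux:
    """Hypothetical submit mux: federates requests from many clients into one stream."""

    def __init__(self, arbiter):
        self.arbiter = arbiter
        self.buffer = []

    def submit(self, client, job):
        """Buffer a (client, job) request until the next flush."""
        self.buffer.append((client, job))

    def flush(self):
        """Forward all buffered requests to the arbiter in arrival order."""
        for request in self.buffer:
            self.arbiter.receive(request)
        forwarded = len(self.buffer)
        self.buffer.clear()
        return forwarded


arbiter = ResourceArbiter(["node0", "node1"])
mux = SubmitMux(arbiter)
for client, job in [("alice", "jobA"), ("bob", "jobB"), ("carol", "jobC")]:
    mux.submit(client, job)

print(mux.flush())         # 3
print(arbiter.dispatch())  # {('alice', 'jobA'): 'node0', ('bob', 'jobB'): 'node1'}
print(arbiter.pending)     # [('carol', 'jobC')]
```

    The third request stays pending because only two nodes are free, which is the kind of bookkeeping the abstract attributes to the resource arbiter.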

    Dynamically reassigning a connected node to a block of compute nodes for re-launching a failed job
    5.
    Type: Patent application
    Status: In force

    Publication No.: US20120047393A1

    Publication date: 2012-02-23

    Application No.: US12861426

    Filing date: 2010-08-23

    IPC classification: G06F11/20

    CPC classification: G06F11/2035 G06F11/203

    Abstract: Methods, systems, and products for dynamically reassigning a connected node to a block of compute nodes for re-launching a failed job that include: identifying that a job failed to execute on the block of compute nodes because connectivity failed between a compute node assigned as at least one of the connected nodes for the block of compute nodes and its supporting I/O node; and re-launching the job, including selecting an alternative connected node that is actively coupled for data communications with an active I/O node; and assigning the alternative connected node as the connected node for the block of compute nodes running the re-launched job.
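
    The selection step in this abstract (pick an alternative connected node that still has a working link to an active I/O node, then re-launch) can be sketched as below. The function names, the dictionaries `link_up` and `io_node_of`, and the "first match wins" rule are all assumptions for illustration; the abstract does not specify how candidates are ranked.

```python
def pick_alternative_connected_node(block, failed_node, link_up, io_node_of, active_io_nodes):
    """Return the first node in the block, other than the failed one,
    whose link is up and whose supporting I/O node is active; None if there is none."""
    for node in block:
        if node == failed_node:
            continue
        if link_up[node] and io_node_of[node] in active_io_nodes:
            return node
    return None


def relaunch(job, block, failed_node, link_up, io_node_of, active_io_nodes):
    """Re-launch a failed job on the same block with a newly assigned connected node."""
    alternative = pick_alternative_connected_node(
        block, failed_node, link_up, io_node_of, active_io_nodes
    )
    if alternative is None:
        raise RuntimeError("no compute node in the block can reach an active I/O node")
    return {"job": job, "block": list(block), "connected_node": alternative}


block = ["c0", "c1", "c2"]
link_up = {"c0": False, "c1": True, "c2": True}          # c0 lost its I/O link
io_node_of = {"c0": "io0", "c1": "io1", "c2": "io2"}
active_io_nodes = {"io0", "io2"}                          # io1 is down

print(relaunch("job42", block, "c0", link_up, io_node_of, active_io_nodes))
# {'job': 'job42', 'block': ['c0', 'c1', 'c2'], 'connected_node': 'c2'}
```

    Node `c1` is skipped because its I/O node is inactive even though its own link is up, so both conditions in the abstract (active coupling and an active I/O node) are checked.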

    Resource management on a computer system utilizing hardware and environmental factors
    6.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US08225324B2

    Publication date: 2012-07-17

    Application No.: US12121096

    Filing date: 2008-05-15

    IPC classification: G06F9/46

    Abstract: A method for resource management on a computer system utilizing hardware and environmental information. A caller interacts with an application program interface to handle information requests with a persistent data storage device to combine information involving hardware resource information, environmental data and other system information, including historical, present and predicted values. Application execution decisions may then be made regarding hardware for the calling entity. The method may be implemented as a computer process.

    Dynamically reassigning a connected node to a block of compute nodes for re-launching a failed job
    7.
    Type: Granted patent
    Status: In force

    Publication No.: US08140889B2

    Publication date: 2012-03-20

    Application No.: US12861426

    Filing date: 2010-08-23

    IPC classification: G06F11/00

    CPC classification: G06F11/2035 G06F11/203

    Abstract: Methods, systems, and products for dynamically reassigning a connected node to a block of compute nodes for re-launching a failed job that include: identifying that a job failed to execute on the block of compute nodes because connectivity failed between a compute node assigned as at least one of the connected nodes for the block of compute nodes and its supporting I/O node; and re-launching the job, including selecting an alternative connected node that is actively coupled for data communications with an active I/O node; and assigning the alternative connected node as the connected node for the block of compute nodes running the re-launched job.

    Resource Management on a Computer System Utilizing Hardware and Environmental Factors
    8.
    Type: Patent application
    Status: Lapsed

    Publication No.: US20090288094A1

    Publication date: 2009-11-19

    Application No.: US12121096

    Filing date: 2008-05-15

    IPC classification: G06F9/46

    Abstract: A method for resource management on a computer system utilizing hardware and environmental information. A caller interacts with an application program interface to handle information requests with a persistent data storage device to combine information involving hardware resource information, environmental data and other system information, including historical, present and predicted values. Application execution decisions may then be made regarding hardware for the calling entity. The method may be implemented as a computer process.

    Scaling and Managing Work Requests on a Massively Parallel Machine
    9.
    Type: Patent application
    Status: In force

    Publication No.: US20090288085A1

    Publication date: 2009-11-19

    Application No.: US12121262

    Filing date: 2008-05-15

    IPC classification: G06F9/46

    Abstract: A method, computer program product and computer system for scaling and managing requests on a massively parallel machine, such as one running in MIMD mode on a SIMD machine. A submit mux (multiplexer) is used to federate work requests and to forward the requests to the management node. A resource arbiter receives and manages these work requests. A MIMD job controller works with the resource arbiter to manage the work requests on the SIMD partition. The SIMD partition may utilize a mux of its own to federate the work requests and the computer nodes. Instructions are also provided to control and monitor the work requests.
