SYSTEM AND METHOD OF RECOVERING FROM FAILURES IN A VIRTUAL MACHINE
    1.
    Invention application
    Status: In force

    Publication No.: US20080307259A1

    Publication date: 2008-12-11

    Application No.: US11759099

    Filing date: 2007-06-06

    IPC classification: G06F11/00

    Abstract: A method and systems for recovering from a failure in a virtual machine are provided. In accordance with one embodiment of the present disclosure, a method for recovering from failures in a virtual machine is provided. The method may include, in a first physical host having a host operating system and a virtual machine running on the host operating system, monitoring one or more parameters associated with a program running on the virtual machine, each parameter having a predetermined acceptable range. The method may further include determining if the one or more parameters are within their respective predetermined acceptable ranges. In response to determining that the one or more parameters associated with the program running on the virtual machine are not within their respective predetermined acceptable ranges, a management module may cause the application running on the virtual machine to be restarted.
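
The monitoring step in this abstract can be illustrated with a minimal Python sketch. The abstract does not name the parameters, their ranges, or how a restart is triggered, so `Parameter`, `out_of_range`, `check_and_recover`, the example metric names, and the choice to treat a missing reading as out of range are all assumptions made for illustration, not the patented implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Parameter:
    """A monitored parameter with its predetermined acceptable range (hypothetical type)."""
    name: str
    low: float
    high: float

    def in_range(self, value: float) -> bool:
        return self.low <= value <= self.high


def out_of_range(params: List[Parameter], readings: Dict[str, float]) -> List[str]:
    """Return the names of parameters whose reading is missing or outside its range.

    Treating a missing reading as a failure is an assumption of this sketch.
    """
    bad = []
    for p in params:
        value = readings.get(p.name)
        if value is None or not p.in_range(value):
            bad.append(p.name)
    return bad


def check_and_recover(params: List[Parameter],
                      readings: Dict[str, float],
                      restart: Callable[[], None]) -> bool:
    """Call restart() if any parameter is out of range; return True if it was called."""
    if out_of_range(params, readings):
        restart()
        return True
    return False
```

A caller standing in for the management module would pass a `restart` callback that restarts the application inside the virtual machine; how that callback reaches the guest is left open here because the abstract does not specify it.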

    Method, system and software for allocating information handling system resources in response to high availability cluster fail-over events
    2.
    Invention application
    Status: Under examination (published)

    Publication No.: US20050132379A1

    Publication date: 2005-06-16

    Application No.: US10733796

    Filing date: 2003-12-11

    IPC classification: G06F9/46 G06F9/50 G06F11/20

    Abstract: A method, system and software for allocating information handling system resources in response to cluster fail-over events are disclosed. In operation, the method provides for the calculation of a performance ratio between a failing node and a fail-over node and the transformation of an application calendar schedule from the failing node into a new application calendar schedule for the fail-over node. Before implementing the new application calendar schedule for the failing-over application on the fail-over node, the method verifies that the fail-over node includes sufficient resources to process its existing calendar schedule as well as the new application calendar schedule for the failing-over application. A resource negotiation algorithm may be applied to one or more of the calendar schedules to prevent application starvation in the event the fail-over node does not include sufficient resources to process the failing-over application calendar schedule as well as its existing application calendar schedules.
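
The ratio, transformation, and verification steps can be sketched in Python under strong assumptions. The abstract does not define how the performance ratio is computed or what a calendar schedule contains, so this sketch assumes a schedule is a list of `(start_hour, end_hour, percent_of_node_capacity)` slots, that the ratio is the failing node's capacity divided by the fail-over node's capacity, and that "sufficient resources" means no hour exceeds 100 percent. All names are hypothetical, and the resource negotiation algorithm is not modeled.

```python
from typing import Dict, List, Tuple

# Assumed schedule slot: (start_hour, end_hour, percent of the node's capacity).
Slot = Tuple[int, int, float]


def performance_ratio(failing_capacity: float, failover_capacity: float) -> float:
    """Assumed definition: relative capacity of the failing node to the fail-over node."""
    return failing_capacity / failover_capacity


def transform_schedule(schedule: List[Slot], ratio: float) -> List[Slot]:
    """Rescale each slot's share so it expresses the same work on the fail-over node."""
    return [(start, end, share * ratio) for start, end, share in schedule]


def hourly_load(schedules: List[List[Slot]]) -> Dict[int, float]:
    """Sum the scheduled share for every hour of the day across all schedules."""
    load = {hour: 0.0 for hour in range(24)}
    for schedule in schedules:
        for start, end, share in schedule:
            for hour in range(start, end):
                load[hour] += share
    return load


def has_sufficient_resources(existing: List[List[Slot]], incoming: List[Slot]) -> bool:
    """True if existing plus incoming schedules never exceed 100 percent in any hour."""
    return all(total <= 100.0 for total in hourly_load(existing + [incoming]).values())
```

For example, an application that used 80 percent of a failing node between 09:00 and 17:00 would need 40 percent of a fail-over node with twice the capacity, and the check then decides whether that fits next to the schedules already running there.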

    Concurrent access to RAID data in shared storage
    3.
    Invention application
    Status: Under examination (published)

    Publication No.: US20060129559A1

    Publication date: 2006-06-15

    Application No.: US11012586

    Filing date: 2004-12-15

    IPC classification: G06F17/30

    Abstract: A system and method is disclosed for managing the serving of read and write commands in a computer cluster system having redundant storage. A plurality of database servers is included in the computer cluster network to serve read and write commands from the database clients of the network. One of the database servers is configured to handle both read commands and write commands. The remainder of the database servers are configured to handle only read commands. The database of the computer system includes a redundant storage subsystem that involves the use of mirrored disks associated with each of the database servers.
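
The division of work between one read/write server and the remaining read-only servers can be sketched as a small router. The abstract does not say how reads are distributed, so the round-robin policy, the `CommandRouter` class, and the server names below are assumptions for illustration only; the mirrored-disk storage layer is not modeled.

```python
from itertools import cycle
from typing import List


class CommandRouter:
    """Routes writes to the single read/write server and reads to any server."""

    def __init__(self, read_write_server: str, read_only_servers: List[str]):
        self.read_write_server = read_write_server
        # Assumption: reads rotate round-robin over every server,
        # including the read/write one, which also handles reads.
        self._readers = cycle([read_write_server] + read_only_servers)

    def route(self, command: str) -> str:
        """Return the name of the server that should handle the command."""
        if command == "write":
            return self.read_write_server
        if command == "read":
            return next(self._readers)
        raise ValueError(f"unknown command type: {command!r}")
```

With `CommandRouter("db0", ["db1", "db2"])`, every write goes to `db0`, while successive reads are spread over `db0`, `db1`, and `db2` in turn.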

    System and method of recovering from failures in a virtual machine
    4.
    Granted invention patent
    Status: In force

    Publication No.: US07797587B2

    Publication date: 2010-09-14

    Application No.: US11759099

    Filing date: 2007-06-06

    IPC classification: G06F11/00

    Abstract: A method and systems for recovering from a failure in a virtual machine are provided. In accordance with one embodiment of the present disclosure, a method for recovering from failures in a virtual machine is provided. The method may include, in a first physical host having a host operating system and a virtual machine running on the host operating system, monitoring one or more parameters associated with a program running on the virtual machine, each parameter having a predetermined acceptable range. The method may further include determining if the one or more parameters are within their respective predetermined acceptable ranges. In response to determining that the one or more parameters associated with the program running on the virtual machine are not within their respective predetermined acceptable ranges, a management module may cause the application running on the virtual machine to be restarted.

    Adaptive input/output bus sharing method to improve performance of SCSI/SAS clusters
    5.
    Invention application
    Status: In force

    Publication No.: US20070022189A1

    Publication date: 2007-01-25

    Application No.: US11186529

    Filing date: 2005-07-21

    IPC classification: G06F15/173

    CPC classification: H04L67/1097

    Abstract: According to various illustrative embodiments of the present invention, a method for adaptive cluster input/output control includes starting a nonessential input/output operation using a first controller in a first node of a cluster, informing at least a second controller in a second node of the cluster about starting the nonessential input/output operation, and increasing the nonessential input/output operation by a predetermined utilization percentage. The method also includes waiting for a predetermined amount of time, determining whether the nonessential input/output operation has been completed, and determining whether the at least the second controller in the second node has substantially decreased performance. The method also includes decreasing the nonessential input/output operation by the predetermined utilization percentage if the nonessential input/output operation utilization percentage is greater than the predetermined utilization percentage and if the at least the second controller in the second node has substantially decreased performance, and informing the at least the second controller in the second node of the cluster about the completion of the nonessential input/output operation if the nonessential input/output operation has been completed.
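
The control loop in this abstract can be sketched as a small Python simulation. The abstract states when utilization is decreased but not what happens otherwise, so repeating the increase whenever the peer is not degraded is an assumption of this sketch, as are the function names, the 100 percent cap, the use of loop iterations in place of the predetermined wait, and the idea that work done per interval is proportional to utilization. `step` must be greater than zero.

```python
from typing import Callable, List


def next_utilization(current: float, step: float, peer_degraded: bool,
                     max_utilization: float = 100.0) -> float:
    """Compute the utilization percentage for the next interval."""
    if peer_degraded:
        # Decrease only while utilization is above the predetermined step,
        # so the operation never drops to zero and stalls.
        return current - step if current > step else current
    # Assumption: keep ramping up while the peer controller is healthy.
    return min(current + step, max_utilization)


def run_nonessential_io(total_work: float, step: float,
                        peer_degraded: Callable[[int], bool],
                        notify: Callable[[str], None]) -> List[float]:
    """Simulate the operation; return the utilization used in each interval."""
    notify("started")              # inform the second controller about the start
    utilization = step             # initial increase by the predetermined percentage
    done = 0.0
    history = []
    tick = 0
    while True:
        history.append(utilization)
        done += utilization        # one wait interval's worth of work
        if done >= total_work:     # has the operation completed?
            break
        utilization = next_utilization(utilization, step, peer_degraded(tick))
        tick += 1
    notify("completed")            # inform the second controller about completion
    return history
```

Here `peer_degraded` and `notify` stand in for whatever inter-controller signalling the cluster provides, which the abstract does not describe.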

    Distributed failover aware storage area network backup of application data in an active-N high availability cluster
    6.
    Invention application
    Status: In force

    Publication No.: US20050149684A1

    Publication date: 2005-07-07

    Application No.: US10748634

    Filing date: 2003-12-30

    IPC classification: G06F12/16

    CPC classification: G06F11/1458

    Abstract: A SAN-based cluster backup system and method are provided. The system and method are automated, do not use a LAN for backup data, and are made aware of application failover events. The system and method are composed of two main components: a backup service, and a primary coordinator. The backup service performs the backup of the applications that are hosted on a particular node. The backup service periodically checkpoints the state of the backup job and communicates the status to the primary coordinator. The primary coordinator controls all backup operations in the cluster. The user submits backup jobs for the applications through the primary coordinator. If a node fails during a backup operation, the primary coordinator can ensure that the failed backup job can be resumed from the last checkpoint on the failed-over node. In this way, repetitive backups can be avoided, thereby increasing efficiency.
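
The checkpoint-and-resume behavior can be sketched in Python. The abstract does not describe the checkpoint format or the protocol between the backup service and the primary coordinator, so this sketch assumes a checkpoint is simply the count of items already backed up, and uses hypothetical names (`PrimaryCoordinator`, `run_backup`, `checkpoint_every`) plus a `fail_after` argument that exists only to simulate a node failure.

```python
from typing import Dict, List, Optional


class PrimaryCoordinator:
    """Keeps the last reported checkpoint for each backup job (assumed in-memory)."""

    def __init__(self) -> None:
        self._checkpoints: Dict[str, int] = {}

    def report(self, job_id: str, items_done: int) -> None:
        self._checkpoints[job_id] = items_done

    def last_checkpoint(self, job_id: str) -> int:
        return self._checkpoints.get(job_id, 0)


def run_backup(job_id: str, items: List[str], coordinator: PrimaryCoordinator,
               checkpoint_every: int = 2,
               fail_after: Optional[int] = None) -> List[str]:
    """Back up items starting at the last checkpoint; return the items copied in this run.

    fail_after simulates a node failure after that many items have been copied.
    """
    start = coordinator.last_checkpoint(job_id)
    copied: List[str] = []
    for index in range(start, len(items)):
        if fail_after is not None and len(copied) == fail_after:
            raise RuntimeError("node failed during backup")
        copied.append(items[index])  # stand-in for writing the item to backup storage
        # Checkpoint periodically and once more when the job finishes.
        if (index + 1) % checkpoint_every == 0 or index + 1 == len(items):
            coordinator.report(job_id, index + 1)
    return copied
```

If a run fails after copying three of five items with a checkpoint every two items, a second call on the failed-over node restarts at item index 2, so only the work done since the last checkpoint is repeated rather than the whole job.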
