Proactive high availability in a virtualized computer system

    Publication No.: US10430248B2

    Publication date: 2019-10-01

    Application No.: US14751856

    Filing date: 2015-06-26

    Applicant: VMware, Inc.

    Abstract: A method of managing virtual resources executing on a hardware platform that employs sensors to monitor the health of hardware resources of the hardware platform, includes filtering sensor data from the hardware platform and combining the sensor data with a fault model for the hardware platform to generate a health score, receiving an inventory that maps the virtual resources to the hardware resources of the hardware platform, receiving resource usage data describing use of the hardware resources of the hardware platform by the virtual resources, and generating resource utilization metrics from the resource usage data. The method includes receiving policy data specifying rules applicable to the inventory, determining a set of recommendations for changes to the inventory based on the health score, the resource usage data, and the policy data, and executing at least one recommendation to implement the changes to the inventory.
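
The flow in this abstract (filter sensor data, score host health against a fault model, then combine the score with inventory, usage, and policy to produce recommendations) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the median filter, the linear temperature-based fault model, the thresholds, and all function and key names are hypothetical and are not taken from the patent.

```python
from statistics import median

def filter_readings(readings, window=3):
    """Sliding median filter that suppresses isolated sensor spikes."""
    if len(readings) < window:
        return list(readings)
    return [median(readings[i:i + window])
            for i in range(len(readings) - window + 1)]

def health_score(readings, warn=70.0, crit=90.0):
    """Toy fault model: 1.0 at or below `warn`, 0.0 at or above `crit`."""
    latest = filter_readings(readings)[-1]
    if latest <= warn:
        return 1.0
    if latest >= crit:
        return 0.0
    return (crit - latest) / (crit - warn)

def recommend(inventory, sensors, utilization, policy):
    """Return (vm, source_host, target_host) migration recommendations."""
    scores = {host: health_score(r) for host, r in sensors.items()}
    # Candidate targets must be healthy and below the policy's utilization cap.
    healthy = [h for h, s in scores.items()
               if s >= policy["min_health"]
               and utilization[h] < policy["max_utilization"]]
    recommendations = []
    for vm, host in sorted(inventory.items()):
        if scores[host] < policy["min_health"] and healthy:
            target = min(healthy, key=lambda h: utilization[h])
            recommendations.append((vm, host, target))
    return recommendations

# Hypothetical inputs: inventory maps VMs to hosts, sensors hold temperatures.
inventory = {"vm-a": "host-1", "vm-b": "host-2"}
sensors = {"host-1": [65, 88, 89, 91, 92], "host-2": [60, 61, 95, 62, 60]}
utilization = {"host-1": 0.40, "host-2": 0.30}
policy = {"min_health": 0.5, "max_utilization": 0.80}
recs = recommend(inventory, sensors, utilization, policy)
```

In this example the single spike of 95 on host-2 is filtered out, while the sustained rise on host-1 drives its score to zero, so only the VM on host-1 is recommended for migration.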

    Interdependent virtual machine management

    Publication No.: US10162661B2

    Publication date: 2018-12-25

    Application No.: US14958641

    Filing date: 2015-12-03

    Applicant: VMware, Inc.

    Abstract: Exemplary methods, apparatuses, and systems determine a list of virtual machines to be subject to a corrective action. When one or more of the listed virtual machines have dependencies upon other virtual machines, network connections, or storage devices, the determination of the list includes determining that the dependencies of the one or more virtual machines have been met. An attempt to restart or take another corrective action for the first virtual machine within the list is made. A second virtual machine that is currently deployed and running or powered off or paused in response to the corrective action for the first virtual machine is determined to be dependent upon the first virtual machine. In response to the second virtual machine's dependencies having been met by the attempt to restart or take corrective action for the first virtual machine, the second virtual machine is added to the list of virtual machines.
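
As a rough illustration of the dependency handling described above, the sketch below restarts only VMs whose dependencies are already met and adds dependent VMs to the work list once a restart satisfies them. The data shapes (`depends_on` as a dict of sets, `restart` as a callable returning success) and all names are assumptions for this example, not the patent's interfaces.

```python
from collections import deque

def restart_in_dependency_order(failed, depends_on, running, restart):
    """Restart VMs whose dependencies are met; enqueue dependents as they
    become eligible. Returns (restart order, VMs left unrecovered)."""
    running = set(running)
    pending = set(failed)
    attempted = set()

    def deps_met(vm):
        return depends_on.get(vm, set()) <= running

    # Initial list: only VMs whose dependencies are already satisfied.
    queue = deque(sorted(vm for vm in pending if deps_met(vm)))
    order = []
    while queue:
        vm = queue.popleft()
        attempted.add(vm)
        if not restart(vm):
            continue  # stays in `pending`; this sketch does not retry
        pending.discard(vm)
        running.add(vm)
        order.append(vm)
        # A successful restart may satisfy dependencies of other VMs.
        for other in sorted(pending - attempted):
            if other not in queue and deps_met(other):
                queue.append(other)
    return order, pending

# Hypothetical three-tier application: web depends on app, app on db.
depends_on = {"app": {"db"}, "web": {"app"}}
order, unresolved = restart_in_dependency_order(
    failed={"db", "app", "web"},
    depends_on=depends_on,
    running={"dns"},
    restart=lambda vm: True,  # stand-in for the real corrective action
)
```

If the restart of `db` failed in this example, `app` and `web` would never be enqueued and would be reported back as unresolved.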

    PROTECTING VIRTUAL COMPUTING INSTANCES

    Type: Patent application (status: under examination, published)

    Publication No.: US20170003992A1

    Publication date: 2017-01-05

    Application No.: US14755458

    Filing date: 2015-06-30

    Applicant: VMware, Inc.

    Abstract: The present disclosure is related to systems and methods for protecting virtual computing instances. An example system can include a first virtual computing instance (VCI) deployed on a hypervisor and provisioned with a pool of physical computing resources. The hypervisor and the first VCI can operate according to a first configuration profile. The system can include a fault domain manager (FDM) running on a second VCI that is deployed on the hypervisor and provisioned by the pool of physical computing resources. The FDM can be configured to provide high availability support for the first VCI, and the FDM can operate according to a second configuration profile. The system can further include a hypervisor manager running on the second VCI. The hypervisor manager can be configured to facilitate interaction between the FDM and the hypervisor by translating between the first configuration profile and the second configuration profile.

    Virtual Machine Recovery On Non-Shared Storage in a Single Virtual Infrastructure Management Instance

    Type: Patent application (status: under examination, published)

    Publication No.: US20160378622A1

    Publication date: 2016-12-29

    Application No.: US14753817

    Filing date: 2015-06-29

    Applicant: VMware, Inc.

    Abstract: Techniques for enabling virtual machine (VM) recovery on non-shared storage in a single virtual infrastructure management server (VIMS) instance are provided. In one set of embodiments, a VIMS instance can receive an indication that a VM in a first cluster of the VIMS instance has failed, and can determine whether the VM's files were being replicated to a storage component of the VIMS instance at the time of the VM's failure. If the VM's files were being replicated at the time of the failure, the VIMS instance can search for and identify a cluster of the VIMS instance and a host system within the cluster that (1) are compatible with the VM, and (2) have access to the storage component. The VIMS instance can then cause the VM to be restarted on the identified host system of the identified cluster.
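
The recovery decision described here reduces to a guarded search: proceed only if the VM's files were being replicated, then look for a cluster and host that are both compatible with the VM and able to reach the storage component holding the replica. The following sketch assumes a simple feature-set notion of compatibility; the classes, fields, and names are hypothetical and only illustrate the shape of the search.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class HostSystem:
    name: str
    features: frozenset    # capabilities the host offers (assumed model)
    datastores: frozenset  # storage components the host can access

@dataclass
class Cluster:
    name: str
    hosts: list

@dataclass
class FailedVM:
    name: str
    required_features: frozenset
    replica_datastore: Optional[str]  # None: files were not being replicated

def find_restart_target(vm, clusters):
    """Return (cluster name, host name) for the restart, or None."""
    if vm.replica_datastore is None:
        return None  # nothing was replicated, so there is nothing to restart from
    for cluster in clusters:
        for host in cluster.hosts:
            compatible = vm.required_features <= host.features
            has_access = vm.replica_datastore in host.datastores
            if compatible and has_access:
                return cluster.name, host.name
    return None

clusters = [
    Cluster("cluster-a", [HostSystem("host-a1", frozenset({"avx2"}),
                                     frozenset({"ds-local-a"}))]),
    Cluster("cluster-b", [HostSystem("host-b1", frozenset({"avx2", "sgx"}),
                                     frozenset({"ds-replica"}))]),
]
vm = FailedVM("vm-42", frozenset({"avx2"}), "ds-replica")
target = find_restart_target(vm, clusters)
```

Here `host-a1` is compatible but cannot reach the replica, so the search settles on `host-b1` in the second cluster.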

    Maintaining high availability during network partitions for virtual machines stored on distributed object-based storage

    Type: Granted patent (status: in force)

    Publication No.: US09513946B2

    Publication date: 2016-12-06

    Application No.: US14317712

    Filing date: 2014-06-27

    Applicant: VMware, Inc.

    CPC classification number: G06F9/45558 G06F9/542 G06F2009/4557

    Abstract: Techniques are disclosed for maintaining high availability (HA) for virtual machines (VMs) running on host systems of a host cluster, where each host system executes a HA module in a plurality of HA modules and a storage module in a plurality of storage modules, where the host cluster aggregates, via the plurality of storage modules, locally-attached storage resources of the host systems to provide an object store, where persistent data for the VMs is stored as per-VM storage objects across the locally-attached storage resources comprising the object store, and where a failure causes the plurality of storage modules to observe a network partition in the host cluster that the plurality of HA modules do not. In one embodiment, a host system in the host cluster executing a first HA module invokes an API exposed by the plurality of storage modules for persisting metadata for a VM to the object store. If the API is not processed successfully, the host system: (1) identifies a subset of second HA modules in the plurality of HA modules; (2) issues an accessibility query for the VM to the subset of second HA modules in parallel, the accessibility query being configured to determine whether the VM is accessible to the respective host systems of the subset of second HA modules; and (3) if at least one second HA module in the subset indicates that the VM is accessible to its respective host system, transmits a command to the at least one second HA module to invoke the API on its respective host system.
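
The three-step fallback in this abstract (try the storage API locally; on failure, query a subset of peer HA modules in parallel; delegate the call to a peer that can access the VM) can be sketched as below. The `HAModule` class, its methods, and the use of a thread pool for the parallel query are assumptions made for illustration; they are not the interfaces of any real HA or storage module.

```python
from concurrent.futures import ThreadPoolExecutor

class HAModule:
    """Hypothetical stand-in for an HA module running on one host."""

    def __init__(self, host, accessible_vms):
        self.host = host
        self.accessible_vms = set(accessible_vms)
        self.persisted = {}

    def can_access(self, vm):
        """Answer an accessibility query for `vm` on this host."""
        return vm in self.accessible_vms

    def persist_metadata(self, vm, metadata):
        """Stand-in for the storage API call; returns True on success."""
        if vm not in self.accessible_vms:
            return False  # e.g. this host is on the wrong side of a partition
        self.persisted[vm] = metadata
        return True

def persist_with_fallback(first, peers, vm, metadata):
    """Return the host that persisted the metadata, or None."""
    if first.persist_metadata(vm, metadata):
        return first.host
    if not peers:
        return None
    # Issue the accessibility query to all selected peers in parallel.
    with ThreadPoolExecutor(max_workers=len(peers)) as pool:
        answers = list(pool.map(lambda p: p.can_access(vm), peers))
    # Ask a peer that reported access to invoke the API on its own host.
    for peer, accessible in zip(peers, answers):
        if accessible and peer.persist_metadata(vm, metadata):
            return peer.host
    return None

first = HAModule("host-1", accessible_vms=[])
peers = [HAModule("host-2", []), HAModule("host-3", ["vm-7"])]
winner = persist_with_fallback(first, peers, "vm-7", {"protected": True})
```

In this example `host-1` cannot process the call itself, `host-2` reports no access, and `host-3` ends up persisting the metadata.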

    Persisting High Availability Protection State for Virtual Machines Stored on Distributed Object-Based Storage

    Type: Patent application (status: in force)

    Publication No.: US20150378857A1

    Publication date: 2015-12-31

    Application No.: US14317637

    Filing date: 2014-06-27

    Applicant: VMware, Inc.

    Abstract: Techniques are disclosed for persisting high availability (HA) protection state for virtual machines (VMs) running on host systems of a host cluster, where the host cluster aggregates locally-attached storage resources of the host systems to provide an object store, and where persistent data for the VMs is stored as per-VM storage objects across the locally-attached storage resources comprising the object store. In one embodiment, a host system in the host cluster executing a HA module determines an identity of a VM that has been powered-on in the host cluster. The host system then persists HA protection state for the VM in a storage object of the VM, where the HA protection state indicates that the VM should be restarted on an active host system in the case of a failure in the host cluster.
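
The idea of keeping the protection state inside each VM's own storage object can be illustrated with a small in-memory sketch. The `ObjectStore` class, the `ha_protection` key, and the JSON encoding are assumptions chosen for this example and do not describe the actual storage format.

```python
import json

class ObjectStore:
    """In-memory stand-in for an object store with one object per VM."""

    def __init__(self):
        self._objects = {}

    def write(self, vm_id, key, value):
        self._objects.setdefault(vm_id, {})[key] = json.dumps(value)

    def read(self, vm_id, key):
        raw = self._objects.get(vm_id, {}).get(key)
        return None if raw is None else json.loads(raw)

    def vm_ids(self):
        return sorted(self._objects)

def on_power_on(store, vm_id):
    """Persist HA protection state in the VM's own storage object."""
    store.write(vm_id, "ha_protection", {"protected": True})

def vms_to_restart(store, running):
    """After a failure, list protected VMs that are no longer running."""
    result = []
    for vm_id in store.vm_ids():
        state = store.read(vm_id, "ha_protection") or {}
        if state.get("protected") and vm_id not in running:
            result.append(vm_id)
    return result

store = ObjectStore()
on_power_on(store, "vm-1")
on_power_on(store, "vm-2")
# Suppose a host failure leaves only vm-2 running.
restart_list = vms_to_restart(store, running={"vm-2"})
```

Because the state travels with the VM's storage object, any surviving host that can read the object can decide whether the VM needs a restart.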
