Abstract:
The present disclosure describes a technique for honoring virtual machine placement constraints established on a first host implemented on a virtualized computing environment by receiving a request to migrate one or more virtual machines from the first host to a second host and without violating the virtual machine placement constraints, identifying an architecture of the first host, provisioning a second host with an architecture compatible with that of the first host, adding the second host to the cluster of hosts, and migrating the one or more virtual machines from the first host to the second host.
Abstract:
The subject matter described herein provides virtual computing instance (VCI) component protection against networking failures in a datacenter cluster. Networking routes at the host level, VCI level, and application level are monitored for connectivity. Failures are communicated to a primary host or to a datacenter virtualization infrastructure that initiates policy-based remediation, such as moving affected VCIs to another host in the cluster that has all the necessary networking routes functional.
Abstract:
The subject matter described herein is generally directed towards detection and remediation of virtual computing instance (VCI) failure on host devices. Monitoring is performed to detect suspected failures of different guest operating systems, identify failure information, and perform remediation to provide high availability for the VCI.
Abstract:
A system for monitoring a virtual machine executed on a host. The system includes a processor that receives an indication that a failure caused a storage device to be inaccessible to the virtual machine, the inaccessible storage device impacting an ability of the virtual machine to provide service, and applies a remedy to restore access to the storage device based on a type of the failure.
Abstract:
A system for proactive resource reservation for protecting virtual machines. The system includes a cluster of hosts, wherein the cluster of hosts includes a master host, a first slave host, and one or more other slave hosts, and wherein the first slave host executes one or more virtual machines thereon. The first slave host is configured to identify a failure that impacts an ability of the one or more virtual machines to provide service, and calculate a list of impacted virtual machines. The master host is configured to receive a request to reserve resources on another host in the cluster of hosts to enable the impacted one or more virtual machines to failover, calculate a resource capacity among the cluster of hosts, determine whether the calculated resource capacity is sufficient to reserve the resources, and send an indication as to whether the resources are reserved.
Abstract:
A system for monitoring virtual machines includes a master host and a slave host. The slave host includes a primary virtual machine and a secondary virtual machine. The slave host is configured to identify a failure that impacts an ability of at least one of the primary virtual machine and the secondary virtual machine to provide service. If the failure is a Permanent Device Loss failure, the slave host is configured to terminate each impacted virtual machine. If the failure is an All Paths Down failure, the master host is configured to apply one of the following: a first remedy if the primary virtual machine is impacted and the secondary virtual machine is not impacted; a second remedy if the secondary virtual machine is impacted and the primary virtual machine is not impacted; or a third remedy if both the primary virtual machine and the secondary virtual machine are impacted.
Abstract:
A system for proactive resource reservation for protecting virtual machines. The system includes a cluster of hosts, wherein the cluster of hosts includes a master host, a first slave host, and one or more other slave hosts, and wherein the first slave host executes one or more virtual machines thereon. The first slave host is configured to identify a failure that impacts an ability of the one or more virtual machines to provide service, and calculate a list of impacted virtual machines. The master host is configured to receive a request to reserve resources on another host in the cluster of hosts to enable the impacted one or more virtual machines to failover, calculate a resource capacity among the cluster of hosts, determine whether the calculated resource capacity is sufficient to reserve the resources, and send an indication as to whether the resources are reserved.
Abstract:
Processes for managing computing processes within a plurality of data centers configured to provide a cloud computing environment are described. An exemplary process includes executing a process on a first host of a plurality of hosts. When the process is executing on the first host, a first network identifier associated with the plurality of hosts is not a network identifier of a pool of network identifiers associated with the cloud computing environment and first and second route tables respectively corresponding to first and second data centers of the plurality of data centers associate the first network identifier with the first host. The exemplary process further includes detecting an event associated with the process. In response to detecting the event associated with the process, the first and second route tables are respectively updated to associate the first network identifier with a second host of the plurality of hosts.
Abstract:
Disclosed are aspects of proactive high availability that proactively identify and predict hardware failure scenarios and migrate virtual resources to healthy hardware resources. In some aspects, a mapping that maps virtual resources to hardware resources. A plurality of hardware events are identified in association with a hardware resource. A hardware failure scenario is predicted based on a health score of a first hardware resource. A health score is determined based on the hardware events, and a fault model that indicates a level of severity of the hardware events. A particular virtual resource is migrated from the hardware resource to another hardware that has a greater health score.
Abstract:
Techniques are disclosed for reallocating host resources in a virtualized computing environment when certain criteria have been met. In some embodiments, a system identifies a host disabling event. In view of the disabling event, the system identifies a resource for reallocation from a first host to a second host. Based on the identification, the computer system disassociates the identified resource's virtual identifier from the first host device and associates the virtual identifier with the second host device. Thus, the techniques disclosed significantly reduce a system's planned and unplanned downtime.