Abstract:
A virtualized computer system provides fault tolerant operation of a primary virtual machine. In one embodiment, this system includes a backup computer system that stores a snapshot of the primary virtual machine and a log file containing non-deterministic events occurring in the instruction stream of the primary virtual machine. The primary virtual machine periodically updates the snapshot and the log file. Upon a failure of the primary virtual machine, the backup computer can instantiate a failover backup virtual machine by consuming the stored snapshot and log file.
Abstract:
System and method for providing fault tolerance in virtualized computer systems use a first guest and a second guest running on virtualization software to produce outputs, which are produced when a workload is executed on the first and second guests. An output of the second guest is compared with an output of the first guest to determine if there is an output match. If there is no output match, the first guest is paused and a resynchronization of the second guest is executed to restore a checkpointed state of the first guest on the second guest. After the resynchronization of the second guest, the paused first guest is caused to resume operation.
Abstract:
In a computer system running at least a first virtual machine (VM) and a second VM on virtualization software, a computer implemented method for the second VM to provide quasi-lockstep fault tolerance for the first VM includes executing a workload on the first VM and the second VM that involves producing at least one externally visible output and comparing an externally visible output of the second VM with an externally visible output of the first VM to determine if there is an output match. In response to a determination that the externally visible output of the second VM does not match the externally visible output of the first VM, a resynchronization of the second VM is executed. The externally visible output of the first VM is kept from being output externally until completion of the resynchronization.
Abstract:
System and method for providing fault tolerance in virtualized computer systems use a first guest and a second guest running on virtualization software to produce outputs, which are produced when a workload is executed on the first and second guests. An output of the second guest is compared with an output of the first guest to determine if there is an output match. If there is no output match, the first guest is paused and a resynchronization of the second guest is executed to restore a checkpointed state of the first guest on the second guest. After the resynchronization of the second guest, the paused first guest is caused to resume operation.
Abstract:
In a computer system running at least a first virtual machine (VM) and a second VM on virtualization software, a computer implemented method for the second VM to provide quasi-lockstep fault tolerance for the first VM includes executing a workload on the first VM and the second VM that involves producing at least one externally visible output and comparing an externally visible output of the second VM with an externally visible output of the first VM to determine if there is an output match. In response to a determination that the externally visible output of the second VM does not match the externally visible output of the first VM, a resynchronization of the second VM is executed. The externally visible output of the first VM is kept from being output externally until completion of the resynchronization.
Abstract:
The output of a non-deterministic instruction is handled during record and replay in a virtual machine. An output of a non-deterministic instruction is stored to a buffer during record mode and retrieved from a buffer during replay mode without exiting to the hypervisor. At least part of the contents of the buffer can be stored to a log when the buffer is full during record mode, and the buffer can be replenished from a log when the buffer is empty during replay mode.
Abstract:
The output of a non-deterministic instruction is handled during record and replay in a virtual machine. An output of a non-deterministic instruction is stored to a buffer during record mode and retrieved from a buffer during replay mode without exiting to the hypervisor. At least part of the contents of the buffer can be stored to a log when the buffer is full during record mode, and the buffer can be replenished from a log when the buffer is empty during replay mode.
Abstract:
The translation lookaside buffer (TLB) of a processor is kept in synchronization with a guest page table by use of an indicator referred to as a “T” bit. The T bit of the NPT/EPT entries mapping the guest page table are set when a page walk is performed on the NPT/EPT. When modifications are made to pages mapped by NPT/EPT entries with their T bit set, changes to the TLB are made so that the TLB remains in synchronization with the guest page table. Accordingly, record/replay of virtual machines of virtualized computer systems may be performed reliably with no non-determinism introduced by stale TLBs that fall out of synchronization with the guest page table.
Abstract:
The translation lookaside buffer (TLB) of a processor is kept in synchronization with a guest page table by use of an indicator referred to as a “T” bit. The T bit of the NPT/EPT entries mapping the guest page table are set when a page walk is performed on the NPT/EPT. When modifications are made to pages mapped by NPT/EPT entries with their T bit set, changes to the TLB are made so that the TLB remains in synchronization with the guest page table. Accordingly, record/replay of virtual machines of virtualized computer systems may be performed reliably with no non-determinism introduced by stale TLBs that fall out of synchronization with the guest page table.
Abstract:
A virtualized computer system provides fault tolerant operation of a primary virtual machine. In one embodiment, this system includes a backup computer system that stores a snapshot of the primary virtual machine and a log file containing non-deterministic events occurring in the instruction stream of the primary virtual machine. The primary virtual machine periodically updates the snapshot and the log file. Upon a failure of the primary virtual machine, the backup computer can instantiate a failover backup virtual machine by consuming the stored snapshot and log file.