Abstract:
Solutions for managing archived storage include receiving, at a first node, a snapshot comprising object data (e.g., a virtual machine disk snapshot) from a second node (e.g., a software defined data center), and storing the snapshot in a tiered structure that includes a data tier and a metadata tier. Snapshots may be used for fail-over operations and/or backups, to support disaster recovery. The data tier comprises a log-structured file system (LFS), and the metadata tier comprises a content addressable storage (CAS) identifying addresses within the LFS. The metadata tier also comprises a logical layer indicating content in the CAS. Segment cleaning of the data tier is performed using a segment usage table (SUT). Some examples include performing a fail-over operation from the second node to a third node using at least the stored snapshot for workload recovery. In some examples, the CAS comprises a log-structured merge-tree (LSM-tree).
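The tiered layout described above can be pictured with a small sketch. This is only an illustration under simplifying assumptions: a Python list stands in for the LFS, plain dictionaries stand in for the CAS and the logical layer, and the names `TieredSnapshotStore`, `put_snapshot`, and `read_snapshot` are hypothetical, not taken from the abstract.

```python
import hashlib


class TieredSnapshotStore:
    """Toy model of a data tier plus a two-level metadata tier."""

    def __init__(self):
        self.log = []      # data tier: append-only list standing in for the LFS
        self.cas = {}      # metadata tier: content hash -> address (index) in the log
        self.logical = {}  # logical layer: snapshot id -> ordered content hashes

    def put_snapshot(self, snapshot_id, chunks):
        hashes = []
        for chunk in chunks:
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.cas:
                # New content: append to the log and record its address in the CAS.
                self.log.append(chunk)
                self.cas[digest] = len(self.log) - 1
            hashes.append(digest)
        self.logical[snapshot_id] = hashes

    def read_snapshot(self, snapshot_id):
        # Resolve logical layer -> CAS -> address in the log.
        return [self.log[self.cas[h]] for h in self.logical[snapshot_id]]


store = TieredSnapshotStore()
store.put_snapshot("snap-1", [b"boot", b"data-A"])
store.put_snapshot("snap-2", [b"boot", b"data-B"])
```

In this sketch the shared `b"boot"` chunk is written to the log once, which is a side effect of content addressing in the toy model rather than a claim made by the abstract.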
Abstract:
Exemplary methods, apparatuses, and systems include a recovery manager receiving selection of a storage profile to be protected. The storage profile is an abstraction of a set of one or more logical storage devices that are treated as a single entity based upon common storage capabilities. In response to the selection of the storage profile to be protected, a set of virtual datacenter entities associated with the storage profile is added to a disaster recovery plan to automate a failover of the set of virtual datacenter entities from a protection site to a recovery site. The set of one or more virtual datacenter entities includes one or more virtual machines, one or more logical storage devices, or a combination of virtual machines and logical storage devices. The set of virtual datacenter entities is expandable and interchangeable with other virtual datacenter entities.
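As a rough illustration of selecting a storage profile for protection, the sketch below adds every entity that resides on the profile's storage to a recovery plan. The classes `StorageProfile` and `RecoveryPlan`, the function `protect_profile`, and the `inventory` mapping are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class StorageProfile:
    name: str
    datastores: set = field(default_factory=set)  # logical storage devices in the profile


@dataclass
class RecoveryPlan:
    protected: set = field(default_factory=set)   # entities covered by the plan


def protect_profile(plan, profile, inventory):
    """Add the profile's storage devices and the VMs placed on them to the plan.

    inventory maps a VM name to the datastore it resides on (an assumption).
    """
    vms = {vm for vm, datastore in inventory.items() if datastore in profile.datastores}
    plan.protected |= vms | profile.datastores
    return vms


gold = StorageProfile("gold", {"ds1"})
inventory = {"vm-a": "ds1", "vm-b": "ds2"}
plan = RecoveryPlan()
added = protect_profile(plan, gold, inventory)
```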
Abstract:
Examples maintain consistency of writes for a plurality of VMs during live migration of the plurality of VMs from a source host to a destination host. The disclosure intercepts I/O writes to a migrated VM at the destination host and mirrors the I/O writes back to the source host. This “reverse replication” ensures that the consistency group (CG) of the source host is up to date, and that the source host is safe to fail back to if the migration fails.
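The mirroring idea can be sketched in a few lines. This is a toy model only: dictionaries stand in for host storage, and `Host` and `MirroredWriter` are hypothetical names, not components named in the abstract.

```python
class Host:
    def __init__(self, name):
        self.name = name
        self.disk = {}  # block number -> data


class MirroredWriter:
    """Intercepts writes at the destination and mirrors them back to the source."""

    def __init__(self, destination, source):
        self.destination = destination
        self.source = source

    def write(self, block, data):
        self.destination.disk[block] = data
        # Reverse replication: keep the source copy current for a possible fail-back.
        self.source.disk[block] = data


source = Host("source")
destination = Host("destination")
writer = MirroredWriter(destination, source)
writer.write(7, b"payload")
```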
Abstract:
Mapping computer resources to consumers in a computer system is described. In an example, a method of mapping computer resources to consumers in a computer system includes: receiving tags assigned to the computer resources at a resource manager executing in the computer system, where the resource manager: identifies a first tag assigned to a first computer resource; determines whether a first consumer is associated with the first tag; enables the first consumer to access the first computer resource if the first consumer is associated with the first tag; and prevents the first consumer from accessing the first computer resource if the first consumer is not associated with the first tag.
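A minimal sketch of the tag check described above, assuming tags are plain strings and that a consumer may access a resource when they share at least one tag. The class and method names (`ResourceManager`, `assign_tag`, `associate`, `can_access`) are hypothetical.

```python
class ResourceManager:
    def __init__(self):
        self.resource_tags = {}  # resource -> set of tags assigned to it
        self.consumer_tags = {}  # consumer -> set of tags it is associated with

    def assign_tag(self, resource, tag):
        self.resource_tags.setdefault(resource, set()).add(tag)

    def associate(self, consumer, tag):
        self.consumer_tags.setdefault(consumer, set()).add(tag)

    def can_access(self, consumer, resource):
        # Access is enabled only when the consumer is associated with a tag on the resource.
        shared = self.resource_tags.get(resource, set()) & self.consumer_tags.get(consumer, set())
        return bool(shared)


manager = ResourceManager()
manager.assign_tag("datastore-1", "finance")
manager.associate("tenant-a", "finance")
manager.associate("tenant-b", "engineering")
```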
Abstract:
The efficiency of segment cleaning for a log-structured file system (LFS) is enhanced at least by storing additional information in a segment usage table (SUT). Live blocks (representing portions of stored objects) in an LFS are determined based at least on the SUT. Chunk identifiers associated with the live blocks are read. The live blocks are coalesced at least by writing at least a portion of the live blocks into at least one new segment. A blind update of at least a portion of the chunk identifiers in a chunk map is performed to indicate the new segment. The blind update includes writing to the chunk map without reading from the chunk map. In some examples, the objects comprise virtual machine disks (VMDKs) and the SUT changes between a list format and a bitmap format, to minimize size.
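The cleaning steps above can be sketched as follows. This is a simplified model, assuming each block carries its chunk identifier alongside its payload and that the SUT records the indexes of live blocks per segment; `clean_segments` and the data layout are hypothetical.

```python
def clean_segments(segments, sut, chunk_map, new_segment_id):
    """Coalesce live blocks into one new segment.

    segments:  segment id -> list of (chunk_id, payload) blocks
    sut:       segment id -> set of indexes of live blocks in that segment
    chunk_map: chunk id -> (segment id, block index)
    """
    new_segment = []
    for seg_id in list(sut):
        for index in sorted(sut[seg_id]):
            chunk_id, payload = segments[seg_id][index]
            new_segment.append((chunk_id, payload))
            # Blind update: overwrite the entry without reading the old one.
            chunk_map[chunk_id] = (new_segment_id, len(new_segment) - 1)
        # The old segment holds no live data any more and can be reclaimed.
        del segments[seg_id]
        del sut[seg_id]
    segments[new_segment_id] = new_segment
    sut[new_segment_id] = set(range(len(new_segment)))


segments = {0: [("c1", b"a"), ("c2", b"b")], 1: [("c3", b"c"), ("c4", b"d")]}
sut = {0: {1}, 1: {0}}
chunk_map = {"c2": (0, 1), "c3": (1, 0)}
clean_segments(segments, sut, chunk_map, new_segment_id=2)
```

The sketch keeps the SUT in a single set-based format; switching between list and bitmap formats, as the abstract mentions, is not modeled here.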
Abstract:
Examples perform live migration of VMs from a source host to a destination host using destructive consistency breaking operations. The disclosure makes a record of a consistency group of VMs on storage at a source host as a fail-back in the event of failure. The source VMs are live migrated to the destination host, disregarding consistency during live migration, and potentially violating the recovery point objective. After live migration of all of the source VMs, consistency is automatically restored at the destination host and the live migration is declared a success.
Abstract:
Examples perform monitoring of multiple-step, concurrently executed workflows across distributed nodes. Requests from an intermediate node are classified by a load balancer as monitoring or non-monitoring. Non-monitoring requests are handled by any node; however, monitoring requests are distributed to all nodes via a plurality of queues but handled only by nodes executing the subject workflow. The load balancer receives reports from any node executing the subject workflow, and passes the first report to the intermediate node.
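One way to picture the request routing is the single-process sketch below. It assumes one queue per node, a round-robin choice for non-monitoring requests, and synchronous draining of the queues; `Node`, `LoadBalancer`, and the request dictionary shape are hypothetical.

```python
from collections import deque


class Node:
    def __init__(self, name, workflows):
        self.name = name
        self.workflows = set(workflows)  # workflows this node is executing
        self.queue = deque()             # per-node queue of monitoring requests

    def drain(self):
        reports = []
        while self.queue:
            workflow_id = self.queue.popleft()
            # Only a node executing the subject workflow produces a report.
            if workflow_id in self.workflows:
                reports.append(f"{workflow_id} running on {self.name}")
        return reports


class LoadBalancer:
    def __init__(self, nodes):
        self.nodes = nodes
        self._next = 0

    def handle(self, request):
        if request["kind"] == "monitor":
            # Monitoring requests are distributed to every node's queue.
            for node in self.nodes:
                node.queue.append(request["workflow"])
            reports = [r for node in self.nodes for r in node.drain()]
            return reports[0] if reports else None  # pass on the first report
        # Non-monitoring requests may be handled by any node (round-robin here).
        node = self.nodes[self._next % len(self.nodes)]
        self._next += 1
        return f"handled by {node.name}"


nodes = [Node("n1", []), Node("n2", ["wf-7"])]
balancer = LoadBalancer(nodes)
monitor_reply = balancer.handle({"kind": "monitor", "workflow": "wf-7"})
work_reply = balancer.handle({"kind": "work"})
```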
Abstract:
A distributed system and method for error handling testing of a target component in the distributed system uses a proxy gateway in the target component that can intercept communications to and from remote components of the distributed system. When a proxy mode of the proxy gateway in the target component is enabled, at least one of the communications at the proxy gateway is modified to introduce an error. When the proxy mode of the proxy gateway in the target component is disabled, the communications to and from the remote components of the distributed system are transmitted via the proxy gateway without modification.
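The proxy-mode behavior can be sketched as a thin wrapper around a transport function. This is an illustrative model only: `ProxyGateway`, `remote_component`, and `fake_timeout` are hypothetical names, and the error is injected by rewriting the response.

```python
class ProxyGateway:
    def __init__(self, transport):
        self._transport = transport  # callable that reaches the remote component
        self.proxy_mode = False      # disabled: traffic passes through unmodified
        self.inject = None           # callable that rewrites a response into an error

    def send(self, request):
        response = self._transport(request)
        if self.proxy_mode and self.inject is not None:
            # Proxy mode enabled: modify the communication to introduce an error.
            return self.inject(response)
        return response


def remote_component(request):
    return {"status": 200, "body": f"ok:{request}"}


def fake_timeout(response):
    return {"status": 504, "body": "injected timeout"}


gateway = ProxyGateway(remote_component)
normal = gateway.send("ping")
gateway.proxy_mode = True
gateway.inject = fake_timeout
faulty = gateway.send("ping")
```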
Abstract:
Techniques to process virtual machine objects through multistep workflows in a computer system are described. In an example, a method of processing virtual machine objects through a workflow having a plurality of ordered steps in a computer system includes executing the workflow on computing resources of the computer system using the virtual machine objects as parametric input, where the computing resources: divide the virtual machine objects into workgroups; perform instances of a step of the workflow in parallel on the workgroups as the workgroups complete a prior step in the workflow; and execute an agent to delegate the workgroups to, and receive results from, the instances of the step as the workflow is executed.
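A minimal sketch of the workgroup idea, assuming each step is a plain function applied to one VM object and that a thread pool plays the role of the delegating agent. Each workgroup advances to the next step as soon as it finishes the previous one, independently of the other workgroups. The function `run_workflow` and its parameters are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_workflow(vm_objects, steps, group_size=2, max_workers=4):
    # Divide the VM objects into fixed-size workgroups.
    workgroups = [vm_objects[i:i + group_size]
                  for i in range(0, len(vm_objects), group_size)]

    def run_group(group):
        # A workgroup moves through the ordered steps without waiting for other groups.
        for step in steps:
            group = [step(vm) for vm in group]
        return group

    # The pool delegates workgroups to workers and collects results in order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_group, workgroups))
    return [vm for group in results for vm in group]


steps = [str.upper, lambda name: name + "-ok"]
processed = run_workflow(["vm1", "vm2", "vm3"], steps)
```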
Abstract:
Virtual computing instance data that are stored across multiple storage volumes are replicated in a manner such that the write order is maintained. The frequency of the replication is set so that the recovery point objective defined for the VM data can be satisfied. The replication includes the steps of determining a set of logical storage volumes across which the virtual computing instance issues dependent write IOs, issuing a first command to the virtual computing instance to block new IOs and to block receipt of IO acknowledgements, issuing a command to create replicas of all the logical storage volumes in the set, and then issuing a second command to the virtual computing instance to unblock new IOs and unblock receipt of IO acknowledgements.
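The block, replicate, unblock sequence can be sketched as below. This toy model holds back new writes while replicas are taken and does not model acknowledgements; `VirtualInstance`, `replicate`, and the volume names are hypothetical.

```python
class VirtualInstance:
    def __init__(self, volumes):
        self.volumes = volumes  # volume name -> ordered list of writes
        self.blocked = False
        self.pending = []       # writes held back while IO is blocked

    def write(self, volume, data):
        if self.blocked:
            self.pending.append((volume, data))
        else:
            self.volumes[volume].append(data)

    def block_io(self):
        self.blocked = True

    def unblock_io(self):
        self.blocked = False
        pending, self.pending = self.pending, []
        for volume, data in pending:  # replay held writes in their original order
            self.write(volume, data)


def replicate(instance, volume_set):
    instance.block_io()               # first command: block new IOs
    try:
        # Copy every volume in the set while no new writes can land.
        return {name: list(instance.volumes[name]) for name in volume_set}
    finally:
        instance.unblock_io()         # second command: unblock IOs


vm = VirtualInstance({"db-log": [], "db-data": []})
vm.write("db-log", "intent-1")
vm.write("db-data", "row-1")
replicas = replicate(vm, ["db-log", "db-data"])
vm.write("db-log", "intent-2")
```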