Abstract:
An offset range striping technique increases concurrency of operation execution directed to metadata managed by a volume layer of a storage input/output (I/O) stack, while reducing contention among resources of one or more nodes of a cluster. A logical unit (LUN) may be apportioned into multiple volumes, each of which may be partitioned into multiple regions, wherein each region is represented by a dense tree. The technique increases concurrency of operation execution (e.g., modifications to the metadata at the offset ranges), while reducing contention among the resources (e.g., CPUs and NVLogs), by distributing the offset range operations among the regions and mapping the regions to services and NVLogs. Such increased concurrency and reduction of contention may be achieved by implementing the technique to (i) apportion each region into disjoint chunks (i.e., stripes) of contiguous offset ranges; (ii) organize a plurality of regions into one or more zones and populate a first zone before allocating a second zone; and (iii) stagger the mapping of services to starting regions of the volumes.
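A minimal sketch of how such striping might map offset-range operations to services and NVLogs follows; it illustrates only items (i) and (iii), and the stripe size, region size, service count, and staggering scheme are illustrative assumptions rather than the claimed implementation.

```python
# Illustrative sketch: map offset-range operations to services/NVLogs
# via region striping. Stripe size, region size, and service count are
# assumed values, not taken from the claims.

STRIPE_SIZE = 1 << 20        # 1 MiB stripes (assumed)
REGION_SIZE = 16 << 30       # 16 GiB regions (assumed)
NUM_SERVICES = 8             # one service per CPU/NVLog pair (assumed)

def region_of(offset: int) -> int:
    """Region (dense tree) covering the offset."""
    return offset // REGION_SIZE

def stripe_of(offset: int) -> int:
    """Disjoint chunk (stripe) of contiguous offsets within a region."""
    return (offset % REGION_SIZE) // STRIPE_SIZE

def service_for(volume_id: int, offset: int) -> int:
    """Stagger the starting region per volume so all volumes do not
    start on service 0, then round-robin regions over services."""
    stagger = volume_id % NUM_SERVICES
    return (region_of(offset) + stagger) % NUM_SERVICES

# Example: two volumes writing at offset 0 land on different services.
assert service_for(0, 0) != service_for(1, 0)
```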
Abstract:
Embodiments herein are directed to efficient crash recovery of persistent metadata managed by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. Volume metadata managed by the volume layer is organized as a multi-level dense tree, wherein each level of the dense tree includes volume metadata entries for storing the volume metadata. When a level of the dense tree is full, the volume metadata entries of the level are merged with the next lower level of the dense tree. During a merge operation, two generation IDs may be used in accordance with a double buffer arrangement: a first generation ID for the append buffer that is full (i.e., the merge staging buffer) and a second, incremented generation ID for the append buffer that accepts new volume metadata entries. Upon completion of the merge operation, the lower level (e.g., level 1) to which the merge is directed is assigned the generation ID of the merge staging buffer.
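The generation-ID handling might be sketched as follows; the class and field names (AppendBuffer, DenseTree, level1_gen_id) are hypothetical and merely illustrate the double-buffer bookkeeping described above.

```python
# Illustrative sketch of the double-buffer generation-ID scheme.

class AppendBuffer:
    def __init__(self, gen_id: int):
        self.gen_id = gen_id
        self.entries = []            # (offset, extent key) tuples

class DenseTree:
    def __init__(self):
        self.active = AppendBuffer(gen_id=1)   # accepts new entries
        self.staging = None                    # full buffer being merged
        self.level1 = []
        self.level1_gen_id = 0

    def start_merge(self):
        """The active buffer is full: freeze it as the merge staging
        buffer and open a new buffer with an incremented generation ID."""
        self.staging = self.active
        self.active = AppendBuffer(gen_id=self.staging.gen_id + 1)

    def finish_merge(self):
        """Fold staged entries into level 1; level 1 is assigned the
        generation ID of the merge staging buffer."""
        self.level1 = sorted(self.level1 + self.staging.entries)
        self.level1_gen_id = self.staging.gen_id
        self.staging = None

dt = DenseTree()
dt.active.entries.append((0, "key0"))
dt.start_merge()                     # new active buffer has generation 2
dt.active.entries.append((8192, "key1"))
dt.finish_merge()
assert dt.level1_gen_id == 1 and dt.active.gen_id == 2
```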
Abstract:
The embodiments described herein are directed to efficient merging of metadata managed by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. The metadata managed by the volume layer, i.e., the volume metadata, is illustratively organized as a multi-level dense tree metadata structure, wherein each level of the dense tree metadata structure (dense tree) includes volume metadata entries for storing the volume metadata. The volume metadata entries of an upper level of the dense tree metadata structure are merged with the volume metadata entries of a next lower level of the dense tree metadata structure when the upper level is full. The volume metadata entries of the merged levels are organized as metadata pages and stored as one or more files on the storage devices, e.g., solid state drives (SSDs).
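A compact sketch of the merge-when-full behavior follows, modeling each level as an in-memory mapping; the per-level capacities are assumed values, and the on-disk metadata pages and files are not modeled.

```python
# Illustrative sketch of merging a full upper level into the next
# lower level of the dense tree.

LEVEL_CAPACITY = [4, 16, 64]           # assumed per-level entry limits

def merge_down(levels: list[dict], lvl: int = 0) -> None:
    """If level `lvl` is full, fold its entries into level `lvl + 1`
    (newer entries win), then recurse in case that level also fills."""
    if lvl + 1 >= len(levels) or len(levels[lvl]) < LEVEL_CAPACITY[lvl]:
        return
    lower = dict(levels[lvl + 1])
    lower.update(levels[lvl])          # upper (newer) entries override
    levels[lvl + 1] = lower
    levels[lvl] = {}
    merge_down(levels, lvl + 1)

levels = [{i: f"key{i}" for i in range(4)}, {}, {}]
merge_down(levels)
assert levels[0] == {} and len(levels[1]) == 4
```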
Abstract:
In one embodiment, a node coupled to one or more storage devices executes a storage input/output (I/O) stack having a volume layer that manages volume metadata. The volume metadata is organized as one or more dense tree metadata structures having a top level residing in memory and lower levels residing on the one or more storage devices. The dense tree metadata structures include a first dense tree metadata structure associated with a parent volume and a second dense tree metadata structure associated with a copy of the parent volume. The top level of the first dense tree metadata structure may be copied to the second dense tree metadata structure. The lower levels of the first dense tree metadata structure are initially shared with the second dense tree metadata structure. The shared lower levels may eventually be split as the parent volume diverges from the copy of the parent volume.
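One way the copy-then-share arrangement could look is sketched below; the Level/Volume classes and reference-count bookkeeping are assumptions for illustration, and the split is shown on a direct lower-level update rather than as part of a merge.

```python
# Illustrative sketch: copying a volume duplicates the in-memory top
# level and shares the lower levels by reference until they must change.

class Level:
    def __init__(self, entries=None):
        self.entries = dict(entries or {})
        self.refcount = 1

class Volume:
    def __init__(self, top=None, lower=None):
        self.top = dict(top or {})         # level 0, resides in memory
        self.lower = lower or []           # levels 1..N, reside on storage

    def copy(self) -> "Volume":
        """Copy the top level; share the lower levels initially."""
        for lvl in self.lower:
            lvl.refcount += 1
        return Volume(top=self.top, lower=list(self.lower))

    def update_lower(self, idx: int, offset: int, key: str) -> None:
        """When the volumes diverge (e.g., a merge reaches a shared
        level), split the shared level before modifying it."""
        lvl = self.lower[idx]
        if lvl.refcount > 1:
            lvl.refcount -= 1
            lvl = Level(lvl.entries)
            self.lower[idx] = lvl
        lvl.entries[offset] = key

parent = Volume(top={0: "k0"}, lower=[Level({4096: "k1"})])
child = parent.copy()                      # lower level now shared
child.update_lower(0, 4096, "k2")          # divergence splits the level
assert parent.lower[0].entries[4096] == "k1"
```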
Abstract:
In one embodiment, snapshots and/or clones of storage objects are created and managed by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. Illustratively, the snapshots and clones may be represented as independent volumes, and embodied as respective read-only copies (snapshots) and read-write copies (clones) of a parent volume. Volume metadata is illustratively organized as one or more multi-level dense tree metadata structures, wherein each level of the dense tree metadata structure (dense tree) includes volume metadata entries for storing the metadata. Each snapshot/clone may be derived from a dense tree of the parent volume (parent dense tree). Portions of the parent dense tree may be shared with the snapshot/clone.
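The read-only/read-write distinction between snapshots and clones might be illustrated as below; the DerivedVolume class and the flat mapping standing in for the parent dense tree are simplifying assumptions.

```python
# Illustrative sketch of deriving snapshots (read-only copies) and
# clones (read-write copies) from a parent dense tree.

class DerivedVolume:
    def __init__(self, parent_tree: dict, writable: bool):
        self.shared_tree = parent_tree    # portions shared with the parent
        self.own_tree = {}                # private changes (clones only)
        self.writable = writable

    def read(self, offset: int):
        return self.own_tree.get(offset, self.shared_tree.get(offset))

    def write(self, offset: int, key: str):
        if not self.writable:
            raise PermissionError("snapshots are read-only copies")
        self.own_tree[offset] = key

parent = {0: "keyA", 8192: "keyB"}
snapshot = DerivedVolume(parent, writable=False)   # read-only copy
clone = DerivedVolume(parent, writable=True)       # read-write copy
clone.write(0, "keyC")
assert snapshot.read(0) == "keyA" and clone.read(0) == "keyC"
```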
Abstract:
The embodiments described herein are directed to efficient logging and checkpointing of metadata managed by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. The metadata managed by the volume layer, i.e., the volume metadata, is illustratively organized as a multi-level dense tree metadata structure, wherein each level of the dense tree metadata structure (dense tree) includes volume metadata entries for storing the volume metadata. Each volume metadata entry may be a descriptor that embodies one of a plurality of types, including a data entry, an index entry, and a hole (i.e., absence of data) entry.
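The three entry types named above could be sketched as simple descriptors; the field names are assumptions, and the logging and checkpointing machinery itself is not shown.

```python
# Illustrative sketch of the volume metadata entry descriptor types.
from dataclasses import dataclass

@dataclass
class DataEntry:             # maps an offset range to an extent key
    offset: int
    length: int
    extent_key: bytes

@dataclass
class IndexEntry:            # points to a metadata page at a lower level
    offset: int
    length: int
    page_key: bytes

@dataclass
class HoleEntry:             # records an absence of data for the range
    offset: int
    length: int

# A volume-layer log (and later a checkpoint) would carry a sequence of
# such descriptors, e.g.:
log = [
    DataEntry(0, 4096, b"extent-1"),
    HoleEntry(4096, 4096),
    IndexEntry(8192, 65536, b"page-7"),
]
```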
Abstract:
In one embodiment, a node coupled to one or more storage devices executes a storage input/output (I/O) stack having a volume layer. The volume layer manages volume metadata embodied as mappings from offsets of a logical unit (LUN) to extent keys associated with storage locations for extents on the one or more storage devices. Volume metadata is maintained as a dense tree metadata structure representing successive points in time. The dense tree metadata structure has multiple levels, wherein a top level of the dense tree metadata structure represents newer volume metadata changes and descending levels of the dense tree metadata structure represent older volume metadata changes. The node accesses a latest version of changes to the volume metadata by searching from the top level to the descending levels in the dense tree metadata structure.
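A minimal sketch of the top-down search for the latest mapping follows; levels are modeled as dicts keyed by offset, which glosses over offset ranges and index entries.

```python
# Illustrative sketch of resolving an offset to its extent key by
# searching from the top (newest) level downward; the first match
# found is the latest change.

def lookup(levels: list[dict], offset: int):
    """Return the extent key for `offset`, preferring newer levels."""
    for level in levels:                 # levels[0] is the top level
        if offset in level:
            return level[offset]
    return None                          # unmapped offset

levels = [{4096: "new_key"}, {4096: "old_key", 0: "key0"}]
assert lookup(levels, 4096) == "new_key"   # newest change wins
assert lookup(levels, 0) == "key0"
```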
Abstract:
Systems, methods, and machine-readable media for creating, deleting, and restoring volume snapshots in a remote data store are disclosed. A storage volume and a storage operating system are implemented in a software container. Through a user interface, a user may create a snapshot of the volume to a cloud storage. A user may also delete individual snapshots from the cloud storage. Further, deletion of a most recent snapshot may be deferred (though the snapshot is marked as deleted to the user) until a next snapshot is received. Snapshots in the cloud storage remain manipulable even after destruction of the source volume (for example, by destruction of the container). A controller outside the container may be used by implementing the same API as the controller inside the container. Full restores of snapshots in the cloud are also possible even when the original container and volume have been destroyed.
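The deferred-delete behavior for the most recent cloud snapshot might look roughly like this; the CloudSnapshots class and its methods are hypothetical stand-ins, not the product API.

```python
# Illustrative sketch: the most recent cloud snapshot is only marked
# deleted and is reclaimed once a newer snapshot arrives.

class CloudSnapshots:
    def __init__(self):
        self.snaps = []                    # oldest .. newest snapshot ids
        self.pending_delete = set()

    def create(self, snap_id: str):
        self.snaps.append(snap_id)
        self._reap()

    def delete(self, snap_id: str):
        if snap_id == self.snaps[-1]:      # most recent: defer deletion
            self.pending_delete.add(snap_id)
        else:
            self.snaps.remove(snap_id)

    def visible(self):
        """What the user sees: pending deletes already appear gone."""
        return [s for s in self.snaps if s not in self.pending_delete]

    def _reap(self):
        """Once a newer snapshot exists, actually remove deferred ones."""
        newest = self.snaps[-1]
        for s in list(self.pending_delete):
            if s != newest:
                self.snaps.remove(s)
                self.pending_delete.discard(s)

cs = CloudSnapshots()
cs.create("snap1")
cs.delete("snap1")               # most recent: deferred, hidden from user
assert cs.visible() == []
cs.create("snap2")               # arrival of next snapshot reaps snap1
assert cs.snaps == ["snap2"]
```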
Abstract:
Techniques are provided for orchestrating operations between a storage environment and a computing environment hosting virtual machines. A virtual machine proxy, associated with a computing environment hosting a virtual machine, is accessed by an orchestrator to identify the virtual machine and properties of the virtual machine. A storage proxy, associated with a storage environment comprising a volume within which snapshots of the virtual machine are to be stored, is accessed by the orchestrator to initialize a backup procedure. The orchestrator utilizes the virtual machine proxy to create a snapshot of the virtual machine. The orchestrator utilizes the storage proxy to back up the snapshot to the volume using the backup procedure.
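The orchestration flow could be sketched as below; the VMProxy and StorageProxy interfaces are hypothetical stand-ins for the proxies described above, not an actual API.

```python
# Illustrative sketch of the orchestrator coordinating the virtual
# machine proxy and the storage proxy.

class VMProxy:
    def get_vm(self, name):            # identify the VM and its properties
        return {"name": name, "disks": ["disk0"]}
    def create_snapshot(self, vm):
        return f"snap-of-{vm['name']}"

class StorageProxy:
    def init_backup(self, volume):     # initialize the backup procedure
        return {"volume": volume}
    def backup(self, procedure, snapshot):
        print(f"backing up {snapshot} to {procedure['volume']}")

def orchestrate(vm_proxy, storage_proxy, vm_name, volume):
    vm = vm_proxy.get_vm(vm_name)                  # access the VM proxy
    procedure = storage_proxy.init_backup(volume)  # access the storage proxy
    snap = vm_proxy.create_snapshot(vm)            # snapshot the VM
    storage_proxy.backup(procedure, snap)          # back the snapshot up

orchestrate(VMProxy(), StorageProxy(), "vm01", "vol1")
```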
Abstract:
Systems and methods for creation and retention of immutable snapshots to facilitate ransomware protection are provided. According to one embodiment, multiple use cases for retention of snapshots are supported, including (i) maintaining a locked snapshot on a source volume of a first storage system on which it was originally created for at least an associated immutable retention time; (ii) replicating the locked snapshot to a destination volume of a second storage system and also maintaining the replica of the locked snapshot on the destination volume for at least the associated immutable retention time; and (iii) maintaining an unlocked snapshot on the source volume, replicating the unlocked snapshot to the destination volume, locking the replicated snapshot on the destination volume when it has an associated non-zero immutable retention time, and thereafter maintaining the replica on the destination volume in accordance with the immutable retention time.
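The three retention cases might be modeled roughly as follows; the Snapshot class, the epoch-second timestamps, and the replicate helper are assumptions for illustration.

```python
# Illustrative sketch of the retention rules for the three use cases.
import time

class Snapshot:
    def __init__(self, name, retention_secs=0, locked=False):
        self.name = name
        self.retention_secs = retention_secs
        self.locked_until = time.time() + retention_secs if locked else None

    def can_delete(self) -> bool:
        """A locked snapshot may not be deleted before its immutable
        retention time expires."""
        return self.locked_until is None or time.time() >= self.locked_until

def replicate(snap: Snapshot) -> Snapshot:
    """Case (ii): a locked snapshot stays locked on the destination.
    Case (iii): an unlocked snapshot with a non-zero retention time is
    locked when it lands on the destination volume."""
    lock_replica = snap.locked_until is not None or snap.retention_secs > 0
    return Snapshot(snap.name, snap.retention_secs, locked=lock_replica)

src = Snapshot("hourly.0", retention_secs=3600, locked=True)   # case (i)
dst = replicate(src)                                           # case (ii)
assert not src.can_delete() and not dst.can_delete()
```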