Abstract:
A snap restore technique efficiently restores snapshots of storage containers served by a storage input/output (I/O) stack executing on one or more nodes of a cluster. A Small Computer System Interface (SCSI) administration layer interacts with a volume layer of the storage I/O stack to manage and implement a snap restore procedure that restores one or more snapshots of a storage container. The storage container may be a logical unit (LUN) embodied as a parent volume (active volume), and the snapshot may be represented as an independent volume embodied as a read-only copy of the active volume. The snap restore procedure may be configured to allow restoration of a LUN to a single snapshot, or restoration of a plurality of LUNs organized as a consistency group from a group of snapshots. Restoration of the LUN from a snapshot involves (i) creation of another independent volume embodied as a read-write copy (clone) of the snapshot, (ii) replacement of the (old) active volume with the clone, (iii) deletion of the old active volume, and (iv) mapping of the LUN to the clone (i.e., the new active volume).
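The four-step restore sequence can be pictured with a minimal sketch, assuming hypothetical Volume and Lun types and an illustrative volume registry; none of these names come from the actual storage I/O stack.

```python
# Minimal sketch of the snap restore steps (i)-(iv); Volume, Lun, and
# the registry are illustrative, not a real storage I/O stack API.
from dataclasses import dataclass, field

@dataclass
class Volume:
    name: str
    read_only: bool = False
    blocks: dict = field(default_factory=dict)

@dataclass
class Lun:
    name: str
    active_volume: Volume

def snap_restore(lun: Lun, snapshot: Volume, volumes: dict) -> None:
    """Restore a LUN from a read-only snapshot volume."""
    assert snapshot.read_only, "a snapshot is a read-only copy"
    # (i) create an independent read-write copy (clone) of the snapshot
    clone = Volume(name=snapshot.name + "-clone",
                   blocks=dict(snapshot.blocks))
    volumes[clone.name] = clone
    old_active = lun.active_volume
    # (ii) replace the old active volume with the clone, and
    # (iv) map the LUN to the clone, the new active volume
    lun.active_volume = clone
    # (iii) delete the old active volume
    del volumes[old_active.name]

volumes = {"vol0": Volume("vol0", blocks={0: "new"}),
           "snap0": Volume("snap0", read_only=True, blocks={0: "old"})}
lun = Lun("lun0", volumes["vol0"])
snap_restore(lun, volumes["snap0"], volumes)
assert lun.active_volume.blocks == {0: "old"}   # LUN now serves snapshot data
```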
Abstract:
Intelligent snapshot tiering facilitates efficient management and efficient restoration of snapshots. With intelligent snapshot tiering, a storage appliance can limit cross-tier migration to the invalidated data blocks of a snapshot instead of migrating the entire snapshot. Based on a policy, the storage appliance can identify a snapshot to be migrated to another storage tier and then determine which of its data blocks are invalidated by the immediately succeeding snapshot. This limits network bandwidth consumption to the invalidated data blocks and keeps the valid data blocks at the faster-access storage tier, since the more recent snapshots are the ones most likely to be restored.
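A minimal sketch of the block-selection step, assuming snapshots are modeled as mappings from block number to block version; the tier structure and function names are illustrative, not a real appliance API.

```python
# Hypothetical sketch of intelligent snapshot tiering: only the blocks
# of a snapshot that its immediate successor invalidated are migrated.

def invalidated_blocks(snapshot: dict, successor: dict) -> set:
    """Blocks of `snapshot` that the immediately succeeding snapshot
    overwrote or freed; only these need to move to the colder tier."""
    return {blk for blk, version in snapshot.items()
            if successor.get(blk) != version}

def tier_snapshot(snapshot: dict, successor: dict, cold_tier: dict) -> None:
    for blk in invalidated_blocks(snapshot, successor):
        cold_tier[blk] = snapshot[blk]   # migrate only invalidated blocks
    # valid (shared) blocks stay on the fast tier, since the more recent
    # snapshots referencing them are the likely restore targets

snap1 = {0: "a", 1: "b", 2: "c"}
snap2 = {0: "a", 1: "x", 2: "c"}         # block 1 rewritten by the successor
cold = {}
tier_snapshot(snap1, snap2, cold)
assert cold == {1: "b"}                  # bandwidth spent on one block only
```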
Abstract:
One or more techniques and/or computing devices are provided for cross-platform replication. For example, a replication relationship may be established between a first storage endpoint and a second storage endpoint, where at least one of the storage endpoints, such as the first storage endpoint, lacks the functionality (or has incompatible functionality) to perform and manage replication because the storage endpoints run different storage platforms that store data differently, use different control operations and interfaces, etc. Accordingly, a replication destination workflow, a replication source workflow, and/or a proxy representing the first storage endpoint may be implemented at the second storage endpoint, which comprises the replication functionality. In this way, replication, such as snapshot replication, may be implemented between the storage endpoints by the second storage endpoint using the replication destination workflow, the replication source workflow, and/or the proxy, which either locally executes tasks or routes tasks to the first storage endpoint, such as for data access.
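The proxy arrangement might look like the following sketch, under assumed names (RemoteFirstEndpoint, FirstEndpointProxy, the task kinds); it only illustrates the execute-locally-or-route decision, not an actual replication engine.

```python
# Hypothetical sketch of the proxy at the second (replication-capable)
# endpoint: control tasks run locally, data-access tasks are routed to
# the first endpoint, which actually holds the data.

class RemoteFirstEndpoint:
    """Stand-in transport to the first (non-replicating) endpoint."""
    def send(self, task):
        return f"first endpoint served {task['kind']}: {task['payload']}"

class FirstEndpointProxy:
    def __init__(self, remote):
        self.remote = remote

    def run(self, task):
        if task["kind"] == "data_access":
            # the first endpoint holds the data, so route to it
            return self.remote.send(task)
        # control tasks execute locally, since the first platform lacks
        # compatible replication control operations
        return f"proxy handled {task['kind']} locally"

proxy = FirstEndpointProxy(RemoteFirstEndpoint())
print(proxy.run({"kind": "data_access", "payload": "read snapshot block"}))
print(proxy.run({"kind": "snapshot_bookkeeping", "payload": None}))
```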
Abstract:
An N-way merge technique efficiently updates metadata in accordance with an N-way merge operation managed by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. The metadata is embodied as mappings from logical block addresses (LBAs) of a logical unit (LUN) accessible by a host to durable extent keys, and is organized as a multi-level dense tree. The mappings are organized such that a higher level of the dense tree contains more recent mappings than the next lower level, i.e., the level immediately below. The N-way merge operation is an efficient (i.e., optimized) way of updating the volume metadata mappings of the dense tree by merging the mapping content of all three levels in a single iteration, as opposed to merging the content of the first level with the content of the second level in a first iteration of a two-way merge operation and then merging the result of that first iteration with the content of the third level in a second iteration.
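A single-pass merge of this kind can be sketched with a standard k-way merge, assuming each level is a sorted list of (LBA, extent key) pairs with level 0 the most recent; heapq.merge and the tie-break on level number are illustrative choices, not the stack's actual implementation.

```python
# Hypothetical single-pass N-way merge of dense-tree levels: every
# level is consumed in one iteration, and for duplicate LBAs the entry
# from the highest (most recent) level wins.
import heapq

def n_way_merge(levels):
    """Merge the mappings of all levels in a single pass."""
    def tagged(level, lvl):
        for lba, key in level:
            yield (lba, lvl, key)        # sort by LBA, then by recency

    merged, last_lba = [], None
    streams = [tagged(level, lvl) for lvl, level in enumerate(levels)]
    for lba, lvl, key in heapq.merge(*streams):
        if lba != last_lba:              # first hit per LBA is the newest
            merged.append((lba, key))
            last_lba = lba
    return merged

level0 = [(10, "K10-new"), (30, "K30")]  # most recent mappings
level1 = [(10, "K10"), (20, "K20")]
level2 = [(20, "K20-old"), (40, "K40")]
assert n_way_merge([level0, level1, level2]) == [
    (10, "K10-new"), (20, "K20"), (30, "K30"), (40, "K40")]
```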
Abstract:
A technique restores a file system of a storage input/output (I/O) stack to a deterministic point-in-time state in the event of failure (loss) of the non-volatile random access memory (NVRAM) of a node. The technique enables restoration of the file system to a safepoint stored on storage devices, such as solid-state drives (SSDs), of the node with minimal data and metadata loss. The safepoint is a point in time during the execution of I/O requests (e.g., write operations) at which the data and related metadata of the write operations issued prior to that point are safely persisted on SSD, such that the metadata relating to the image of the file system on SSD (on-disk) is consistent and complete. Upon reboot after NVRAM loss, the technique identifies (i) the most recent safepoint, as well as (ii) the inflight writes that were persistently stored on disk after the most recent safepoint. The data and metadata of those inflight writes are then deleted to return the on-disk file system to its state at the most recent safepoint.
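The two recovery steps can be sketched as follows, assuming each persisted write carries a monotonically increasing sequence number and safepoints are recorded as sequence numbers; these structures are illustrative.

```python
# Hypothetical sketch of safepoint-based recovery after NVRAM loss:
# find the most recent safepoint, then discard the inflight writes that
# reached disk after it.

def recover(on_disk_writes, safepoints):
    """Roll the on-disk file system back to the most recent safepoint."""
    latest = max(safepoints)             # (i) most recent safepoint
    # (ii) inflight writes persisted on disk after that safepoint
    inflight = [w for w in on_disk_writes if w["seq"] > latest]
    for w in inflight:
        delete_data_and_metadata(w)      # on-disk state returns to safepoint
    return latest

def delete_data_and_metadata(write):
    print(f"discarding write seq={write['seq']} lba={write['lba']}")

writes = [{"seq": s, "lba": 4096 * s} for s in range(1, 8)]
recover(writes, safepoints=[3, 5])       # discards seq 6 and 7
```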
Abstract:
A system can maintain multiple queues for deduplication requests of different priorities. The system can also designate priorities for storage units. The scheduling priority of a deduplication request is based on the priority of the storage unit indicated in the request and on the trigger for the request.
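A minimal sketch of such a scheduler, with assumed priority classes and trigger names; the actual system's priority values and trigger types are not specified here.

```python
# Hypothetical sketch: the scheduling priority of a dedup request
# combines the designated priority of its storage unit with the
# trigger that generated the request.
import heapq, itertools

UNIT_PRIORITY = {"gold": 0, "silver": 1, "bronze": 2}           # lower = sooner
TRIGGER_PRIORITY = {"change-rate-threshold": 0, "scheduled": 1, "manual": 2}

class DedupScheduler:
    def __init__(self):
        self._heap, self._tie = [], itertools.count()

    def submit(self, unit, trigger, request):
        prio = (UNIT_PRIORITY[unit], TRIGGER_PRIORITY[trigger])
        heapq.heappush(self._heap, (prio, next(self._tie), request))

    def next_request(self):
        return heapq.heappop(self._heap)[-1]   # highest-priority request

sched = DedupScheduler()
sched.submit("bronze", "manual", "dedup volA")
sched.submit("gold", "scheduled", "dedup volB")
assert sched.next_request() == "dedup volB"
```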
Abstract:
Embodiments herein are directed to efficient crash recovery of persistent metadata managed by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. Volume metadata managed by the volume layer is organized as a multi-level dense tree, wherein each level of the dense tree includes volume metadata entries for storing the volume metadata. When a level of the dense tree is full, the volume metadata entries of the level are merged with the next lower level of the dense tree. During a merge operation, two sets of generation IDs may be used in accordance with a double buffer arrangement: a first generation ID for the append buffer that is full (i.e., a merge staging buffer) and a second, incremented generation ID for the append buffer that accepts new volume metadata entries. Upon completion of the merge operation, the lower level (e.g., level 1) to which the merge is directed is assigned the generation ID of the merge staging buffer.
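The double-buffer bookkeeping might be sketched as follows, with illustrative in-memory structures; the actual on-disk format and merge mechanics are not shown.

```python
# Hypothetical sketch of the double-buffered generation IDs: the full
# append buffer keeps the first generation ID as the merge staging
# buffer, a fresh buffer with an incremented ID accepts new entries,
# and on completion the merged-to level inherits the staging ID.

class Level:
    def __init__(self, gen_id=0):
        self.entries, self.gen_id = [], gen_id

class Level0:
    def __init__(self):
        self.gen_id = 1                  # generation ID of the active buffer
        self.active, self.staging = [], []

    def start_merge(self):
        # first generation ID stays with the full buffer (merge staging)
        staging_gen, self.staging = self.gen_id, self.active
        # second, incremented ID for the buffer taking new entries
        self.gen_id, self.active = self.gen_id + 1, []
        return staging_gen

    def finish_merge(self, level1, staging_gen):
        level1.entries = sorted(level1.entries + self.staging)
        level1.gen_id = staging_gen      # level 1 is assigned the staging ID
        self.staging = []

l0, l1 = Level0(), Level()
l0.active.append(("lba-0", "key-A"))
gen = l0.start_merge()                   # new entries now carry gen 2
l0.active.append(("lba-1", "key-B"))
l0.finish_merge(l1, gen)
assert l1.gen_id == 1 and l0.gen_id == 2
```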
Abstract:
In one embodiment, a node coupled to one or more storage devices executes a storage input/output (I/O) stack having a volume layer, a persistence layer, and an administration layer that interact to create a copy of a parent volume associated with a storage container on the one or more storage devices. A copy create start message is received at the persistence layer from the administration layer. The persistence layer ensures that dirty data for the parent volume is incorporated into the copy of the parent volume. New data for the parent volume received at the persistence layer during creation of the copy is prevented from being incorporated into the copy. A reply to the copy create start message is then sent from the persistence layer to the administration layer to initiate creation of the copy of the parent volume at the volume layer.
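A sketch of this handshake, with assumed message and method names; it only illustrates the ordering (flush dirty data, fence new writes, then reply), not the actual layer interfaces.

```python
# Hypothetical sketch of the copy-create handshake at the persistence
# layer: dirty parent data is flushed into the copy, new writes are
# fenced out of it, and a reply lets the administration layer start
# the copy at the volume layer.

class PersistenceLayer:
    def __init__(self):
        self.dirty = []                  # unflushed writes for the parent
        self.fencing = False             # excludes new data from the copy
        self.flushed = []                # stands in for data on the copy path

    def on_copy_create_start(self, admin):
        self.fencing = True              # new writes won't enter the copy
        self.flushed.extend(self.dirty)  # dirty data joins the copy
        self.dirty.clear()
        admin.reply("copy_create_start_ack")   # volume layer may proceed

    def on_write(self, data):
        if self.fencing:
            # arrived during copy creation: kept out of the copy
            return "held for post-copy apply"
        self.dirty.append(data)
        return "buffered"

class AdminLayer:
    def reply(self, msg):
        print("persistence ->", msg)     # triggers the copy at the volume layer

p, a = PersistenceLayer(), AdminLayer()
p.on_write("w1")                         # dirty data written before the copy
p.on_copy_create_start(a)
assert p.on_write("w2") == "held for post-copy apply"
```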