Abstract:
A system and method for handling multi-node failures in a disaster recovery cluster are provided. In the event of an error condition, a switchover operation occurs from the failed nodes to one or more surviving nodes. Data stored in non-volatile random access memory is recovered by the surviving nodes to bring storage objects, e.g., disks, aggregates and/or volumes, into a consistent state.
Abstract:
A system and method for avoiding object identifier collisions in a cluster environment are provided. Upon creation of the cluster, volume location databases negotiate ranges for data set identifiers (DSIDs) between a first site and a second site of the cluster. Any pre-existing objects are remapped into the object identifier range associated with the particular site hosting the object.
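The range negotiation and remapping described above can be illustrated with a minimal sketch. The abstract does not say how the ranges are chosen or how new identifiers are picked, so the even split of the identifier space, the "lowest free identifier" rule, and the names DsidRange, negotiate_ranges and remap_existing are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DsidRange:
    """Half-open range [start, end) of data set identifiers owned by one site."""
    start: int
    end: int

    def contains(self, dsid: int) -> bool:
        return self.start <= dsid < self.end


def negotiate_ranges(id_space: int) -> tuple[DsidRange, DsidRange]:
    """Assumed policy: split the identifier space in half, one range per site."""
    mid = id_space // 2
    return DsidRange(0, mid), DsidRange(mid, id_space)


def remap_existing(dsids: list[int], own: DsidRange) -> dict[int, int]:
    """Map each pre-existing DSID to an identifier inside the site's own range.

    Identifiers already inside the range are kept; every other object gets
    the lowest identifier in the range that is still free (assumed rule).
    """
    used = {d for d in dsids if own.contains(d)}
    mapping: dict[int, int] = {}
    candidate = own.start
    for dsid in dsids:
        if own.contains(dsid):
            mapping[dsid] = dsid
            continue
        while candidate in used:
            candidate += 1
        if candidate >= own.end:
            raise ValueError("site range exhausted")
        mapping[dsid] = candidate
        used.add(candidate)
    return mapping


# Both sites start with the same pre-existing DSIDs, which would collide
# once the sites are joined into one cluster.
site_a, site_b = negotiate_ranges(1000)
map_a = remap_existing([3, 7, 600], site_a)   # 600 lies outside site A's range
map_b = remap_existing([3, 7, 600], site_b)   # 3 and 7 lie outside site B's range
```

After remapping, the identifiers in use at the two sites come from disjoint ranges, so an object created at either site can no longer collide with one at the other.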
Abstract:
Systems and methods are provided herein for efficiently switching a client in a storage network from one or more primary storage resources to one or more disaster recovery (DR) resources. Embodiments may implement synchronization between such resources on a data plane and a control plane so that a transition between resources is minimally disruptive to the client. Moreover, embodiments may provide processing resources that allow a client to be switched from a primary storage resource to a secondary storage resource with minimal administrative interaction.
Abstract:
A technique maintains consistent throughput of processing of input/output (I/O) requests by a storage system when changing configuration of one or more Redundant Array of Independent Disks (RAID) groups of storage devices, such as disks, within the storage system. The configuration of a RAID group (i.e., RAID configuration) may be represented by RAID objects (e.g., reference-counted data structures) stored in a memory of the storage system. Illustratively, the RAID objects may be organized as a RAID configuration hierarchy including a top-level RAID object (e.g., RAID group data structure) that is linked (e.g., via one or more pointers) to one or more intermediate-level RAID objects (e.g., disk and segment data structures) which, in turn, are linked to one or more low-level RAID objects (e.g., chunk data structures). According to the technique, a snapshot of a current RAID configuration (i.e., current configuration snapshot) may be created by incrementing a reference count of the current top-level object of the hierarchy and attaching (e.g., via a pointer) the current configuration snapshot to a current I/O request processed by the storage system.
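The snapshot mechanism described above can be sketched as follows. Only the reference-count increment and the attachment of the snapshot to the I/O request come from the abstract. The class names, the release of the reference on I/O completion, and the way a configuration change swaps in a new top-level object are assumptions made for illustration, and the hierarchy is flattened to a single list of disks.

```python
class RaidConfig:
    """Top-level RAID object; the lower levels are reduced to a tuple of disks."""

    def __init__(self, disks):
        self.disks = tuple(disks)   # never modified after creation
        self.refcount = 1           # reference held by the "current" pointer
        self.freed = False

    def get(self):
        """Take a snapshot by incrementing the reference count."""
        self.refcount += 1
        return self

    def put(self):
        """Drop a reference; the object is released when the count hits zero."""
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True       # stand-in for freeing the hierarchy


class IoRequest:
    def __init__(self, block, config):
        self.block = block
        self.config = config.get()  # attach the current configuration snapshot

    def complete(self):
        """Resolve the target disk using the snapshot, then release it."""
        disk = self.config.disks[self.block % len(self.config.disks)]
        self.config.put()
        return disk


class StorageSystem:
    def __init__(self, disks):
        self.current = RaidConfig(disks)

    def submit(self, block):
        return IoRequest(block, self.current)

    def change_config(self, disks):
        """Publish a new configuration without waiting for in-flight I/O."""
        old, self.current = self.current, RaidConfig(disks)
        old.put()                   # old object survives until its I/O drains


system = StorageSystem(["d0", "d1"])
inflight = system.submit(block=3)
old_config = inflight.config
system.change_config(["d0", "d1", "d2"])
later = system.submit(block=3)
```

In this sketch the in-flight request keeps resolving against the two-disk configuration it started with, while requests submitted after the change see three disks. Neither path has to stall for the other, which is the throughput property the abstract is concerned with.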
Abstract:
A cluster-wide consistency checker ensures that two file systems of a storage input/output (I/O) stack executing on each node of a cluster are self-consistent as well as consistent with respect to each other. The file systems include a deduplication file system and a host-facing file system that cooperate to provide a layered file system of the storage I/O stack. The deduplication file system is a log-structured file system managed by an extent store layer of the storage I/O stack, whereas the host-facing file system is managed by a volume layer of the stack. Illustratively, each log-structured file system implements a key-value store and cooperates with other nodes of the cluster to provide a cluster-wide (global) key-value store. The consistency checker verifies and/or fixes on-disk structures of the layered file system to ensure its consistency. To that end, the consistency checker may determine whether there are inconsistencies in the key-value store and, if so, reconcile those inconsistencies from a client (volume layer) perspective.
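One way to picture the cross-check between the two layers is the sketch below. It is an illustration under stated assumptions, not the checker the abstract describes: the key-value store is a plain dictionary, the volume layer is a mapping from volume name to referenced keys, and the repair policy (the volume layer's view wins, so unreferenced extents are dropped) is only a guess at what "from a client perspective" could mean. The function names are hypothetical.

```python
def check_consistency(volume_refs, extent_store):
    """Compare keys referenced by the volume layer with keys in the store."""
    referenced = {key for keys in volume_refs.values() for key in keys}
    stored = set(extent_store)
    return {
        "dangling": sorted(referenced - stored),  # referenced but not stored
        "orphaned": sorted(stored - referenced),  # stored but never referenced
    }


def reconcile(volume_refs, extent_store):
    """Assumed policy: drop extents that no volume references.

    Dangling references cannot be repaired from the store alone, so they
    are only reported.
    """
    report = check_consistency(volume_refs, extent_store)
    for key in report["orphaned"]:
        del extent_store[key]
    return report


volume_refs = {"vol1": ["k1", "k2"], "vol2": ["k2", "k4"]}
extent_store = {"k1": b"a", "k2": b"b", "k3": b"c"}
report = reconcile(volume_refs, extent_store)
```

Here `k4` is reported as dangling because a volume references it but the store does not hold it, and `k3` is removed because no volume references it.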
Abstract:
A technique efficiently configures a peered cluster storage environment. The configuration technique illustratively includes three phases: a discovery phase, a node setup phase and a cluster setup phase. The discovery phase may be employed to initiate discovery of nodes of a disaster recovery (DR) group through transmission of multicast advertisement packets by the nodes over interconnects, including a Fibre Channel (FC) fabric, to each other node of the group. In the node setup phase, each node of a cluster assigns its relationships to the nodes discovered and present in the FC fabric; illustratively, the assigned relationships include high availability (HA) partner, DR primary partner and DR auxiliary partner. In the cluster setup phase, the discovered nodes of the FC fabric are organized as the peered cluster storage environment (DR group) configured to service data in a highly reliable and available manner.
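The node setup phase can be sketched for a four-node DR group as below. The abstract names the three relationships but not the rule for assigning them, so the rule used here is an assumption: the HA partner is the other node in the same cluster, the DR primary partner is the remote node in the same position, and the DR auxiliary partner is the remaining remote node. The Node type and its slot field are hypothetical, and discovery itself is represented by a ready-made list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    cluster: str
    slot: int   # hypothetical position within the HA pair (0 or 1)


def assign_relationships(me, discovered):
    """Assign HA, DR primary and DR auxiliary partners for one node."""
    local = [n for n in discovered if n.cluster == me.cluster and n != me]
    remote = [n for n in discovered if n.cluster != me.cluster]
    if len(local) != 1 or len(remote) != 2:
        raise ValueError("expected a DR group of two clusters with two nodes each")
    dr_primary = next(n for n in remote if n.slot == me.slot)
    dr_auxiliary = next(n for n in remote if n.slot != me.slot)
    return {
        "ha_partner": local[0].name,
        "dr_primary": dr_primary.name,
        "dr_auxiliary": dr_auxiliary.name,
    }


# Result of the discovery phase, given here as a fixed list.
discovered = [
    Node("A1", "siteA", 0),
    Node("A2", "siteA", 1),
    Node("B1", "siteB", 0),
    Node("B2", "siteB", 1),
]
roles_a1 = assign_relationships(discovered[0], discovered)
roles_b2 = assign_relationships(discovered[3], discovered)
```

Running the same function on every discovered node yields a symmetric set of relationships, which the cluster setup phase could then use to form the DR group.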
Abstract:
One or more techniques and/or systems are provided for load balancing between storage controllers. For example, a first storage controller and a second storage controller may be configured at a first storage site according to a high availability configuration, and may be configured as disaster recovery partners for a third storage controller and a fourth storage controller at a second storage site. If the first storage controller fails, the second storage controller provides failover operation for a first storage device. If a disaster occurs at the second storage site, the second storage controller provides switchover operation for a third storage device and a fourth storage device. Responsive to the first storage controller being restored, the third storage device may be reassigned from the second storage controller to the first storage controller for load balancing at the first storage site during disaster recovery of the second storage site.
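The sequence of failover, switchover and rebalancing described above can be traced with a small ownership table. This is a sketch under assumptions: the device and controller names, the initial ownership of the second device, and the DrGroup class are made up for illustration. Whether the first storage device is also given back on restore is not stated in the abstract, so it is left where it is.

```python
from collections import Counter


class DrGroup:
    """Tracks which controller currently serves each storage device."""

    def __init__(self):
        # Assumed initial state: each controller owns one device.
        self.owner = {
            "dev1": "ctrl1", "dev2": "ctrl2",   # first storage site
            "dev3": "ctrl3", "dev4": "ctrl4",   # second storage site
        }

    def fail_first_controller(self):
        """HA failover: the second controller takes over the first device."""
        self.owner["dev1"] = "ctrl2"

    def disaster_at_second_site(self):
        """DR switchover: the second controller takes over devices 3 and 4."""
        self.owner["dev3"] = "ctrl2"
        self.owner["dev4"] = "ctrl2"

    def restore_first_controller(self):
        """Rebalance: move the third device to the restored first controller."""
        if self.owner["dev3"] == "ctrl2":
            self.owner["dev3"] = "ctrl1"
        # Giveback of dev1 is not described in the abstract and is omitted.

    def load(self):
        """Number of devices served by each controller."""
        return dict(Counter(self.owner.values()))


group = DrGroup()
group.fail_first_controller()
group.disaster_at_second_site()
load_before = group.load()
group.restore_first_controller()
load_after = group.load()
```

Before the rebalance the second controller serves all four devices. Afterwards the restored first controller carries part of the switched-over load while the second site is still being recovered.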