Abstract:
Failover methods and systems for a networked storage environment are provided. A metadata data structure is generated, before starting a replay of entries at a log stored in a non-volatile memory of a second storage node, during a failover operation initiated in response to a failure at a first storage node. The second storage node operates as a partner node of the first storage node, and the metadata structure stores a metadata attribute of each log entry. Furthermore, the metadata attribute of each log entry is persistently stored. The persistently stored metadata attribute is used to respond to a read request received during the replay by the second storage node, while a write request metadata attribute of a write request is used for executing the write request received by the second storage node during the replay.
Abstract:
A system and method for handling multi-node failures in a disaster recovery cluster is provided. In the event of an error condition, a switchover operation occurs from the failed nodes to one or more surviving nodes. Data stored in non-volatile random access memory is recovered by the surviving nodes to bring storage objects, e.g., disks, aggregates and/or volumes into a consistent state.
Abstract:
Techniques for detecting data loss during site switchover are disclosed. An example method includes storing at NVRAM of a first node a plurality of operations of a second node, the first and second nodes being disaster recovery partners. The method also includes during a switchover from the second node to the first node, receiving an indication of a first number of operations yet to be completed. The method further includes comparing the first number to a second number of operations in the plurality of operations stored at the NVRAM of the first node. The method also includes in response to the comparing, determining whether at least one operation is missing from the plurality of operations stored in the NVRAM of the first node. The method further includes in response to determining that at least one operation is missing, failing at least one volume.
Abstract:
A system and method for handling multi-node failures in a disaster recovery cluster is provided. In the event of an error condition, a switchover operation occurs from the failed nodes to one or more surviving nodes. Data stored in non-volatile random access memory is recovered by the surviving nodes to bring storage objects, e.g., disks, aggregates and/or volumes into a consistent state.
Abstract:
Methods and systems for co-locating journaling and data storage are provided. Separate journal and volume partitions may be maintained within each logical storage unit (e.g., Logical Unit Number (LUN)) of a distributed storage system. Journaling of metadata associated with write requests received from one or more clients may be distributed by identifying a destination logical storage unit to which data associated with a given write request is to be stored and causing the data and metadata to be persisted to disk by journaling the metadata and the data to respective portions of an active log within the journal partition of the destination logical storage unit. By using the same logical storage unit for both journaling of write requests and writing the data associated with such write requests, the bottleneck due to there being only a single device or storage unit handling all metadata for all write requests can be avoided.
Abstract:
A data management system can include a set of storage media configured to implement a storage space and a set of controllers. The set of controllers can be configured to write to the storage space and to implement a set of nodes. The set of controllers can include a first controller that implements a first node and includes a first persistent memory, a second controller that implements a second node and includes a second persistent memory and a third controller that implements a third node and includes a third persistent memory. The third node can be configured to write third node journal data to the first persistent memory. The first node can be configured to generate first node journal data based on a first request received from a backend, write the first node journal data to the first persistent memory, and replicate the journal data to the second persistent memory.
Abstract:
Systems and methods are for improving latency and throughput for metadata-heavy workloads and/or workloads including metadata bursts by decoupling data journal records and metadata-only journal records are provided. According to one embodiment, expedited and independent space reclamation is facilitated by differentiating between various types of journal records chains of which should be retained until different conditions are met. For example, data journal records may be added to data journal record chains within a persistent KV store and metadata-only journal records may be added to metadata-only journal record chains within the persistent KV store. Reclamation of spaced utilized by a data journal record chain may be reclaimed after both remote node data flush has been completed and the completion of a local CP for all records in the chain, whereas records of a metadata-only journal chain may be freed independently upon completion of a local CP for all records.
Abstract:
Failover methods and systems for a networked storage environment are provided. In one aspect, a read request associated with a first storage object is received, during a replay of entries of a log stored in a non-volatile memory of a second storage node for a failover operation initiated in response to a failure at a first storage node. The second storage node operates as a partner node of the first storage node. The read request is processed using a filtering data structure that is generated from the log prior to the replay and identifies each log entry. The read request is processed when the log does not have an entry associated with the read request, and when the filtering data structure includes an entry associated with the read request, the requested data is located at the non-volatile memory.
Abstract:
Failover methods and systems for a networked storage environment are provided. In one aspect, a read request associated with a first storage object is received, during a replay of entries of a log stored in a non-volatile memory of a second storage node for a failover operation initiated in response to a failure at a first storage node. The second storage node operates as a partner node of the first storage node. The read request is processed using a filtering data structure that is generated from the log prior to the replay and identifies each log entry. The read request is processed when the log does not have an entry associated with the read request, and when the filtering data structure includes an entry associated with the read request, the requested data is located at the non-volatile memory.
Abstract:
Failover methods and systems for a networked storage environment are provided. In one aspect, a read request associated with a first storage object is received, during a replay of entries of a log stored in a non-volatile memory of a second storage node for a failover operation initiated in response to a failure at a first storage node. The second storage node operates as a partner node of the first storage node. The read request is processed using a filtering data structure that is generated from the log prior to the replay and identifies each log entry. The read request is processed when the log does not have an entry associated with the read request, and when the filtering data structure includes an entry associated with the read request, the requested data is located at the non-volatile memory.