Abstract:
Various embodiments of systems and methods are disclosed for initially synchronizing redundant data (e.g., a mirror, a replica, or a set of parity information) with an original volume. State information identifies which regions of the original volume are currently valid, and only valid regions of the original volume are used to generate the values of the redundant data during the initial synchronization. For example, if the redundant data is a set of parity information, synchronizing the redundant data involves calculating one or more parity values based on the valid regions of the volume. If the redundant data is a duplicate copy (e.g., a mirror or replica) of the volume, synchronizing the redundant data involves copying the valid regions of the volume to the duplicate copy of the volume. If the original volume includes any invalid regions, unnecessary copying and/or processing for those regions can be avoided during the initial synchronization.
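The mirror case can be illustrated with a minimal sketch. It assumes the volume is a list of equally sized regions and the state information is a list of booleans; the name `sync_mirror` and this data layout are hypothetical and are not taken from the disclosure.

```python
def sync_mirror(volume, valid, mirror):
    """Copy only the valid regions of `volume` into `mirror`.

    `valid[i]` is the (assumed) state information for region i.
    Returns the number of regions actually copied.
    """
    copied = 0
    for i, is_valid in enumerate(valid):
        if is_valid:
            mirror[i] = volume[i]   # invalid regions are skipped entirely
            copied += 1
    return copied


volume = [b"a", b"b", b"c", b"d"]
valid = [True, False, True, False]
mirror = [None] * len(volume)
print(sync_mirror(volume, valid, mirror), mirror)
```

Only two of the four regions are copied here; the invalid regions cost no I/O during the initial synchronization.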
Abstract:
A virtual copy of data stored in a first memory is created in a second memory. Creating the virtual copy includes, in one embodiment, creating first and second tables in memory, each of which comprises a plurality of multibit entries. Each entry of the first table corresponds to a respective memory region of the first memory. Each entry of the second table corresponds to a respective memory region of the second memory. The first bit of each entry in the first and second tables indicates whether the corresponding memory region of the first and second memories, respectively, contains valid data. The second bit of each entry in the first and second tables indicates whether data in the corresponding memory region of the first and second memories, respectively, has been modified since the creation of the first and second tables, respectively.
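One way to picture the two tables is sketched below. The `Entry` and `VirtualCopy` names are hypothetical, and the copy-on-write behaviour in `write_primary` is an assumption about how the bits might be used, since the abstract only defines what the bits mean.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    valid: bool = False      # first bit: region holds valid data
    modified: bool = False   # second bit: region changed since table creation


class VirtualCopy:
    def __init__(self, primary):
        self.primary = primary
        self.copy = [None] * len(primary)
        # The first memory starts fully valid; the second starts empty.
        self.primary_table = [Entry(valid=True) for _ in primary]
        self.copy_table = [Entry() for _ in primary]

    def read_copy(self, i):
        # An invalid region of the copy is served from the first memory.
        return self.copy[i] if self.copy_table[i].valid else self.primary[i]

    def write_primary(self, i, data):
        # Assumed copy-on-write: preserve the old data before overwriting.
        if not self.copy_table[i].valid:
            self.copy[i] = self.primary[i]
            self.copy_table[i].valid = True
        self.primary[i] = data
        self.primary_table[i].modified = True

    def write_copy(self, i, data):
        self.copy[i] = data
        self.copy_table[i].valid = True
        self.copy_table[i].modified = True
```

Creating the copy moves no data, which is why it is "virtual": regions are copied only when a later write would otherwise destroy the preserved contents.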
Abstract:
Disclosed is a method and apparatus for restoring a corrupted data volume. In one embodiment, the method includes creating a backup copy of the data volume before the data volume is corrupted. Data transactions that modify the contents of the data volume are stored in a transaction log. After the data corruption is detected, a virtual copy of the backup copy is created. Thereafter, selected data transactions stored in the transaction log are applied to the virtual copy. After the data transactions have been applied to the virtual copy, data of the corrupted data volume is overwritten with data of the backup copy and data of the virtual copy.
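The restore sequence can be sketched as follows. The abstract does not say how transactions are selected, so this sketch assumes the first `upto` log entries are known to be good; `restore`, `upto`, and the `(block, data)` log format are all hypothetical.

```python
def restore(corrupted, backup, log, upto):
    """Rebuild `corrupted` in place from `backup` plus selected log entries.

    `log` is a list of (block, data) transactions in the order they were
    applied; only the first `upto` entries are replayed (an assumed
    selection rule).
    """
    # The virtual copy is modelled as an overlay: only blocks touched by
    # replayed transactions are stored, everything else reads from backup.
    virtual = {}
    for block, data in log[:upto]:
        virtual[block] = data
    for i in range(len(corrupted)):
        corrupted[i] = virtual.get(i, backup[i])
    return corrupted


backup = [1, 2, 3]
log = [(0, 10), (2, 30), (1, 99)]   # the last transaction caused the corruption
print(restore([10, 99, 30], backup, log, upto=2))
```

The backup itself is never modified, so it remains usable if a different set of transactions has to be replayed later.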
Abstract:
Applications executing on various nodes in a distributed storage environment may write data to primary storage and may also replicate the data to secondary storage via a replication target. An interval coordinator may coordinate the periodic saving of checkpoints or snapshots of the replicated data. The interval coordinator may determine the length of consistency intervals between the saving of each of the checkpoints. Writes to the replication target from each of the nodes may be associated with the current consistency interval and, in some embodiments, with a unique per-node sequence number. When transitioning between consistency intervals, each node may be configured to temporarily suspend completion of the writes and to send the replication target a consistency interval marker indicating that the node has completed all writes for the current consistency interval.
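A minimal sketch of the target side of this protocol is shown below. The `ReplicationTarget` class, its method names, and the decision to buffer writes until every node has sent its marker are assumptions made for illustration; conflicting writes to the same block from different nodes are not handled here.

```python
class ReplicationTarget:
    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.data = {}          # replicated block contents
        self.pending = {}       # interval -> buffered writes
        self.markers = {}       # interval -> nodes that sent a marker
        self.checkpoints = {}   # interval -> snapshot of self.data

    def write(self, node, interval, seq, block, value):
        # Each write is tagged with its interval and a per-node sequence number.
        self.pending.setdefault(interval, []).append((node, seq, block, value))

    def marker(self, node, interval):
        """Record a consistency interval marker; checkpoint once all nodes are done."""
        done = self.markers.setdefault(interval, set())
        done.add(node)
        if done != self.nodes:
            return False
        # Apply each node's writes in its own sequence order.
        writes = sorted(self.pending.pop(interval, []), key=lambda w: (w[0], w[1]))
        for _node, _seq, block, value in writes:
            self.data[block] = value
        self.checkpoints[interval] = dict(self.data)
        return True
```

Writes tagged with a later interval stay buffered, so a checkpoint never contains data from an interval that some node has not yet closed.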
Abstract:
Disclosed are a method, system, computer system, and computer program product for synchronizing data with a snapshot of the data taken at a given point in time. Persistent data change maps are used to track changes made to data after a snapshot of the data is taken. Changes to the data are tracked using a persistent accumulator map, and changes to the data with respect to a second set of data are tracked using a persistent volume map. The persistent accumulator map is updated with each update of the data. Persistent volume maps are updated when a subsequent snapshot of the data is taken. Only changes to the data made after the snapshot was taken are applied to synchronize the snapshot with the data, so that not all of the data has to be copied. Snapshots can be located in a physically separate location from the data itself.
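The interplay of the two kinds of maps can be sketched as below. In-memory sets stand in for the persistent maps, and the rule that the accumulator is folded into every existing volume map when a new snapshot is taken is an assumed reading of the abstract; `TrackedVolume` and its methods are hypothetical names.

```python
class TrackedVolume:
    def __init__(self, data):
        self.data = list(data)
        self.accumulator = set()   # regions changed since the latest snapshot
        self.snapshots = {}        # snapshot name -> copied contents
        self.volume_maps = {}      # snapshot name -> regions changed before
                                   # the latest snapshot was taken

    def write(self, region, value):
        self.data[region] = value
        self.accumulator.add(region)   # updated on every write

    def take_snapshot(self, name):
        # Fold the accumulator into the map of each earlier snapshot.
        for changed in self.volume_maps.values():
            changed |= self.accumulator
        self.accumulator = set()
        self.snapshots[name] = list(self.data)
        self.volume_maps[name] = set()

    def resync(self, name):
        """Copy only changed regions into the snapshot; return them sorted."""
        changed = self.volume_maps[name] | self.accumulator
        for region in changed:
            self.snapshots[name][region] = self.data[region]
        # The accumulator is kept, so a later resync may recopy a region.
        self.volume_maps[name] = set()
        return sorted(changed)
```

With this layout the write path touches a single map no matter how many snapshots exist, and the per-snapshot maps are updated only at snapshot time.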
Abstract:
One goal of consistency interval replication is to achieve a consistent copy of data generated by independent streams of writes from nodes in a clustered/distributed environment. Two writes to the same block from different nodes may arrive at a replication target in a different order from the order in which they were written to primary storage. A consistency interval coordinator may analyze a list of blocks modified during a consistency interval to determine conflict blocks written to by two different nodes during the same consistency interval. Conflict resolution may involve a node reading data for a conflict block from primary storage and forwarding it to the replication target or a node completing a suspended in-progress write for the conflict block. Once the conflicts have been resolved, the replication target may checkpoint the data modified during the interval and nodes may resume writes to the conflict blocks for the new interval.
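Conflict detection and one of the two resolution paths can be sketched as follows. The function names and the per-node block lists are hypothetical, and only the "read from primary storage and forward" path is modelled; completing a suspended in-progress write is not shown.

```python
def find_conflicts(writes_by_node):
    """Return {block: set_of_nodes} for blocks written by two or more nodes.

    `writes_by_node` maps a node name to the list of blocks it modified
    during one consistency interval (an assumed input format).
    """
    owners = {}
    for node, blocks in writes_by_node.items():
        for block in set(blocks):
            owners.setdefault(block, set()).add(node)
    return {block: nodes for block, nodes in owners.items() if len(nodes) > 1}


def resolve_from_primary(conflicts, primary, target):
    """Overwrite each conflict block on the target with the primary's data."""
    for block in conflicts:
        target[block] = primary[block]


interval_writes = {"n1": [1, 2, 2], "n2": [2, 3]}
print(find_conflicts(interval_writes))
```

A node writing the same block twice is not a conflict; only blocks touched by different nodes within the same interval need resolution before the checkpoint.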
Abstract:
Various embodiments of systems and methods are disclosed for tracking valid regions of a working volume. State information identifies which regions of the working volume are currently valid. When the volume is created, the state information can be initialized to a value that identifies all regions of the volume as being invalid. The invalid regions do not need to be synchronized, since there will not be any need to reconstruct the data within those regions to a particular value. Accordingly, volume initialization, which synchronizes redundant data (e.g., RAID parity or a mirrored copy) with application data in the invalid regions, can be delayed. As the volume is accessed by an application, the redundant data associated with the regions being accessed is synchronized, and the state information is updated to indicate that those regions are valid.
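Delayed initialization can be sketched for the mirrored case as below. `LazyMirroredVolume` is a hypothetical name, `None` stands in for whatever uninitialized contents the storage happens to hold, and synchronizing on reads as well as writes is an assumption about what "accessed" covers.

```python
class LazyMirroredVolume:
    def __init__(self, regions):
        self.data = [None] * regions
        self.mirror = [None] * regions
        # Every region starts out invalid, so creation needs no synchronization.
        self.valid = [False] * regions

    def write(self, i, value):
        self.data[i] = value
        self.mirror[i] = value      # redundant data is synchronized on access
        self.valid[i] = True

    def read(self, i):
        if not self.valid[i]:
            # First access: synchronize the mirror, then mark the region valid.
            self.mirror[i] = self.data[i]
            self.valid[i] = True
        return self.data[i]


vol = LazyMirroredVolume(3)
vol.write(1, "x")
print(vol.valid)
```

Regions the application never touches are never synchronized, which is where the saving over an up-front initialization pass comes from.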
Abstract:
Disclosed is an apparatus and method for transforming unrelated data volumes into related data volumes. The present invention is employed after creation of first and second unrelated data volumes. In one embodiment, the second data volume is refreshed to the data contents of the first data volume so that the second data volume becomes a point-in-time (PIT) copy of the first data volume. Refreshing the second data volume includes overwriting all data of the second data volume with data copied from the first data volume. However, data of the first data volume can be modified before all data of the second data volume has been overwritten with data copied from the first data volume.
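One way the first volume could stay writable during the refresh is sketched below. The abstract does not state the mechanism, so the copy-before-write step is an assumption, and `RefreshToSource` and its methods are hypothetical names.

```python
class RefreshToSource:
    def __init__(self, first, second):
        self.first = first
        self.second = second
        # Regions of the second volume not yet overwritten from the first.
        self.pending = set(range(len(first)))

    def write_first(self, i, value):
        if i in self.pending:
            # Preserve the point-in-time contents before they are lost.
            self.second[i] = self.first[i]
            self.pending.discard(i)
        self.first[i] = value

    def run_to_completion(self):
        # Background copy of every region that has not been refreshed yet.
        while self.pending:
            i = self.pending.pop()
            self.second[i] = self.first[i]


first = ["a", "b", "c"]
second = ["x", "y", "z"]
refresh = RefreshToSource(first, second)
refresh.write_first(1, "B")      # modify the first volume mid-refresh
refresh.run_to_completion()
print(first, second)
```

After the refresh completes, the second volume holds the contents the first volume had when the refresh began, even though the first volume has since changed.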
Abstract:
Disclosed is an apparatus or method performed by a computer system for creating a hierarchy of data volumes. Each data volume in the hierarchy is a point-in-time (PIT) copy of another data volume in the hierarchy or a PIT copy of a data volume V. In one embodiment of the apparatus or method, the contents of a first data volume in the hierarchy can be refreshed to the contents of a second data volume in the hierarchy such that the first data volume becomes a PIT copy of the second data volume. Before the first data volume is fully refreshed to the contents of the second data volume, data of the first data volume can be read or modified.
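Reading and modifying a volume while its refresh is still in progress can be sketched as follows. The sketch assumes the source volume does not change during the refresh and that writes replace whole regions; `RefreshingVolume` and its methods are hypothetical names, not the disclosed mechanism.

```python
class RefreshingVolume:
    def __init__(self, own, source):
        self.own = own
        self.source = tuple(source)   # frozen here to keep the sketch simple
        # Regions that still hold pre-refresh contents.
        self.pending = set(range(len(own)))

    def read(self, i):
        # Not-yet-refreshed regions are redirected to the source volume.
        return self.source[i] if i in self.pending else self.own[i]

    def write(self, i, value):
        # A whole-region write supersedes the refresh for that region.
        self.own[i] = value
        self.pending.discard(i)

    def finish(self):
        while self.pending:
            i = self.pending.pop()
            self.own[i] = self.source[i]


vol = RefreshingVolume(["x", "y", "z"], ["a", "b", "c"])
print(vol.read(0))   # served from the source before the region is refreshed
vol.write(1, "M")
vol.finish()
print(vol.own)
```

Readers see the refreshed contents immediately, and a region modified during the refresh keeps its new data instead of being overwritten by the background copy.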