Abstract:
Data chunks encrypted using an encryption key are backed up to a server. Each chunk is associated with a plain signature and an encryption signature. The plain signature is based on an unencrypted version of the chunk. The encryption signature is based on an encrypted version of the chunk. A new data chunk is identified and a new plain signature for the new chunk is calculated. A request is made for a current key, the new chunk is encrypted using the current key, and a new encryption signature is calculated from the encrypted chunk. The new encryption and plain signatures are sent to the server for comparison against the existing encryption and plain signatures. If the new encryption signature does not match an encryption signature of an existing chunk and the new plain signature matches a plain signature of the existing chunk, the new chunk is transmitted to the server to replace the existing chunk.
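A minimal sketch, in Python, of the dual-signature comparison described above. The function and variable names (xor_cipher, server_decide, server_index) and the toy cipher are illustrative assumptions, not the implementation from the source; the point is only to show how a plain-signature match combined with an encryption-signature mismatch triggers replacement of the stored chunk after a key rotation.

```python
import hashlib
import os

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy stand-in for a real cipher; used only to keep the sketch self-contained."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def signatures(chunk: bytes, key: bytes) -> tuple[str, str]:
    plain_sig = hashlib.sha256(chunk).hexdigest()                 # signature of the unencrypted chunk
    enc_sig = hashlib.sha256(xor_cipher(chunk, key)).hexdigest()  # signature of the encrypted chunk
    return plain_sig, enc_sig

def server_decide(server_index: dict, plain_sig: str, enc_sig: str) -> str:
    """Server-side comparison: decide whether the client must send the chunk."""
    existing_enc = server_index.get(plain_sig)
    if existing_enc is None:
        return "send"          # brand-new chunk, no existing plain signature
    if existing_enc == enc_sig:
        return "skip"          # identical ciphertext is already stored
    return "send-replace"      # same plaintext, different (rotated) key: replace the stored chunk

# Example: key rotation forces a re-encrypt and replacement of the stored chunk.
chunk = b"example chunk payload"
old_key, new_key = os.urandom(16), os.urandom(16)
old_plain, old_enc = signatures(chunk, old_key)
index = {old_plain: old_enc}
new_plain, new_enc = signatures(chunk, new_key)
print(server_decide(index, new_plain, new_enc))   # -> "send-replace"
```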
Abstract:
Techniques for determining an optimal time window for data movement from a source storage system to a target storage system are described herein. According to one embodiment, statistics data representing historic performance statistics of a source storage system over a predetermined period of time is received, where the historic performance statistics include resource consumption of a plurality of resources including at least one of a processor, memory, input-output (IO) transactions, and network bandwidth. An analysis module executed by a processor performs an analysis on the historic performance statistics to determine an optimal time window within the predetermined time period for data movement from the source storage system to a target storage system. A scheduler executed by the processor schedules the data movement from the source storage system to the target storage system according to the optimal time window.
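A hedged sketch of the analysis step: given per-slot utilization samples for CPU, memory, IO, and network, pick the contiguous window with the lowest combined load. The equal weighting of the four metrics and the fixed window length are assumptions for illustration, not the scheduling policy described in the source.

```python
import random

def optimal_window(samples, window_len):
    """samples: list of dicts with 'cpu', 'mem', 'io', 'net' utilization (0..1) per time slot.
    Returns (start_index, end_index) of the contiguous window with the lowest combined load."""
    load = [s["cpu"] + s["mem"] + s["io"] + s["net"] for s in samples]
    best_start, best_cost = 0, float("inf")
    for start in range(len(load) - window_len + 1):
        cost = sum(load[start:start + window_len])
        if cost < best_cost:
            best_start, best_cost = start, cost
    return best_start, best_start + window_len

# Example: 24 hourly samples of historic statistics; move data in the quietest 4-hour window.
random.seed(0)
stats = [{"cpu": random.random(), "mem": random.random(),
          "io": random.random(), "net": random.random()} for _ in range(24)]
print(optimal_window(stats, 4))
```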
Abstract:
Systems and methods are described for backing up files and directories using a common backup format. The files and directories may be represented by objects within a data stream constructed using the common backup format. The data streams may be traversed and updated using a size tree such that modifications are made to individual objects within the data streams without complete traversal. This process results in efficient management of storage systems as read and write operations are not dependent on exhaustive traversal of data streams.
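A minimal sketch of one way a size tree can support this: a Fenwick (binary indexed) tree over per-object sizes lets the byte offset of any object be computed, and later offsets be shifted after an in-place resize, in O(log n) instead of walking the whole stream. The class and method names are illustrative assumptions, not the structure defined in the source.

```python
class SizeTree:
    """Fenwick tree over object sizes within a backup data stream."""

    def __init__(self, sizes):
        self.n = len(sizes)
        self.tree = [0] * (self.n + 1)
        for i, s in enumerate(sizes, start=1):
            self._add(i, s)

    def _add(self, i, delta):
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i

    def offset_of(self, index):
        """Byte offset of object `index` (0-based): sum of sizes of all preceding objects."""
        i, total = index, 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i
        return total

    def resize(self, index, new_size, old_size):
        """Record that object `index` changed size; downstream offsets shift automatically."""
        self._add(index + 1, new_size - old_size)

# Example: three objects of 100, 250, and 80 bytes.
t = SizeTree([100, 250, 80])
print(t.offset_of(2))     # 350: offset of the third object
t.resize(1, 300, 250)     # the second object grew by 50 bytes
print(t.offset_of(2))     # 400: updated without traversing the whole stream
```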
Abstract:
Techniques for deduplicating a data stream based on boundary markers embedded therein are described. According to one embodiment, a data stream having a sequence of a plurality of data objects is received from a client, where the data stream represents a file or a directory of one or more files of a file system associated with the client. In response, the data stream is deduplicated into a plurality of deduplicated chunks in view of boundaries of the data objects.
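An illustrative sketch of boundary-aware deduplication: the stream is split at embedded boundary markers rather than at fixed or content-defined offsets, so every deduplicated chunk aligns with a whole data object. The marker byte sequence and function names below are assumptions, not the wire format described in the source.

```python
import hashlib

MARKER = b"\x00OBJ\x00"   # hypothetical boundary marker embedded in the stream

def dedup_on_boundaries(stream: bytes, store: dict) -> list:
    """Split on markers, fingerprint each object-aligned chunk, and store only unseen chunks."""
    recipe = []
    for chunk in stream.split(MARKER):
        if not chunk:
            continue
        fp = hashlib.sha256(chunk).hexdigest()
        if fp not in store:           # only new chunks consume additional storage
            store[fp] = chunk
        recipe.append(fp)             # the recipe reconstructs the stream from fingerprints
    return recipe

store = {}
stream = MARKER.join([b"object-A", b"object-B", b"object-A"])
print(dedup_on_boundaries(stream, store))   # the third entry reuses the first fingerprint
print(len(store))                           # 2 unique chunks stored
```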
Abstract:
According to one embodiment, a first storage system receives a first data stream from a second storage system over a network. The first data stream includes data objects and differential object information identifying at least one data object missing from the first data stream. A difference between the first data stream and a second data stream that has been previously received is determined based on the differential object information, including identifying a data object that has been added, deleted, or modified in view of the second data stream. The first data stream is reconstructed based on the second data stream and the difference between the first data stream and the second data stream, generating a third data stream. The third data stream is stored in a persistent storage device of the first storage system, the third data stream representing a complete first data stream without a missing data object.
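A minimal sketch of the reconstruction step: starting from the previously received (second) stream, objects the differential information marks as deleted are dropped, and the objects actually carried in the new (first) stream are applied, yielding the complete third stream. The diff representation used here (dicts keyed by object name plus a set of deleted names) is an assumption for illustration only.

```python
def reconstruct(base_objects: dict, received_objects: dict, deleted: set) -> dict:
    """base_objects: objects from the earlier stream, name -> bytes.
    received_objects: objects present in the newly received stream.
    deleted: names the differential information marks as removed.
    Returns the complete stream with no missing objects."""
    result = dict(base_objects)          # start from the previously received stream
    for name in deleted:
        result.pop(name, None)           # drop objects the diff says were deleted
    result.update(received_objects)      # apply added and modified objects
    return result

base = {"a.txt": b"old A", "b.txt": b"B", "c.txt": b"C"}
received = {"a.txt": b"new A", "d.txt": b"D"}   # b.txt is unchanged, so it was omitted from the wire
complete = reconstruct(base, received, deleted={"c.txt"})
print(sorted(complete))    # ['a.txt', 'b.txt', 'd.txt']
```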
Abstract:
Techniques for deduplicating a data stream with checksum data embedded therein are described. According to one embodiment, a first data stream having a plurality of data regions and a plurality of checksums for verifying integrity of the data regions embedded therein is received from a client, where the first data stream represents a file or a directory of one or more files of a file system associated with the client. In response, the first data stream, with the checksums removed, is deduplicated into a plurality of deduplicated chunks.
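A hedged sketch of stripping embedded checksums before deduplication: here each data region is assumed to be followed by a 4-byte CRC32 of that region, so the regions are verified and the checksum bytes removed before the clean stream is handed to the deduplication stage. The fixed region size and the CRC32 choice are illustrative assumptions about the stream layout, not the format defined in the source.

```python
import hashlib
import zlib

REGION = 4096   # assumed bytes per data region

def strip_and_verify(stream: bytes) -> bytes:
    """Verify each region against its trailing checksum, then return the stream
    with the checksums removed so deduplication sees only the raw data."""
    out = bytearray()
    i = 0
    while i < len(stream):
        block = stream[i:i + REGION + 4]        # region data plus its trailing 4-byte CRC
        region, crc_bytes = block[:-4], block[-4:]
        if zlib.crc32(region) != int.from_bytes(crc_bytes, "big"):
            raise ValueError(f"corrupt region at offset {i}")
        out += region
        i += len(block)
    return bytes(out)

# Build a tiny stream with embedded checksums, then strip them before chunking.
payload = b"x" * 6000
stream = b"".join(
    payload[i:i + REGION] + zlib.crc32(payload[i:i + REGION]).to_bytes(4, "big")
    for i in range(0, len(payload), REGION)
)
clean = strip_and_verify(stream)
print(clean == payload, hashlib.sha256(clean).hexdigest()[:12])
```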