Abstract:
Embodiments of the present disclosure provide a solution for data backup and recovery in a storage system. When a source device in the storage system backs up, to a backup-end device, a data block written after a snapshot Sn, the source device performs a logical operation, such as an exclusive-NOR or exclusive-OR operation, on the written data block and the original version of that data block recorded in the snapshot Sn, and then compresses the result of the logical operation. This improves the compression ratio of the data block, thereby reducing the amount of data sent to the backup-end device and saving transmission bandwidth. The solution may also be applied to data recovery in a storage system.
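The idea of combining a written block with its snapshot version before compression can be illustrated with a minimal Python sketch. It assumes equal-length blocks, uses exclusive-OR as the logical operation, and uses zlib as a stand-in compressor; the function names encode_delta and decode_delta and the sample data are hypothetical and do not come from the disclosure.

```python
import hashlib
import zlib


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise exclusive-OR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def encode_delta(new_block: bytes, snapshot_block: bytes) -> bytes:
    """Source side: XOR against the snapshot copy, then compress.

    Unchanged bytes become zero, so the XOR result is highly compressible.
    zlib is an assumption; the abstract does not name a compressor.
    """
    return zlib.compress(xor_blocks(new_block, snapshot_block))


def decode_delta(payload: bytes, snapshot_block: bytes) -> bytes:
    """Backup side: decompress, then XOR with the same snapshot copy."""
    return xor_blocks(zlib.decompress(payload), snapshot_block)


# Hypothetical sample: a 4 KiB block of pseudo-random bytes in which
# 8 bytes were overwritten after the snapshot was taken.
snapshot_block = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(128))
new_block = snapshot_block[:100] + b"modified" + snapshot_block[108:]
payload = encode_delta(new_block, snapshot_block)
```

In this sample the written block on its own is close to incompressible, while the XOR result is almost entirely zero bytes, which is why the payload ends up much smaller than a directly compressed block.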
Abstract:
A data storage method includes: separately dividing a data block and a reference data block into N equal-sized sub-data blocks; comparing a sub-data block and a reference sub-data block corresponding to a same location identifier; determining, in the N sub-data blocks, a sub-data block that can be deduplicated and a sub-data block that cannot be deduplicated; performing a deduplication operation on the sub-data block that can be deduplicated; selecting a representative sub-data block of the sub-data block that cannot be deduplicated; performing an exclusive OR operation on data of the sub-data block that cannot be deduplicated and data of the representative sub-data block; compressing a result of the exclusive OR operation using run-length encoding; and storing a compression result and location information of the sub-data block that cannot be deduplicated.
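The steps of this method can be sketched in Python as follows. The abstract does not say how the representative sub-data block is selected; this sketch assumes, purely for illustration, that the reference sub-data block at the same location serves as the representative. The names store_block and load_block, and the (value, run length) pair format for run-length encoding, are likewise hypothetical.

```python
def split(block: bytes, n: int) -> list:
    """Divide a block into n equal-sized sub-blocks."""
    if n <= 0 or len(block) % n:
        raise ValueError("block length must be a multiple of n")
    size = len(block) // n
    return [block[i * size:(i + 1) * size] for i in range(n)]


def rle_encode(data: bytes) -> list:
    """Run-length encode bytes as a list of (value, run_length) pairs."""
    runs = []
    for value in data:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            runs.append((value, 1))
    return runs


def rle_decode(runs: list) -> bytes:
    return b"".join(bytes([value]) * count for value, count in runs)


def store_block(block: bytes, reference: bytes, n: int):
    """Return (deduplicated locations, {location: RLE of XOR result})."""
    deduplicated = []
    stored = {}
    pairs = zip(split(block, n), split(reference, n))
    for location, (sub, ref_sub) in enumerate(pairs):
        if sub == ref_sub:
            deduplicated.append(location)  # identical: keep a reference only
        else:
            # Assumption: the co-located reference sub-block is the representative.
            xored = bytes(a ^ b for a, b in zip(sub, ref_sub))
            stored[location] = rle_encode(xored)
    return deduplicated, stored


def load_block(stored: dict, reference: bytes, n: int) -> bytes:
    """Rebuild the block from the reference block and the stored results."""
    out = []
    for location, ref_sub in enumerate(split(reference, n)):
        if location in stored:
            xored = rle_decode(stored[location])
            out.append(bytes(a ^ b for a, b in zip(xored, ref_sub)))
        else:
            out.append(ref_sub)
    return b"".join(out)
```

Run-length encoding suits the XOR result because bytes that match the representative become long runs of zeros.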
Abstract:
The present disclosure is directed to solutions for performing deduplication by a storage device. According to the principle of duplicate data locality, non-duplicate data blocks whose logical addresses are contiguous are stored at contiguous physical addresses in the order of their logical addresses, and the fingerprints of those data blocks are likewise stored at contiguous physical addresses in the same order. In addition, a mapping is established from the logical address of one of those data blocks to an aggregation address.
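A rough Python sketch of this layout is given below. It rests on several assumptions that the abstract does not confirm: two append-only lists stand in for the contiguous physical areas for data and fingerprints, SHA-256 is used as the fingerprint, and the "aggregation address" is modeled as the start position and length of a run in those lists, keyed by the logical address of the first block in the run. The class name LocalityStore is hypothetical.

```python
import hashlib


class LocalityStore:
    """Toy model: contiguous runs of new blocks land contiguously on 'disk'."""

    def __init__(self):
        self.data_log = []         # stands in for the contiguous data area
        self.fingerprint_log = []  # parallel fingerprint area, same order
        self.known = set()         # fingerprints already stored
        self.aggregation = {}      # first LBA of a run -> [start, length]

    def write(self, start_lba: int, blocks: list) -> None:
        """Write blocks at logical addresses start_lba, start_lba + 1, ..."""
        run_lba = None
        for offset, block in enumerate(blocks):
            fingerprint = hashlib.sha256(block).digest()
            if fingerprint in self.known:
                run_lba = None  # a duplicate ends the current run
                continue
            if run_lba is None:
                # Start a new run of non-duplicate, logically contiguous blocks.
                run_lba = start_lba + offset
                self.aggregation[run_lba] = [len(self.data_log), 0]
            self.known.add(fingerprint)
            self.data_log.append(block)
            self.fingerprint_log.append(fingerprint)
            self.aggregation[run_lba][1] += 1
```

Because data and fingerprints share the same order, one aggregation entry in this model is enough to locate both the blocks of a run and their fingerprints.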
Abstract:
Embodiments of the present invention provide a server-based method for searching for a data stream dividing point. In the embodiments of the present invention, a data stream dividing point is searched for by determining whether at least a part of the data in a window among M windows meets a preset condition. When that data does not meet the preset condition, a length of N*U is skipped to obtain the next potential dividing point, which improves the efficiency of searching for a data stream dividing point.
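The search can be illustrated with a simplified Python sketch. The abstract does not specify the window layout, the preset condition, or where the skip is measured from, so everything concrete here is an assumption: the M windows end at positions k, k - U, ..., k - (M - 1)*U, the condition is that the low bits of a CRC-32 over the window are zero, and a failed check advances the candidate position k by N*U. The function names and default parameter values are hypothetical.

```python
import zlib


def window_ok(data: bytes, end: int, size: int, mask: int) -> bool:
    """Assumed preset condition: masked CRC-32 of the window is zero."""
    return zlib.crc32(data[end - size:end]) & mask == 0


def find_dividing_point(data: bytes, start: int, *, m: int = 3,
                        window_size: int = 16, unit: int = 1,
                        n: int = 4, mask: int = 0x3):
    """Return the first position k whose M windows all meet the condition.

    Returns None when no such position exists in the data.
    """
    # Earliest candidate for which all M windows lie inside data[start:].
    k = start + window_size + (m - 1) * unit
    while k <= len(data):
        # all() stops at the first failing window, so later windows
        # are not evaluated for a rejected candidate.
        if all(window_ok(data, k - i * unit, window_size, mask)
               for i in range(m)):
            return k
        k += n * unit  # skip N*U to the next potential dividing point
    return None
```

Skipping N*U rather than a single unit means fewer candidate positions are examined, which is the efficiency gain the abstract refers to.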
Abstract:
Embodiments of the present invention provide a metadata querying method and apparatus. The method includes: sampling at least one piece of first metadata from to-be-searched-for metadata; using at least a part of feature values in each piece of the sampled first metadata as an index, and searching a sparse index table preset in a memory for a corresponding container identifier; selecting, according to the number of times that a same container identifier is found, a container corresponding to a container identifier that meets a set condition; loading metadata in the selected container into a metadata cache; and searching the metadata cache for a data block that is the same as the to-be-searched-for metadata. In the embodiments of the present invention, querying performance can be improved and occupied memory space can be reduced.
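The querying steps can be sketched in Python as follows. The sketch treats each piece of metadata as a fingerprint (a bytes value), uses a fixed-length prefix as the "part of feature values", samples every k-th fingerprint, and takes "hit count at or above a threshold" as the set condition. All of these choices, along with the function name and parameter names, are assumptions made for illustration.

```python
from collections import Counter


def query_duplicates(fingerprints: list, sparse_index: dict, containers: dict,
                     *, sample_every: int = 4, prefix_len: int = 4,
                     min_hits: int = 2) -> list:
    """Return one boolean per fingerprint: True if found in the cache.

    sparse_index maps a fingerprint prefix to a container identifier and is
    assumed to be held in memory; containers maps a container identifier to
    the fingerprints stored in that container.
    """
    # 1. Sample some of the metadata to be searched for.
    sampled = fingerprints[::sample_every]

    # 2. Look up each sample's prefix in the sparse index and count hits.
    hits = Counter()
    for fingerprint in sampled:
        container_id = sparse_index.get(fingerprint[:prefix_len])
        if container_id is not None:
            hits[container_id] += 1

    # 3. Select the containers that meet the set condition.
    selected = [cid for cid, count in hits.items() if count >= min_hits]

    # 4. Load the metadata of the selected containers into a cache.
    cache = set()
    for container_id in selected:
        cache.update(containers[container_id])

    # 5. Search the cache for every fingerprint, sampled or not.
    return [fingerprint in cache for fingerprint in fingerprints]
```

Only prefixes of sampled fingerprints need to be kept in the in-memory index, which is how such a scheme keeps memory use low while still finding most duplicates through the loaded containers.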