Abstract:
A method, a system, and a computer program product for deduplicating data. A data stream having a plurality of data zones is received. One or more data storage locations in a plurality of data storage locations are identified for deduplicating one or more zones in the plurality of zones. Each data storage location stores its respective deduplicated data zones. A data storage location for deduplicating a first data zone is selected. The first data zone is deduplicated using the selected data storage location.
Abstract:
A delta compression method, system, and computer program product. Portions of source and target data files are hashed using a hashing function. A target data file is compared against the source data file to determine at least one delta difference between the files. A source data file hashing table is generated. The table includes hashed portions of the source and target data files stored in corresponding source file offset locations and corresponding target file offset locations, respectively. Portions of the source and target files are compared using the corresponding source and target file offset locations. At least one common sequence of characters in the portions of the source and target files is determined based on the comparison. A patch file is generated based on the determined sequence of characters.
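The hashing-table approach in this abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: a fixed block size (`BLOCK`), a Python dictionary standing in for the hashing table (the dictionary hashes each block internally), and a hypothetical patch format made of `("copy", offset, length)` and `("insert", data)` operations.

```python
BLOCK = 4  # hypothetical block size; the abstract does not specify one


def build_source_table(source: bytes, block: int = BLOCK) -> dict:
    """Map each block-sized portion of the source to its first offset."""
    table = {}
    for off in range(0, len(source) - block + 1):
        table.setdefault(source[off:off + block], off)
    return table


def make_patch(source: bytes, target: bytes, block: int = BLOCK) -> list:
    """Return ("copy", src_off, length) and ("insert", data) operations."""
    table = build_source_table(source, block)
    ops, literal, t = [], bytearray(), 0
    while t < len(target):
        # Look up the target portion at offset t in the source table.
        src_off = table.get(target[t:t + block]) if t + block <= len(target) else None
        if src_off is None:
            literal.append(target[t])  # no match: emit this byte literally
            t += 1
            continue
        # Extend the match forward to find the longest common sequence.
        length = block
        while (src_off + length < len(source) and t + length < len(target)
               and source[src_off + length] == target[t + length]):
            length += 1
        if literal:
            ops.append(("insert", bytes(literal)))
            literal = bytearray()
        ops.append(("copy", src_off, length))
        t += length
    if literal:
        ops.append(("insert", bytes(literal)))
    return ops


def apply_patch(source: bytes, ops: list) -> bytes:
    """Rebuild the target from the source and the patch operations."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, off, length = op
            out += source[off:off + length]
        else:
            out += op[1]
    return bytes(out)
```

Applying `apply_patch(source, make_patch(source, target))` should reproduce `target`; the patch is small whenever the two files share long common sequences.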
Abstract:
A method, a system, and a computer program product for performing next-level multi-level deduplication. A first zone stamp for a first data zone is generated and compared to a second zone stamp representing a second data zone, where the zones are first-level data zones. The first and second data zones are deduplicated when the first zone stamp matches the second zone stamp. A second-level first zone stamp is selected when there is no match between the first and second zone stamps. The second-level first zone stamp, representing a second-level first data zone in the first data zone, is compared to the second zone stamp and/or a second-level second zone stamp representing a second-level second data zone. The second-level first zone and one of the second data zone and the second-level second zone are deduplicated when the second-level first zone stamp matches one of the second zone stamp and the second-level second zone stamp.
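The two-level comparison described in this abstract might be sketched as follows. The stamp function here is an assumption: a SHA-1 digest is used as a stand-in, so a "match" means identical bytes, whereas the abstract does not say how zone stamps are computed. The helper names `zone_stamp`, `split`, and `multilevel_matches` are hypothetical.

```python
import hashlib


def zone_stamp(zone: bytes) -> str:
    """Hypothetical stamp: a SHA-1 digest, so a match means identical bytes."""
    return hashlib.sha1(zone).hexdigest()


def split(zone: bytes, parts: int = 2) -> list:
    """Split a zone into roughly equal second-level zones."""
    size = max(1, -(-len(zone) // parts))  # ceiling division
    return [zone[i:i + size] for i in range(0, len(zone), size)]


def multilevel_matches(first: bytes, second: bytes) -> list:
    """Return (level, first_index, second_index) for each matching pair.

    A second_index of -1 denotes the whole (first-level) second zone.
    """
    # Level 1: compare the stamps of the two first-level zones.
    if zone_stamp(first) == zone_stamp(second):
        return [(1, 0, 0)]
    # Level 2: compare each second-level zone of `first` against the
    # second zone's stamp and the stamps of its second-level zones.
    candidates = {zone_stamp(second): -1}
    for j, sub in enumerate(split(second)):
        candidates.setdefault(zone_stamp(sub), j)
    matches = []
    for i, sub in enumerate(split(first)):
        j = candidates.get(zone_stamp(sub))
        if j is not None:
            matches.append((2, i, j))
    return matches
```

Each returned tuple identifies a pair of zones that could then be deduplicated against each other; an empty list means no match was found at either level.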
Abstract:
A system, a method, and a computer program product for adaptively managing bandwidth of a deduplication system are disclosed. A bandwidth policy for replication of data from a first deduplication location to a second deduplication location is determined. The bandwidth policy allocates a predetermined bandwidth for the replication of data. The deduplication locations are communicatively coupled via a network. Using the determined bandwidth policy, data from the first deduplication location is replicated to the second deduplication location based on the allocated bandwidth.
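One way to picture replication under an allocated bandwidth is a simple pacing plan. This is only a sketch under stated assumptions: the policy is reduced to a single fixed rate in bytes per second, and `BandwidthPolicy` and `schedule_replication` are hypothetical names, not anything defined in the abstract.

```python
from dataclasses import dataclass


@dataclass
class BandwidthPolicy:
    """Hypothetical policy: a fixed allocation in bytes per second."""
    bytes_per_second: int


def schedule_replication(chunk_sizes, policy: BandwidthPolicy) -> list:
    """Return (start_time_seconds, size) pairs that respect the allocation.

    Each chunk starts only after the previous one has had enough time to
    be sent at the allocated rate, so the average rate never exceeds it.
    """
    t = 0.0
    plan = []
    for size in chunk_sizes:
        plan.append((t, size))
        t += size / policy.bytes_per_second
    return plan
```

For example, with an allocation of 1000 bytes per second, chunks of 500, 1000, and 250 bytes would start at 0.0 s, 0.5 s, and 1.5 s.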
Abstract:
A system, a method and a computer program product for storing data, which include receiving a data stream having a plurality of transactions that include at least one portion of data, determining whether at least one portion of data within at least one transaction is substantially similar to at least another portion of data within at least one transaction, clustering together at least one portion of data and at least another portion of data within at least one transaction, selecting one of at least one portion of data and at least another portion of data as a representative of at least one portion of data and at least another portion of data in the received data stream, and storing each representative of a portion of data from each transaction in the plurality of transactions, wherein a plurality of representatives is configured to form a chain representing the received data stream.
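The clustering step in this abstract could look roughly like the sketch below. It assumes that "substantially similar" can be approximated by a `difflib.SequenceMatcher` ratio above a hypothetical threshold of 0.8, and that the first portion seen in each cluster serves as its representative; the abstract specifies neither choice. The "chain" is modelled as the sequence of representative indices in stream order.

```python
from difflib import SequenceMatcher


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Assumed similarity test: SequenceMatcher ratio at or above threshold."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def cluster_stream(transactions, threshold: float = 0.8):
    """Cluster similar portions and return (representatives, chain).

    `transactions` is a list of transactions, each a list of portions.
    `chain[k]` is the index of the representative for the k-th portion.
    """
    clusters = []  # clusters[i][0] is the representative of cluster i
    chain = []
    for transaction in transactions:
        for portion in transaction:
            for idx, cluster in enumerate(clusters):
                if similar(cluster[0], portion, threshold):
                    cluster.append(portion)
                    break
            else:
                # No similar cluster found: start a new one.
                clusters.append([portion])
                idx = len(clusters) - 1
            chain.append(idx)
    return [c[0] for c in clusters], chain
```

Only the representatives need to be stored in full; the chain records which representative stands in for each portion of the received stream.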
Abstract:
Embodiments of this invention provide primary magnetic disk data storage capacity to clients while at the same time making sure that client data is replicated locally and at an offsite location to protect from all forms of data loss.
Abstract:
A method, a system, and a computer program product for performing multi-level deduplication of data are disclosed. A zone stamp is generated for each zone in a plurality of zones contained in at least one data stream. The zone stamp is compared to another zone stamp. The zone stamp and the other zone stamp represent zones in the plurality of zones. The comparison is performed for zones at corresponding zone levels based on a determination that a zone stamp of a zone of a preceding zone level is not similar to another zone stamp of another zone of the preceding zone level. The zone at the preceding zone level includes at least one zone of a next zone level having a size smaller than or equal to a size of the zone of the preceding zone level. The zone and the other zone are deduplicated based on a determination that the zone stamp is similar to the other zone stamp.
Abstract:
A system, a method, and a computer program product for performing deduplication of data using a scalable deduplication grid are disclosed. A listing of a plurality of zone stamps is generated, where each zone stamp represents a zone in a plurality of zones in a data stream. The listing contains a logical arrangement of the plurality of zone stamps obtained from each storage location and is accessible by a plurality of servers. A first zone stamp in the listing is compared to a second zone stamp in the listing. The first and second zones are delta-compressed based on a determination that the first zone stamp is substantially similar to the second zone stamp. A server is selected to perform the comparison and delta-compression.
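The grid workflow in this abstract can be sketched under several assumptions: zone stamps are short strings, "substantially similar" is approximated by a Hamming distance within a hypothetical `max_distance`, and the server is chosen by lowest current load. None of these details, nor the function names, come from the abstract.

```python
def hamming(a: str, b: str) -> int:
    """Count differing positions, plus any difference in length."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def build_listing(locations: dict) -> list:
    """Merge per-location stamps into one sorted listing.

    `locations` maps a location name to {zone_id: stamp}; the result is a
    sorted list of (stamp, location, zone_id) tuples.
    """
    return sorted(
        (stamp, loc, zone_id)
        for loc, zones in locations.items()
        for zone_id, stamp in zones.items()
    )


def find_similar_pairs(listing: list, max_distance: int = 1) -> list:
    """Return pairs of (location, zone_id) whose stamps are close."""
    pairs = []
    for i, (s1, loc1, z1) in enumerate(listing):
        for s2, loc2, z2 in listing[i + 1:]:
            if hamming(s1, s2) <= max_distance:
                pairs.append(((loc1, z1), (loc2, z2)))
    return pairs


def pick_server(load: dict) -> str:
    """Assumed selection rule: choose the least-loaded server."""
    return min(load, key=load.get)
```

Each pair returned by `find_similar_pairs` is a candidate for delta compression, and `pick_server` names the server that would carry out that work.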