HASHING WITH DIFFERING HASH SIZE AND COMPRESSION SIZE
Abstract:
A system for hashing a data set. The system identifies a data set to be deduplicated based on a hash block size and compressed based on a compression block size, where the hash block size is smaller than the compression block size. It defines a set of data blocks within the data set based on the hash block size, generates a hash for each data block in the set, deduplicates a data block in the data set based on the respective hash for that data block, and compresses the data set based on the compression block size.
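The flow described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the block sizes (4 KiB and 16 KiB), the use of SHA-256 and zlib, the choice to compress only the deduplicated unique blocks, and the function names `split_blocks`, `dedupe_and_compress`, and `restore` are all assumptions made for illustration.

```python
import hashlib
import zlib

# Assumed sizes for illustration; the abstract only requires hash < compression.
HASH_BLOCK_SIZE = 4 * 1024
COMPRESSION_BLOCK_SIZE = 16 * 1024


def split_blocks(data: bytes, size: int) -> list:
    """Cut data into consecutive blocks of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def dedupe_and_compress(data: bytes,
                        hash_block_size: int = HASH_BLOCK_SIZE,
                        compression_block_size: int = COMPRESSION_BLOCK_SIZE):
    """Hash small blocks, keep one copy per hash, compress in larger blocks."""
    if hash_block_size >= compression_block_size:
        raise ValueError("hash block size must be smaller than compression block size")

    seen = set()          # digests of blocks already stored
    refs = []             # one digest per original block, in order
    unique = bytearray()  # first occurrence of each distinct block

    for block in split_blocks(data, hash_block_size):
        digest = hashlib.sha256(block).digest()
        refs.append(digest)
        if digest not in seen:  # duplicate blocks are stored only once
            seen.add(digest)
            unique += block

    # Assumption: compression runs over the deduplicated bytes, in larger blocks.
    chunks = [zlib.compress(c)
              for c in split_blocks(bytes(unique), compression_block_size)]
    return refs, chunks


def restore(refs, chunks, hash_block_size: int = HASH_BLOCK_SIZE) -> bytes:
    """Rebuild the original data from the digest list and compressed chunks."""
    unique = b"".join(zlib.decompress(c) for c in chunks)
    by_digest = {hashlib.sha256(b).digest(): b
                 for b in split_blocks(unique, hash_block_size)}
    return b"".join(by_digest[d] for d in refs)
```

In this sketch, a data set made of four identical 4 KiB blocks followed by two other identical blocks would store only two unique blocks, and those would be compressed together as a single 16 KiB-bounded chunk. Hashing at the smaller granularity lets more duplicates be found, while compressing at the larger granularity gives the compressor more context per block.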