System performing data deduplication using a dense tree data structure
Abstract:
In one embodiment, as new blocks of data are written to storage devices of a storage system, fingerprints are generated for those new blocks and inserted as entries into a top level (L0) of a dense tree data structure. When L0 is filled, the contents from L0 may be merged with level 1 (L1). After the initial merge, new fingerprints are added to L0 until L0 fills up again, which triggers a new merge. Duplicate fingerprints in L0 and L1 are identified which, in turn, indicates duplicate data blocks. A post-processing deduplication operation is then performed to remove duplicate data blocks corresponding to the duplicate fingerprints. In a different embodiment, as new fingerprint entries are loaded into L0, those new fingerprints may be compared with existing fingerprints loaded into L0 and/or other levels to facilitate inline deduplication to identify duplicate fingerprints and subsequently perform the deduplication operation.
Information query
Patent Agency Ranking
0/0