Invention Grant
- Patent Title: Probabilistic algorithm to check whether a file is unique for deduplication
- Application No.: US16552908
- Application Date: 2019-08-27
- Publication No.: US11669495B2
- Publication Date: 2023-06-06
- Inventor: Wenguang Wang, Junlong Gao, Marcos K. Aguilera, Richard P. Spillane, Christos Karamanolis, Maxime Austruy
- Applicant: VMware, Inc.
- Applicant Address: US CA Palo Alto
- Assignee: VMware, Inc.
- Current Assignee: VMware, Inc.
- Current Assignee Address: US CA Palo Alto
- Agency: Dinsmore & Shohl LLP
- Main IPC: G06F7/00
- IPC: G06F7/00 ; G06F16/174 ; G06F16/14

Abstract:
Disclosed techniques include deduplication. Techniques include determining whether a file is unique and, depending on the result, deduplicating either only part of the file or the entire file. The techniques include processing the first chunk of a file to determine whether the hash of the chunk is already within a chunk hash table, and if not, then a percentage of the chunks of the file is similarly processed. If any of the hashes of the chunks are already in the chunk hash table, then at least some of the file has been previously deduplicated, and the file is not unique within the storage system. If none of the processed chunks has a hash that is already in the chunk hash table, then the file is considered to be unique within the chunk store and only a partial percentage of the file's chunks are deduplicated. Not all of a unique file's chunks are deduplicated.
Public/Granted literature
- US20210064579A1 PROBABILISTIC ALGORITHM TO CHECK WHETHER A FILE IS UNIQUE FOR DEDUPLICATION Public/Granted day: 2021-03-04