Invention Grant
- Patent Title: System and method for estimating duplicate data
- Patent Title (中): 用于估计重复数据的系统和方法
- Application No.: US11846033
- Application Date: 2007-08-28
- Publication No.: US08793226B1
- Publication Date: 2014-07-29
- Inventor: Sandeep Yadav, Don Trimmer, Yong Cho
- Applicant: Sandeep Yadav, Don Trimmer, Yong Cho
- Applicant Address: US CA Sunnyvale
- Assignee: NetApp, Inc.
- Current Assignee: NetApp, Inc.
- Current Assignee Address: US CA Sunnyvale
- Agency: Cesari and McKenna, LLP
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
The present invention provides a system and method for estimating duplicate data in a storage system. A duplicate estimation application executing on a client of a storage system selects an element from an intended destination such as, e.g., a data store of the storage system. If the element is a file (or other data container), the application reads data from the file and computes a fingerprint of the read data. The computed fingerprint is then logged in a fingerprint database, which is illustratively stored on a storage device connected to the client executing the application. This process repeats until the entire file (or other data container) has been read and fingerprinted. Once all elements have been scanned, fingerprinted and recorded, the application identifies any unique entries within the fingerprint database. Utilizing this information, the application computes an estimated space savings that may be realized by employing a data de-duplication technique.
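
The scan-fingerprint-estimate loop described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the 4 KB block size, the use of SHA-256 as the fingerprint, the in-memory counter standing in for the fingerprint database, and the names `fingerprint_file` and `estimate_savings` are all assumptions made for illustration.

```python
import hashlib
import os
from collections import Counter

BLOCK_SIZE = 4096  # assumed block size; the abstract does not specify one


def fingerprint_file(path, db, block_size=BLOCK_SIZE):
    """Read a file block by block and log each block's fingerprint in db."""
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            # SHA-256 is an assumed fingerprint function (hypothetical choice).
            db[hashlib.sha256(block).digest()] += 1


def estimate_savings(root, block_size=BLOCK_SIZE):
    """Scan every file under root and estimate de-duplication savings."""
    db = Counter()  # in-memory stand-in for the fingerprint database
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            fingerprint_file(os.path.join(dirpath, name), db, block_size)

    total_blocks = sum(db.values())
    unique_blocks = len(db)  # number of distinct fingerprints logged
    return {
        "total_blocks": total_blocks,
        "unique_blocks": unique_blocks,
        # Approximate: a short final block is counted as a full block.
        "estimated_savings_bytes": (total_blocks - unique_blocks) * block_size,
    }
```

Calling `estimate_savings("/path/to/data")` returns the block counts and a rough byte estimate. Because the sketch only reads data and never rewrites it, the estimate can be produced before committing to de-duplication on the destination.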