Invention application
- Patent title: DATA DEDUPLICATION DICTIONARY SYSTEM
- Patent title (Chinese): 数据分类字典系统
- Application number: US12752308
- Filing date: 2010-04-01
- Publication number: US20110246741A1
- Publication date: 2011-10-06
- Inventors: Robert Michael Raymond, Atiq Ahamad, John Richard Kostraba, JR., Carl T. Madison, JR.
- Applicants: Robert Michael Raymond, Atiq Ahamad, John Richard Kostraba, JR., Carl T. Madison, JR.
- Applicant address: US CA Redwood Shores
- Assignee: ORACLE INTERNATIONAL CORPORATION
- Current assignee: ORACLE INTERNATIONAL CORPORATION
- Current assignee address: US CA Redwood Shores
- Main classification: G06F12/10
- IPC classification: G06F12/10; G06F12/00; G06F12/02
Abstract:
A data deduplication method uses a small hash digest dictionary held in fast-access memory. The method includes receiving customer data, dividing the data into smaller chunks, and assigning a hash value to each chunk. For each chunk, the method performs a lookup for a duplicate chunk by accessing the small dictionary in memory with the chunk's hash value. When no entry is found, the small dictionary is updated to include the hash value, so that the dictionary fills with the earliest received data. When an entry is found, the entry's hash value is compared with the lookup value; if they match, reference data is returned and an entry counter is incremented. If they do not match, additional accesses are attempted, for example with additional indexes calculated from the hash value. Collisions may trigger an entry replacement, such that some initially entered entries are replaced when they are determined not to be the most frequently repeating values, for example based on their counter values.
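The lookup flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `Entry`, `SmallDictionary`, `lookup_or_insert`, and `deduplicate` are hypothetical, as are the choices of SHA-256 as the hash, fixed-size chunking, three probe indexes taken from slices of the digest, and the rule that a colliding entry is replaced only when its counter is still zero. The abstract does not specify any of these details.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Entry:
    digest: bytes   # hash value of the chunk
    ref: int        # reference data: here, the chunk's position in the store
    count: int = 0  # how many times this entry has matched a lookup


class SmallDictionary:
    """Fixed-size, in-memory digest table (illustrative assumption)."""

    def __init__(self, slots: int = 1024, probes: int = 3) -> None:
        self.slots = slots
        self.probes = probes
        self.table: List[Optional[Entry]] = [None] * slots

    def _indexes(self, digest: bytes) -> List[int]:
        # Assumed scheme: each probe index comes from a different
        # 4-byte slice of the digest.
        return [
            int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.slots
            for i in range(self.probes)
        ]

    def lookup_or_insert(self, digest: bytes, ref: int) -> Optional[int]:
        """Return the stored reference for a duplicate, else None."""
        candidates = self._indexes(digest)
        for idx in candidates:
            entry = self.table[idx]
            if entry is None:
                # No entry: fill the dictionary with the earliest data.
                self.table[idx] = Entry(digest, ref)
                return None
            if entry.digest == digest:
                # Match: bump the counter and return the reference data.
                entry.count += 1
                return entry.ref
        # Every probed slot holds a different digest (collision).
        # Assumed policy: evict the least-matched entry only if it has
        # never matched, so frequently repeating values are kept.
        victim = min(candidates, key=lambda i: self.table[i].count)
        if self.table[victim].count == 0:
            self.table[victim] = Entry(digest, ref)
        return None


def deduplicate(
    data: bytes,
    chunk_size: int = 4,
    dictionary: Optional[SmallDictionary] = None,
) -> Tuple[List[bytes], List[int]]:
    """Split data into chunks; return (stored chunks, list of references)."""
    if dictionary is None:
        dictionary = SmallDictionary()
    store: List[bytes] = []
    recipe: List[int] = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).digest()
        ref = dictionary.lookup_or_insert(digest, len(store))
        if ref is None:
            # Not found in the small dictionary: store the chunk itself.
            store.append(chunk)
            ref = len(store) - 1
        recipe.append(ref)
    return store, recipe
```

Calling `deduplicate(b"abcdabcdxyzwabcd")` stores the chunk `abcd` once and refers to it three times. Because the dictionary is small and lossy, a repeated chunk that loses its slot may be stored more than once, but joining `store[r]` for each `r` in the returned list always reproduces the input.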
Published/granted documents
- US08250325B2 Data deduplication dictionary system. Publication/grant date: 2012-08-21