Invention application
US20110246741A1 DATA DEDUPLICATION DICTIONARY SYSTEM (in force)

DATA DEDUPLICATION DICTIONARY SYSTEM
Abstract:
A data deduplication method using a small hash digest dictionary in fast-access memory. The method includes receiving customer data, dividing the data into smaller chunks, and assigning a hash value to each chunk. For each chunk, the method includes performing a lookup for a duplicate chunk by accessing a small dictionary in memory with the chunk's hash value. When no entry is found, the small dictionary is updated to include the hash value, so that the dictionary fills with the earliest received data. When an entry is found, the entry's hash value is compared with the lookup value and, if they match, reference data is returned and an entry counter is incremented. If they do not match, additional accesses are attempted, such as with additional indexes calculated using the hash value. Collisions may trigger an entry replacement, such that some initially entered entries are replaced when they are determined not to be the most frequently repeating values, for example based on their counter values.
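
The lookup flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the fixed-size chunking, the use of SHA-256, the derivation of additional indexes from successive 4-byte slices of the digest, the slot and probe counts, and the rule "replace the lowest-counter candidate only if its counter is zero" are all assumptions chosen for brevity. The names `Entry`, `SmallDictionary`, and `deduplicate` are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Entry:
    digest: bytes   # hash value of the chunk
    ref: int        # reference data: here, the chunk's index in the store
    count: int = 0  # how many times this entry was matched


class SmallDictionary:
    """Fixed-size, in-memory digest dictionary (hypothetical sketch)."""

    def __init__(self, slots=1024, probes=3):
        self.slots = [None] * slots
        self.probes = probes

    def _indexes(self, digest):
        # Assumption: each additional index comes from the next
        # 4-byte slice of the same hash value.
        n = len(self.slots)
        return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % n
                for i in range(self.probes)]

    def lookup_or_insert(self, digest, ref):
        """Return the stored reference for a duplicate, else None."""
        candidates = self._indexes(digest)
        for i in candidates:
            entry = self.slots[i]
            if entry is None:
                # No entry: fill the dictionary with the earliest data.
                self.slots[i] = Entry(digest, ref)
                return None
            if entry.digest == digest:
                # Match: bump the counter and return the reference.
                entry.count += 1
                return entry.ref
            # Mismatch: fall through and try the next index.
        # Every candidate slot collided. Assumed policy: evict the
        # least-repeated candidate, but only if it never repeated.
        victim = min(candidates, key=lambda i: self.slots[i].count)
        if self.slots[victim].count == 0:
            self.slots[victim] = Entry(digest, ref)
        return None


def deduplicate(data, chunk_size=4096, dictionary=None):
    """Split data into chunks; return (unique chunk store, recipe)."""
    if dictionary is None:
        dictionary = SmallDictionary()
    store, recipe = [], []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).digest()
        ref = dictionary.lookup_or_insert(digest, len(store))
        if ref is None:
            # Not found in the dictionary: store the chunk itself.
            ref = len(store)
            store.append(chunk)
        recipe.append(ref)
    return store, recipe
```

In this sketch, a chunk that misses the dictionary is always stored, so the original data can be rebuilt with `b"".join(store[r] for r in recipe)` even when the small dictionary fails to detect a duplicate; a miss costs storage space but not correctness.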