Abstract:
Techniques and mechanisms provide a storage optimization manager. Data may be optimized and maintained on various nodes in a cluster. Particular nodes may be overburdened while other nodes remain relatively unused. Techniques are provided to efficiently optimize data onto nodes to enhance operational efficiency. Data access requests for optimized data are monitored and managed to allow for intelligent maintenance of optimized data.
Abstract:
Techniques and mechanisms are provided for migrating data blocks around a cluster during node addition and node deletion. Migration requires no downtime, as a newly added node is immediately operational while the data blocks are being moved. Blockmap files and deduplication dictionaries need not be updated.
Abstract:
Mechanisms are provided for improving the efficiency of garbage collection in a deduplication system by intelligently managing storage of deduplication segments. When a duplicate segment is identified, a reference count for an already maintained segment is incremented only if the already maintained segment has the same lifecycle as the identified duplicate segment. In some instances, an already maintained segment is assumed to have the same lifecycle if it is not stale or its age is not significantly different from the age of the newly identified duplicate. If the already maintained segment has a different lifecycle, the new segment is stored again even though a duplicate is already maintained.
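The lifecycle check described in this abstract can be illustrated with a minimal sketch. Everything named here (`SegmentStore`, `Segment`, `MAX_AGE_GAP`, the SHA-256 fingerprint, the `stale` flag) is a hypothetical stand-in rather than anything taken from the source, and the sketch conservatively requires a segment to be both non-stale and close in age before sharing it.

```python
import hashlib
from dataclasses import dataclass

# Assumption: segments whose ages differ by more than this many seconds
# are treated as having different lifecycles. The value is arbitrary.
MAX_AGE_GAP = 30 * 24 * 3600


@dataclass
class Segment:
    data: bytes
    created_at: float
    refcount: int = 1
    stale: bool = False  # hypothetical flag set elsewhere, e.g. by a GC pass


def same_lifecycle(seg: Segment, now: float) -> bool:
    """Conservative reading: non-stale and not much older than the newcomer."""
    return (not seg.stale) and (now - seg.created_at) <= MAX_AGE_GAP


class SegmentStore:
    def __init__(self) -> None:
        # fingerprint -> stored copies, at most one per lifecycle group
        self.segments: dict[str, list[Segment]] = {}

    def add(self, data: bytes, now: float) -> Segment:
        key = hashlib.sha256(data).hexdigest()
        for seg in self.segments.get(key, []):
            if same_lifecycle(seg, now):
                seg.refcount += 1  # share the existing copy
                return seg
        # No copy with a matching lifecycle: store the duplicate again.
        seg = Segment(data, created_at=now)
        self.segments.setdefault(key, []).append(seg)
        return seg
```

Storing a second copy costs space up front, but segments that are referenced only by data of similar age tend to become unreferenced together, which is what makes them cheaper to reclaim.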
Abstract:
Mechanisms are provided for optimizing files while allowing application servers access to metadata associated with preoptimized versions of the files. During file optimization involving compression and/or compaction, file metadata changes. In order to allow file optimization in a manner transparent to application servers, the metadata associated with preoptimized versions of the files is maintained in a metadata database as well as in an optimized version of the files themselves.
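As a rough sketch of the idea in this abstract, the preoptimized metadata can be captured before compression and kept in two places: a database row and a header inside the optimized file. The table name `preopt_meta`, the length-prefixed JSON header, and the function names are all assumptions made for illustration; the source does not specify an on-disk layout.

```python
import json
import os
import sqlite3
import struct
import zlib


def open_db(db_path: str = ":memory:") -> sqlite3.Connection:
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS preopt_meta "
        "(path TEXT PRIMARY KEY, size INTEGER, mtime REAL)"
    )
    return db


def optimize_file(path: str, db: sqlite3.Connection) -> None:
    """Compress a file in place, preserving its original size and mtime."""
    st = os.stat(path)
    meta = {"size": st.st_size, "mtime": st.st_mtime}
    with open(path, "rb") as f:
        raw = f.read()
    header = json.dumps(meta).encode("utf-8")
    with open(path, "wb") as f:
        # Hypothetical layout: 4-byte header length, JSON header, zlib data.
        f.write(struct.pack(">I", len(header)))
        f.write(header)
        f.write(zlib.compress(raw))
    db.execute(
        "INSERT OR REPLACE INTO preopt_meta (path, size, mtime) VALUES (?, ?, ?)",
        (path, meta["size"], meta["mtime"]),
    )
    db.commit()


def stat_original(path: str, db: sqlite3.Connection):
    """Return (size, mtime) as an application server saw them before optimization."""
    row = db.execute(
        "SELECT size, mtime FROM preopt_meta WHERE path = ?", (path,)
    ).fetchone()
    if row is not None:
        return row
    st = os.stat(path)  # file was never optimized
    return (st.st_size, st.st_mtime)


def read_original(path: str) -> bytes:
    """Recover the preoptimized contents from the optimized file."""
    with open(path, "rb") as f:
        (hlen,) = struct.unpack(">I", f.read(4))
        f.seek(4 + hlen)
        return zlib.decompress(f.read())
```

Keeping the metadata in both places means a lookup can be answered from the database without opening the file, while the embedded header allows the database to be rebuilt if it is lost.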
Abstract:
A data de-duplication system is used with network attached storage and serves to reduce data duplication and file storage costs. Techniques utilizing both symlinks and hardlinks ensure efficient file and data cleanup on deletion and avoid data loss in the event of crashes.
Abstract:
Mechanisms are provided for optimizing multiple files in an efficient format that allows maintenance of the original namespace. Multiple files and associated metadata are written to a suitcase file. The suitcase file includes index information for accessing compressed data associated with compacted files. A hardlink to the suitcase file includes an index number used to access the appropriate index information. A simulated link to a particular file maintains the name of the particular file prior to compaction.
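The suitcase layout described in this abstract might look roughly like the following sketch, in which several files are compressed into one container and each is retrieved by its index number. The container format (a length-prefixed JSON index followed by zlib blobs) and the function names are assumptions for illustration. The hardlink and simulated-link layer is omitted; a plain dictionary stands in for the mapping from original file names to index numbers.

```python
import json
import struct
import zlib


def write_suitcase(suitcase_path: str, files: dict) -> dict:
    """Pack {name: bytes} into one suitcase file; return {name: index number}."""
    index = []          # entry i holds [offset, length] of blob i within the body
    numbers = {}
    body = bytearray()
    for number, (name, data) in enumerate(files.items()):
        blob = zlib.compress(data)
        index.append([len(body), len(blob)])
        body += blob
        numbers[name] = number
    header = json.dumps(index).encode("utf-8")
    with open(suitcase_path, "wb") as f:
        f.write(struct.pack(">I", len(header)))  # 4-byte index length
        f.write(header)                           # index information
        f.write(body)                             # compressed file data
    return numbers


def read_from_suitcase(suitcase_path: str, number: int) -> bytes:
    """Use an index number to locate and decompress one file's data."""
    with open(suitcase_path, "rb") as f:
        (hlen,) = struct.unpack(">I", f.read(4))
        index = json.loads(f.read(hlen))
        offset, length = index[number]
        f.seek(4 + hlen + offset)
        return zlib.decompress(f.read(length))
```

In a fuller version, the returned index numbers would be carried by the links that preserve each file's original name, so that a lookup by name resolves to the suitcase file plus an index number.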