Abstract:
A method of performing global deduplication may include: collecting, at a staging area in a storage system, a data chunk to be written to a backing storage of the storage system; generating a data fingerprint of the data chunk; sending the data fingerprint, in a batch with other data fingerprints corresponding to data chunks collected at different times, to a metadata server system in the storage system; receiving, at the staging area, an indication from the metadata server system of whether the data fingerprint is unique in the storage system; and, when the indication shows that the data chunk is not unique, discarding the data chunk when committing a data object containing the data chunk to the backing storage.
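The flow described in this abstract can be illustrated with a minimal sketch. Everything named here is a hypothetical stand-in rather than the claimed implementation: `MetadataServer`, `StagingArea`, and the dict used as backing storage are invented for illustration, and SHA-256 is an assumed fingerprint function, since the abstract does not name one.

```python
import hashlib
from dataclasses import dataclass, field


class MetadataServer:
    """Hypothetical stand-in for the metadata server system."""

    def __init__(self):
        self._known = set()  # fingerprints already present in the storage system

    def check_batch(self, fingerprints):
        """Return {fingerprint: is_unique} for one batch, recording new ones."""
        verdicts = {}
        for fp in fingerprints:
            verdicts[fp] = fp not in self._known
            self._known.add(fp)
        return verdicts


@dataclass
class StagingArea:
    """Hypothetical staging area that collects chunks before commit."""

    server: MetadataServer
    pending: dict = field(default_factory=dict)  # fingerprint -> chunk

    def collect(self, chunk: bytes) -> str:
        # Assumption: SHA-256 serves as the data fingerprint.
        fp = hashlib.sha256(chunk).hexdigest()
        self.pending.setdefault(fp, chunk)
        return fp

    def commit(self, backing_storage: dict) -> int:
        """Send pending fingerprints in one batch; write only unique chunks."""
        verdicts = self.server.check_batch(list(self.pending))
        written = 0
        for fp, chunk in self.pending.items():
            if verdicts[fp]:
                backing_storage[fp] = chunk
                written += 1
            # Non-unique chunks are simply dropped at commit time.
        self.pending.clear()
        return written
```

In this sketch, chunks collected at different times accumulate in `pending`, so a single call to `check_batch` covers all of them, which is the batching the abstract describes. A chunk whose fingerprint the server has already seen is never written to the backing storage.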
Abstract:
A distributed object store in a network storage system uses location-independent global object identifiers (IDs) for stored data objects. The global object ID enables a data object to be seamlessly moved from one location to another without affecting clients of the storage system, i.e., “transparent migration”. The global object ID can be part of a multilevel object handle, which also can include a location ID indicating the specific location at which the data object is stored, and a policy ID identifying a set of data management policies associated with the data object. The policy ID may be associated with the data object by a client of the storage system, for example when the client creates the object, thus allowing “inline” policy management. An object location subsystem (OLS) can be used to locate an object when a client request does not contain a valid location ID for the object.
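A minimal sketch of the multilevel object handle and the OLS fallback follows. All names are hypothetical: `ObjectHandle`, `ObjectLocationSubsystem`, `resolve_location`, `migrate`, and the `stores` mapping are invented for illustration, and treating a location ID as valid only when that location still holds the object is an assumption about what "valid" means here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ObjectHandle:
    """Hypothetical multilevel handle: only the global ID is location-independent."""

    global_object_id: str
    location_id: Optional[str] = None  # hint; may be missing or stale
    policy_id: Optional[str] = None    # set by the client, e.g. at creation


class ObjectLocationSubsystem:
    """Hypothetical OLS: maps global object IDs to current locations."""

    def __init__(self):
        self._locations = {}

    def record(self, global_object_id, location_id):
        self._locations[global_object_id] = location_id

    def locate(self, global_object_id):
        return self._locations[global_object_id]


def resolve_location(handle, stores, ols):
    """Use the handle's location ID if it is still valid, else ask the OLS."""
    held = stores.get(handle.location_id, set())
    if handle.global_object_id in held:
        return handle.location_id
    return ols.locate(handle.global_object_id)


def migrate(global_object_id, src, dst, stores, ols):
    """Move an object between locations and update the OLS."""
    stores[src].discard(global_object_id)
    stores.setdefault(dst, set()).add(global_object_id)
    ols.record(global_object_id, dst)
```

After `migrate` runs, a client still holding a handle with the old location ID gets the new location from `resolve_location` without changing the handle, which is the "transparent migration" behavior the abstract describes.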