Abstract:
Data objects are replicated from a source storage managed by a source server to a target storage managed by a target server. A source list is built of objects at the source server to replicate to the target server. The target server is queried to obtain a target list of objects at the target server. A replication list is built indicating objects on the source list not included on the target list to transfer to the target server. For each object in the replication list, data for the object not already at the target storage is sent to the target server and metadata on the object is sent to the target server to cause the target server to include the metadata in an entry for the object in a target server replication database. An entry for the object is added to a source server replication database.
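The list comparison described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function names, the in-memory dictionaries standing in for the replication databases, and the chunk-level view of "data not already at the target storage" are all assumptions.

```python
def build_replication_list(source_list, target_list):
    """Return objects on the source list that are not on the target list."""
    target_ids = set(target_list)
    return [obj_id for obj_id in source_list if obj_id not in target_ids]


def replicate(source_objects, target_list, target_chunks, target_db, source_db):
    """Replicate missing objects; return the ids of chunks actually sent.

    source_objects maps object id -> (metadata dict, {chunk_id: bytes}).
    All containers are hypothetical stand-ins for servers and databases.
    """
    sent_chunks = []
    for obj_id in build_replication_list(list(source_objects), target_list):
        metadata, chunks = source_objects[obj_id]
        for chunk_id, payload in chunks.items():
            if chunk_id not in target_chunks:  # send only data the target lacks
                target_chunks[chunk_id] = payload
                sent_chunks.append(chunk_id)
        target_db[obj_id] = metadata  # entry in the target replication database
        source_db[obj_id] = metadata  # entry in the source replication database
    return sent_chunks
```

With this layout, an object whose data is partly present at the target still gets its metadata entry on both sides, while only the missing data crosses the wire.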
Abstract:
In one aspect of the present description, in connection with storing a first deduplicated data object in a primary storage pool, described operations include determining the duration of time that the first data object has resided in the primary storage pool, and comparing the determined duration of time to a predetermined time interval. In addition, described operations include, after the determined duration of time meets or exceeds the predetermined time interval, determining if the first data object has an extent referenced by another data object, and determining whether to move the first data object from the primary storage pool to a secondary storage pool as a function of whether the first data object has an extent referenced by another data object after the determined duration of time meets or exceeds the predetermined time interval. Other features and aspects may be realized, depending upon the particular application.
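A minimal sketch of the move decision follows. The abstract says only that the decision is a function of whether an extent is shared; the direction chosen here (an object with a shared extent stays in the primary pool) is an assumption, as are the `DataObject` type and the `should_move` name.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    name: str
    stored_at: float  # time the object entered the primary pool, in seconds
    extents: set = field(default_factory=set)


def should_move(obj, all_objects, now, min_age_seconds):
    """Decide whether obj should move from the primary to the secondary pool."""
    # Step 1: compare the time in the primary pool to the predetermined interval.
    if now - obj.stored_at < min_age_seconds:
        return False
    # Step 2: check whether any other object references one of obj's extents.
    for other in all_objects:
        if other is not obj and obj.extents & other.extents:
            return False  # assumption: shared extents keep the object in primary
    return True
```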
Abstract:
Provided are a method, system, and article of manufacture, wherein a data structure corresponding to a set of client nodes selected from a plurality of client nodes is generated. Objects from the selected set of client nodes are stored in the data structure. A determination is made that an object corresponding to a client node of the selected set of client nodes has to be stored. An additional determination is made as to whether the object has already been stored in the data structure by any client node of the selected set of client nodes. The object is stored in the data structure, in response to determining that the object has not already been stored in the data structure by any client node of the selected set of client nodes.
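The store-once behavior across a selected set of client nodes might look like the sketch below. Identifying an object by the SHA-256 digest of its contents is an assumption (the abstract does not say how "already stored" is determined), and `NodeGroupStore` is a hypothetical name.

```python
import hashlib


class NodeGroupStore:
    """Data structure shared by a selected set of client nodes."""

    def __init__(self, node_ids):
        self.node_ids = set(node_ids)
        self.objects = {}  # content digest -> object bytes

    def store(self, node_id, data):
        """Store data unless any node in the set already stored it.

        Returns True if the object was newly stored, False if it was a duplicate.
        """
        if node_id not in self.node_ids:
            raise ValueError(f"node {node_id!r} is not in the selected set")
        digest = hashlib.sha256(data).hexdigest()
        if digest in self.objects:
            return False
        self.objects[digest] = data
        return True
```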
Abstract:
A chunk index has information on chunks in a storage space referenced in objects in the storage space. The chunk index includes, for each chunk, a reference count indicating the number of objects in which the chunk is referenced and a reference measurement representing a level of data object references to the chunk. One chunk is selected for removal from the storage space based on a criterion applied to the reference measurements of chunks having reference counts indicating that the chunks are not referenced in any object in the storage space.
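One way to picture the selection step is sketched below. Two points are assumptions: that "not referenced" corresponds to a reference count of zero, and that the criterion picks the unreferenced chunk with the lowest reference measurement. The abstract does not specify the criterion, and the names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ChunkEntry:
    chunk_id: str
    reference_count: int          # number of objects referencing the chunk now
    reference_measurement: float  # level of references to the chunk over time


def select_chunk_to_remove(chunk_index):
    """Return the id of the chunk to remove, or None if every chunk is in use."""
    candidates = [e for e in chunk_index if e.reference_count == 0]
    if not candidates:
        return None
    # Assumed criterion: remove the candidate with the lowest measurement.
    return min(candidates, key=lambda e: e.reference_measurement).chunk_id
```

Keeping unreferenced chunks with a high measurement around for a while can pay off if a chunk that was popular in the past is likely to be referenced again.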
Abstract:
In each of a number of passes to deduplicate a data object, a transaction is started. Where an offset into the object has previously been set, the offset is retrieved; otherwise, the offset is set to reference a beginning of the object. A portion of the object beginning at the offset is deduplicated until an end-of-transaction criterion has been satisfied. The transaction is ended to commit deduplication; where the object has not yet been completely deduplicated, the offset is moved just past where deduplication has already occurred. The object is locked during each pass; other processes cannot access the object during each pass, but can access the object between passes. Each pass is relatively short, so the length of time in which the object is inaccessible is relatively short. By comparison, deduplicating an object within a single pass prevents other processes from accessing the object for a longer time.
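The pass structure can be sketched as below. This is an illustration under stated assumptions: fixed-size chunks hashed with SHA-256 stand in for deduplication, a chunk limit per pass stands in for the end-of-transaction criterion, a `threading.Lock` stands in for the object lock, and staging work before applying it stands in for a database transaction.

```python
import hashlib
import threading


def dedup_pass(data, state, store, lock, chunk_size=4, max_chunks=2):
    """Run one deduplication pass; return True once the whole object is done."""
    with lock:  # the object is locked only for the duration of this pass
        offset = state.get("offset", 0)  # saved offset, or the object's start
        staged = []
        # Deduplicate from the offset until the (assumed) criterion is met.
        while offset < len(data) and len(staged) < max_chunks:
            chunk = data[offset:offset + chunk_size]
            staged.append((hashlib.sha256(chunk).hexdigest(), chunk))
            offset += len(chunk)
        # Commit: apply the staged chunks and move the offset just past them.
        for digest, chunk in staged:
            store.setdefault(digest, chunk)  # keep one copy per unique chunk
            state.setdefault("recipe", []).append(digest)
        state["offset"] = offset
        return offset >= len(data)
```

A caller would invoke `dedup_pass` repeatedly until it returns `True`; other threads can acquire the lock between calls.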
Abstract:
Data storage services are provided for clients for backup of data objects from the clients. A data object is sent to a first location in a first storage device. A determination is made whether the data object was successfully stored at the first location, and if so, meta data corresponding with the data object is stored, wherein the meta data includes first path information on a first data path of the data object to the first location. The data object is migrated from the first location to a second location in a second storage device. A determination is made whether the data object was successfully stored at the second location, and if so, second path information on a second data path of the data object to the second location is added to the meta data corresponding with the data object, to update the meta data.
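The metadata update on backup and migration can be sketched as follows. Dictionaries stand in for the storage devices and the metadata store, a read-back comparison stands in for the success check, and the path strings are hypothetical. Whether the first copy is deleted after migration is not stated, so this sketch leaves it in place.

```python
def backup_object(name, data, first_device, metadata, first_path):
    """Store the object at the first location; record path info on success."""
    first_device[name] = data
    if first_device.get(name) != data:  # assumed success check: read back
        return False
    metadata[name] = {"paths": [first_path]}
    return True


def migrate_object(name, first_device, second_device, metadata, second_path):
    """Copy the object to the second location; append path info on success."""
    data = first_device[name]
    second_device[name] = data
    if second_device.get(name) != data:
        return False
    metadata[name]["paths"].append(second_path)  # update existing meta data
    return True
```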
Abstract:
A system and method for relating files in a distributed data storage environment allows for positive identification of membership of a file within a group, even in a loosely coupled environment where files are not available for comparison in real time. In disclosed embodiments, base files of a client are stored on a server and are accompanied by tokens uniquely identifying the base files. The tokens are generated on the client and may be derived from the contents of the base file using a digital signature. Each file transmitted to the server is accompanied by a token. Incremental backups may be used, and may employ file differencing. Accordingly, sub-files related to the base files may be transmitted to the server for backup. The sub-files are related to their respective base files using the tokens and are cross-linked to the base files so that any sub-file can be retrieved together with the base file from which it was derived.
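The token linkage between base files and sub-files might be sketched as below. A SHA-256 digest of the base file contents stands in for the digital signature mentioned above; that substitution, the class and function names, and the in-memory server are assumptions.

```python
import hashlib


def make_token(base_contents):
    """Client side: derive a token from the base file contents."""
    return hashlib.sha256(base_contents).hexdigest()


class BackupServer:
    """Hypothetical server that cross-links sub-files to base files by token."""

    def __init__(self):
        self.base_files = {}  # token -> base file contents
        self.sub_files = {}   # token -> list of sub-file (delta) payloads

    def store_base(self, token, contents):
        self.base_files[token] = contents
        self.sub_files.setdefault(token, [])

    def store_sub_file(self, base_token, delta):
        if base_token not in self.base_files:
            raise KeyError("no base file for this token")
        self.sub_files[base_token].append(delta)

    def retrieve(self, token):
        """Return the base file together with the sub-files derived from it."""
        return self.base_files[token], list(self.sub_files[token])
```

Because the token travels with every file, the server can establish group membership without having the base file and the sub-file side by side for comparison.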
Abstract:
A source server maintains a replication rule specifying a condition for a replication attribute and a replication action to take if the condition with respect to the replication attribute is satisfied, wherein the replication action indicates whether to include or exclude an object having an attribute value for the replication attribute that satisfies the condition. For each of the objects, the replication rule is applied by determining an attribute value of the object corresponding to the replication attribute in the replication rule and determining whether the determined attribute value satisfies the condition for the replication attribute defined in the replication rule. The replication action is performed on the object in response to determining that the determined attribute value satisfies the condition for the replication attribute.
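Rule evaluation of this kind can be sketched as follows. The `ReplicationRule` type, the use of a callable for the condition, and returning `None` when the rule does not apply are assumptions; the abstract does not say what happens to objects that fail the condition.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ReplicationRule:
    attribute: str                     # the replication attribute to inspect
    condition: Callable[[Any], bool]   # condition on the attribute value
    action: str                        # "include" or "exclude"


def apply_rule(obj: dict, rule: ReplicationRule) -> Optional[str]:
    """Return the rule's action if obj satisfies the condition, else None."""
    if rule.attribute not in obj:
        return None  # the object has no value for the replication attribute
    value = obj[rule.attribute]
    return rule.action if rule.condition(value) else None
```

For example, a rule built as `ReplicationRule("size", lambda s: s > 1000, "exclude")` would mark large objects as excluded and leave all others undecided.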
Abstract:
Systems and methods for retrieving data from a storage system having a plurality of storage pools are provided. The method comprises processing configurable data retrieval instructions to determine a first storage pool from which target backup data is to be retrieved, in response to a data restore request; and retrieving the target backup data from the first storage pool to satisfy the restore request. The configurable data retrieval instructions are managed by a source external to the storage system with administrative authority to change the configurable data retrieval instructions to optimize data restoration from the storage system.
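A minimal sketch of pool selection driven by configurable instructions follows. Representing the instructions as an ordered list of pool names under a `pool_order` key is an assumption made for illustration; the abstract does not describe their format.

```python
def choose_pool(instructions, pools, backup_id):
    """Return the first pool, in the configured order, that holds the backup."""
    for pool_name in instructions["pool_order"]:
        if backup_id in pools.get(pool_name, {}):
            return pool_name
    return None


def restore(instructions, pools, backup_id):
    """Retrieve the backup data from the pool chosen by the instructions."""
    pool_name = choose_pool(instructions, pools, backup_id)
    if pool_name is None:
        raise LookupError(f"backup {backup_id!r} not found in any listed pool")
    return pools[pool_name][backup_id]
```

An administrator outside the storage system could change `pool_order` (for instance, to prefer a disk pool over tape) without touching the restore logic.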