摘要:
A method and system for reducing storage requirements and speeding up storage operations by reducing the storage of redundant data includes receiving a request that identifies one or more data objects to which to apply a storage operation. For each data object, the storage system determines if the data object contains data that matches another data object to which the storage operation was previously applied. If the data objects do not match, then the storage system performs the storage operation in a usual manner. However, if the data objects do match, then the storage system may avoid performing the storage operation.
摘要:
A method and system for reducing storage requirements and speeding up storage operations by reducing the storage of redundant data includes receiving a request that identifies one or more data objects to which to apply a storage operation. For each data object, the storage system determines if the data object contains data that matches another data object to which the storage operation was previously applied. If the data objects do not match, then the storage system performs the storage operation in a usual manner. However, if the data objects do match, then the storage system may avoid performing the storage operation.
摘要:
A system and method for choosing a stream to transfer data is described. In some cases, the system reviews running data storage operations and chooses a data stream based on the review. In some cases, the system chooses a stream based on the load of data to be transferred.
摘要:
Described in detail herein are systems and methods for managing single instancing data. Using a single instance database and other constructs (e.g. sparse files), data density on archival media (e.g. magnetic tape) is improved, and the number of files per storage operation is reduced. According to one aspect of a method for managing single instancing data, for each storage operation, a chunk folder is created on a storage device that stores single instancing data. The chunk folder contains three files: 1) a file that contains data objects that have been single instanced; 2) a file that contains data objects that have not been eligible for single instancing; and 3) a metadata file used to track the location of data objects within the other files. A second storage operation subsequent to a first storage operation contains references to data objects in the chunk folder created by the first storage operation instead of the data objects themselves.
摘要:
Described in detail herein are systems and methods for managing single instancing data. Using a single instance database and other constructs (e.g. sparse files), data density on archival media (e.g. magnetic tape) is improved, and the number of files per storage operation is reduced. According to one aspect of a method for managing single instancing data, for each storage operation, a chunk folder is created on a storage device that stores single instancing data. The chunk folder contains three files: 1) a file that contains data objects that have been single instanced; 2) a file that contains data objects that have not been eligible for single instancing; and 3) a metadata file used to track the location of data objects within the other files. A second storage operation subsequent to a first storage operation contains references to data objects in the chunk folder created by the first storage operation instead of the data objects themselves.
摘要:
A system and method for choosing a stream to transfer data is described. In some cases, the system reviews running data storage operations and chooses a data stream based on the review. In some cases, the system chooses a stream based on the load of data to be transferred.
摘要:
A system and method for choosing a stream to transfer data is described. In some cases, the system reviews running data storage operations and chooses a data stream based on the review. In some cases, the system chooses a stream based on the load of data to be transferred.
摘要:
Described in detail herein are systems and methods for single instancing blocks of data in a data storage system. For example, the data storage system may include multiple computing devices (e.g., client computing devices) that store primary data. The data storage system may also include a secondary storage computing device, a single instance database, and one or more storage devices that store copies of the primary data (e.g., secondary copies, tertiary copies, etc.). The secondary storage computing device receives blocks of data from the computing devices and accesses the single instance database to determine whether the blocks of data are unique (meaning that no instances of the blocks of data are stored on the storage devices). If a block of data is unique, the single instance database stores it on a storage device. If not, the secondary storage computing device can avoid storing the block of data on the storage devices.
摘要:
Described in detail herein are systems and methods for single instancing blocks of data in a data storage system. For example, the data storage system may include multiple computing devices (e.g., client computing devices) that store primary data. The data storage system may also include a secondary storage computing device, a single instance database, and one or more storage devices that store copies of the primary data (e.g., secondary copies, tertiary copies, etc.). The secondary storage computing device receives blocks of data from the computing devices and accesses the single instance database to determine whether the blocks of data are unique (meaning that no instances of the blocks of data are stored on the storage devices). If a block of data is unique, the single instance database stores it on a storage device. If not, the secondary storage computing device can avoid storing the block of data on the storage devices.
摘要:
Described are systems and methods for storing a variable number of instances of data objects (e.g., 1, 2, 3, or up to N-1 instances, where N is the number of instances of the data object included in primary data) in secondary storage across a data storage network. In some examples, a system for storing a variable number of instances of data objects includes, one or more computing devices storing a set of data objects and multiple storage devices distinct from the one or more computing devices. Each of the multiple storage devices is configured to store at least a single instance of a data object. The system also includes a database configured to store information associated with the data objects. This information includes substantially unique identifiers for the data objects and, for each of the data objects, a number of instances of the data object stored on the multiple storage devices.