Abstract:
A method for deduplicating and managing data blocks within a file system includes adding a deduplication identifier to each pointer that points to a data block, to indicate whether the data block is deduplicated, and detecting duplicate data blocks. When duplicate data blocks are detected, the method determines whether one of them has already been deduplicated. If one has, that data block is determined to be the master copy. If none has, one of the duplicate data blocks is selected to be the master copy and its deduplication identifier is set to indicate deduplication. The other duplicate data block is determined to be a new duplicate data block: its deduplication identifier is set to indicate deduplication and its pointer is directed to the master copy.
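The master-copy selection described in this abstract can be illustrated with a minimal Python sketch. The names `BlockPointer`, `merge_duplicates`, and `deduplicate` are hypothetical and do not come from the patent. Detecting duplicates by comparing raw block contents is an assumption made for brevity.

```python
from dataclasses import dataclass


@dataclass
class BlockPointer:
    block_id: int        # physical block this pointer references
    dedup: bool = False  # deduplication identifier carried by the pointer


def merge_duplicates(a, b):
    """Handle two pointers whose blocks hold identical content."""
    if a.dedup:
        master, other = a, b   # a was already deduplicated, so it is the master
    elif b.dedup:
        master, other = b, a   # b was already deduplicated, so it is the master
    else:
        master, other = a, b   # neither was: pick one (arbitrary choice here)
        master.dedup = True
    other.dedup = True                 # mark the new duplicate
    other.block_id = master.block_id   # redirect it to the master copy
    return master


def deduplicate(pointers, blocks):
    """Assumed detection step: compare block contents directly."""
    first_seen = {}
    for ptr in pointers:
        first = first_seen.setdefault(blocks[ptr.block_id], ptr)
        if first is not ptr and first.block_id != ptr.block_id:
            merge_duplicates(first, ptr)
```

A production file system would more likely compare content hashes and would also reclaim the block that is no longer referenced. Both are omitted here.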
Abstract:
A method and system for client backup data management and storage using virtual tape libraries (VTLs). A VTL controller executing a software method receives metadata that distinguishes among a plurality of different versions of backup data. The VTL controller determines a latest version of the backup data. The VTL controller determines a migration set of zero or more versions of the backup data. The latest version and any version included in the migration set are included in the plurality of different versions. The VTL controller determines that a storage of the latest version in a first storage medium (e.g., magnetic disk) of the VTL is complete. The VTL controller migrates the migration set to a second storage medium (e.g., magnetic tape) of the VTL if the migration set includes at least one version of the backup data.
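The controller logic in this abstract might be sketched as follows. `BackupVersion` and `migrate_if_ready` are hypothetical names. Putting every version older than the latest into the migration set is an assumed policy, since the abstract does not say how the set is chosen.

```python
from dataclasses import dataclass


@dataclass
class BackupVersion:
    number: int                   # metadata that distinguishes versions
    stored_on_disk: bool = False  # True once the write to the first medium is complete


def migrate_if_ready(versions, migrate):
    """Migrate older versions once the latest version is fully stored on disk.

    `migrate` is a caller-supplied callable that moves one version to the
    second medium (for example, tape). Returns the versions that were migrated.
    """
    latest = max(versions, key=lambda v: v.number)
    # Assumed policy: the migration set is every version older than the latest.
    migration_set = [v for v in versions if v.number < latest.number]
    if latest.stored_on_disk and migration_set:
        for version in migration_set:
            migrate(version)
        return migration_set
    return []
```

Because the migration set may be empty (zero versions), the function returns without calling `migrate` in that case.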
Abstract:
Various embodiments for differentiating between data and stubs pointing to a parent copy of deduplicated data are provided. Undeduplicated data is stored with a first cyclic redundancy check (CRC) seed. A stub pointing to the parent copy of the deduplicated data is stored with a second CRC seed. Subsequent to reading the deduplicated data, the first CRC seed is associated with the undeduplicated data, and the second CRC seed is associated with the stub. A CRC check is performed using one of the first and second CRC seeds. If the CRC check is positive, an I/O operation is allowed to proceed. If the CRC check is negative, an additional CRC check is performed using another one of the first and second CRC seeds.
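The two-seed check can be illustrated with Python's standard `zlib.crc32`, whose second argument is the starting value of the checksum. The seed constants and the `store` and `classify` helpers are illustrative assumptions, not values or names from the patent.

```python
import zlib

DATA_SEED = 0x00000000  # assumed seed for undeduplicated data
STUB_SEED = 0x5A5A5A5A  # assumed seed for stubs pointing to a parent copy


def store(payload: bytes, is_stub: bool):
    """Return the payload together with a CRC computed from the matching seed."""
    seed = STUB_SEED if is_stub else DATA_SEED
    return payload, zlib.crc32(payload, seed)


def classify(payload: bytes, stored_crc: int, expect_stub: bool = False) -> str:
    """Check with the expected seed first, then fall back to the other seed."""
    order = [("data", DATA_SEED), ("stub", STUB_SEED)]
    if expect_stub:
        order.reverse()
    for kind, seed in order:
        if zlib.crc32(payload, seed) == stored_crc:
            return kind  # positive check: the I/O operation may proceed
    raise ValueError("CRC mismatch with both seeds")
```

If the first check fails, the second seed is tried before the record is treated as corrupt, which mirrors the additional CRC check the abstract describes.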
Abstract:
A system of controlling tape drives within a tape drive library where a backup server utilizes client backup schedules and pending client restore requests to efficiently control the powering on and off of tape drives within a tape drive library.
Abstract:
A method of controlling tape drives within a tape drive library where a backup server utilizes client backup schedules and pending client restore requests to efficiently control the powering on and off of tape drives within a tape drive library.
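One way a backup server might turn schedules and pending restores into power decisions is sketched below. The function `plan_power`, the `lead_time` parameter, and the rule of one drive per active backup window or pending restore are all assumptions. The abstract does not specify the policy.

```python
def plan_power(now, backup_windows, pending_restores, drive_count, lead_time=300):
    """Return a list of booleans, one per drive: True means powered on.

    backup_windows   -- (start, end) times of scheduled client backups
    pending_restores -- number of client restore requests waiting
    lead_time        -- assumed seconds a drive needs to spin up before a backup
    """
    # A window counts as active from `lead_time` before its start until its end.
    active_backups = sum(
        1 for start, end in backup_windows if start - lead_time <= now < end
    )
    # Assumed rule: one drive per active backup or pending restore, capped.
    needed = min(drive_count, active_backups + pending_restores)
    return [index < needed for index in range(drive_count)]
```

Drives beyond the computed demand stay powered off until a schedule or restore request requires them.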
Abstract:
Data for deduplication is received. The received data is deduplicated if selected conditions corresponding to the deduplication are satisfied, wherein the selected conditions include a deduplication ratio, a data deduplication threshold, and a data quiescence measure. Deduplication of the received data is discontinued if the selected conditions corresponding to the deduplication are not satisfied.
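The gating decision could look like the following sketch. The abstract names the three conditions but not their semantics, so the interpretation here (a minimum expected ratio, a minimum data size, and a minimum idle time) and the names `DedupPolicy` and `should_deduplicate` are assumptions.

```python
from dataclasses import dataclass


@dataclass
class DedupPolicy:
    min_ratio: float = 1.5            # assumed: required deduplication ratio
    min_bytes: int = 4096             # assumed: data deduplication threshold
    min_quiet_seconds: float = 60.0   # assumed: data quiescence measure


def should_deduplicate(policy, observed_ratio, size_bytes, seconds_since_write):
    """Deduplicate only while all three selected conditions are satisfied."""
    return (
        observed_ratio >= policy.min_ratio
        and size_bytes >= policy.min_bytes
        and seconds_since_write >= policy.min_quiet_seconds
    )
```

A caller would re-evaluate this predicate as data arrives and discontinue deduplication as soon as it returns `False`.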
Abstract:
A method to store data is disclosed. The method provides a plurality of data storage media, an automated data library comprising one or more data storage devices, a first plurality of storage cells, and a robotic accessor. The method further provides a storage vault comprising a second plurality of storage cells but no data storage devices. The method selects the (i)th data storage medium and sets the (i)th data state, where that (i)th data state is selected from the group consisting of online, offline, and vault. If the method sets the (i)th data state to online, then the method mounts that (i)th data storage medium in one of the data storage devices. If the method sets the (i)th data state to offline, then the method removably places the (i)th data storage medium in one of the first plurality of storage cells. If the method sets the (i)th data state to vault, then the method places the (i)th data storage medium in one of the second plurality of storage cells.
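The state-to-location mapping can be sketched as a small lookup. `DataState`, `LOCATION_FOR_STATE`, and `place_media` are hypothetical names, and the location labels stand in for the drives, library cells, and vault cells named in the abstract.

```python
from enum import Enum


class DataState(Enum):
    ONLINE = "online"
    OFFLINE = "offline"
    VAULT = "vault"


# Where a medium goes for each data state.
LOCATION_FOR_STATE = {
    DataState.ONLINE: "drive",          # mounted in a data storage device
    DataState.OFFLINE: "library_cell",  # first plurality of storage cells
    DataState.VAULT: "vault_cell",      # second plurality, no drives present
}


def place_media(states):
    """Group medium indices by the location their data state calls for."""
    placement = {"drive": [], "library_cell": [], "vault_cell": []}
    for i, state in enumerate(states):
        placement[LOCATION_FOR_STATE[state]].append(i)
    return placement
```

In a real library the robotic accessor would carry out each move. Here the result only records where each medium should go.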
Abstract:
A data storage system includes a data storage array configured for de-duplication of duplicate data therein by: identification of a plurality of portions of data; a comparison of each portion of the data to identify duplicate data and identification of a link associated with each duplicate data; a determination of whether a Hamming link-separation-distance of the identified link is greater than twice a Hamming radius of an error correction code in the data storage system; and replacement of the duplicate data with the identified link when it is determined that the Hamming link-separation-distance is greater than twice the Hamming radius.
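The separation test can be illustrated as below. The abstract does not define how the link-separation-distance is measured. This sketch assumes it is the minimum Hamming distance between the candidate link and every other link in use, with links modeled as integers. `hamming` and `link_is_safe` are hypothetical names.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two link values differ."""
    return bin(a ^ b).count("1")


def link_is_safe(link: int, existing_links, ecc_radius: int) -> bool:
    """True if the link is separated by more than twice the ECC radius.

    Assumption: separation is the smallest Hamming distance to any other link.
    With no other links, the condition is treated as satisfied.
    """
    distances = [hamming(link, other) for other in existing_links if other != link]
    if not distances:
        return True
    return min(distances) > 2 * ecc_radius
```

The duplicate data would be replaced with the link only when `link_is_safe` returns `True`. The intuition is that a link corrupted by up to `ecc_radius` bit errors still cannot be mistaken for a different link.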