Abstract:
A system and method for performing a backup operation is described. A source system determines a set of files to be backed up at a backup system. Based on one or more attributes of each file of the set of files, the source system determines an order in which to perform the backup operation for the set of files. The order specifies an individual file of the set of files to be backed up before another file of the set of files. The source system communicates with the backup system to perform the backup operation of the set of files in the determined order.
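The attribute-driven ordering described above can be sketched as follows. This is a minimal illustration, not the claimed method: the attributes (`priority`, `modified_ts`, `size_bytes`), the sort rule, and the `send` callback are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    path: str
    size_bytes: int
    modified_ts: float  # seconds since epoch
    priority: int = 0   # hypothetical attribute: higher means more important


def backup_order(files):
    """Order files for backup from their attributes.

    Assumed rule: higher priority first, then most recently modified,
    then smaller files first.
    """
    return sorted(files, key=lambda f: (-f.priority, -f.modified_ts, f.size_bytes))


def run_backup(files, send):
    """Send each file to the backup system in the determined order.

    `send` stands in for whatever call transfers one file; it is a
    placeholder, not a real backup API.
    """
    ordered = backup_order(files)
    for f in ordered:
        send(f)
    return [f.path for f in ordered]
```

Keeping the ordering in a single sort key makes the policy easy to swap: a different set of attributes only changes the tuple returned by the `key` function.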
Abstract:
A method, non-transitory computer-readable medium, and device for prefetching include identifying a candidate data block from among one or more immediate successor data blocks. The identified candidate data block has a historical access probability value, measured from an initially accessed data block, that is higher than the historical access probability value of each of the other immediate successor data blocks and is above a prefetch threshold value. The identifying is repeated until a next identified candidate data block has a historical access probability value below the prefetch threshold value. In each repetition, the next immediate successor data blocks are identified from the previously identified candidate data block, and the historical access probability value for each of them is determined from the originally accessed data block. Each identified candidate data block with a historical access probability value above the prefetch threshold value is fetched.
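The repeated candidate selection can be sketched as a walk over a successor graph. The abstract does not say how the probability "from the originally accessed data block" is computed; this sketch assumes it is the product of per-hop transition probabilities along the chosen path. The `successors` mapping and the function name are hypothetical.

```python
def prefetch_chain(successors, start, threshold):
    """Return the blocks to prefetch after `start` is accessed.

    successors: dict mapping a block to {successor: transition probability}.
    Assumption: the probability of reaching a block from `start` is the
    product of the transition probabilities along the chosen path.
    """
    chain = []
    current = start
    prob_from_start = 1.0
    visited = {start}  # guards against cycles in the access history
    while True:
        candidates = {
            block: prob_from_start * p
            for block, p in successors.get(current, {}).items()
            if block not in visited
        }
        if not candidates:
            break
        # Pick the immediate successor with the highest probability.
        best = max(candidates, key=candidates.get)
        if candidates[best] <= threshold:
            break  # the next candidate is not above the threshold: stop
        chain.append(best)
        visited.add(best)
        current = best
        prob_from_start = candidates[best]
    return chain
```

Because the probability is always measured from the starting block, it can only shrink with each hop, so the chain ends naturally once the path becomes unlikely.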
Abstract:
A system and method for global data de-duplication in a cloud storage environment utilizing a plurality of data centers is provided. Each cloud storage gateway appliance divides a data stream into a plurality of data objects and generates a content-based hash value as a key for each data object. An IMMUTABLE PUT operation is utilized to store the data object at the associated key within the cloud.
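The content-addressed storage step can be sketched as below. Several details are assumptions for illustration: fixed-size splitting, SHA-256 as the content-based hash, and an in-memory dictionary standing in for the cloud. The IMMUTABLE PUT operation is modeled here as "write only if the key is absent", which is one plausible reading rather than a confirmed definition.

```python
import hashlib


class ImmutableStore:
    """In-memory stand-in for the cloud object store (hypothetical)."""

    def __init__(self):
        self._objects = {}

    def immutable_put(self, key, data):
        """Store data at key only if the key is absent.

        Returns True if the object was newly written, False if it existed.
        """
        if key in self._objects:
            return False
        self._objects[key] = data
        return True

    def get(self, key):
        return self._objects[key]

    def __len__(self):
        return len(self._objects)


def split_fixed(stream, size):
    """Divide a byte stream into fixed-size data objects (assumed scheme)."""
    for offset in range(0, len(stream), size):
        yield stream[offset:offset + size]


def store_stream(store, stream, object_size=4):
    """Store each data object under its content hash; return the key list."""
    recipe = []
    for obj in split_fixed(stream, object_size):
        key = hashlib.sha256(obj).hexdigest()
        store.immutable_put(key, obj)  # duplicates are silently skipped
        recipe.append(key)
    return recipe


def restore_stream(store, recipe):
    """Rebuild the original stream from its ordered list of keys."""
    return b"".join(store.get(key) for key in recipe)
```

Since the key is derived from the content, two gateways that see the same data object compute the same key, so the second write finds the object already present and de-duplication needs no coordination between them.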
Abstract:
The techniques introduced herein provide for systems and methods for estimating the effectiveness of utilizing a data deduplication process. More specifically, a content-based sampling approach for data deduplication estimation is described in which a subset of the scanned fingerprints of a dataset is included in a content-based sample that is used to determine an accurate deduplication estimate for the dataset (or volume).
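One way to picture content-based sampling is shown below. The abstract does not specify the selection rule or the estimator, so both are assumptions here: a fingerprint joins the sample when its value is divisible by a chosen modulus, and the savings estimate is the duplicate fraction within the sample. All names are hypothetical.

```python
import hashlib


def fingerprint(block):
    """Fingerprint of a data block (SHA-256 is an assumed choice)."""
    return hashlib.sha256(block).digest()


def in_sample(fp, modulus):
    """Content-based selection: decided by the fingerprint value alone."""
    return int.from_bytes(fp[:8], "big") % modulus == 0


def estimate_savings(blocks, modulus=4):
    """Estimate the fraction of blocks that deduplication would remove.

    Returns None when no fingerprint falls into the sample.
    """
    sampled_total = 0
    sampled_distinct = set()
    for block in blocks:
        fp = fingerprint(block)
        if in_sample(fp, modulus):
            sampled_total += 1
            sampled_distinct.add(fp)
    if sampled_total == 0:
        return None
    return 1.0 - len(sampled_distinct) / sampled_total
```

Selecting by fingerprint value, rather than by block position, means every copy of a sampled block lands in the sample together, so duplicate counts inside the sample are preserved. With `modulus=1` every fingerprint is sampled and the estimate equals the exact figure.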
Abstract:
Systems and methods that provide improved prefetching schemes for caching data in a storage network are described. In one embodiment, a dynamically adaptive prefetching mechanism based on block access history information and the prior effectiveness of prefetching is provided. Embodiments may take into account prefetch efficiency (a dynamic value indicating the usefulness of past prefetches) and prefetch wastage, in conjunction with the prefetch resources available at any point in time, to determine the number of blocks to read ahead during a prefetch. Such embodiments improve on file-based prefetching and previous block schemes, as they provide finer-grained control over both prefetch block selection and the number of blocks to prefetch, based on block (or block range) access history.
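The feedback loop between past prefetch usefulness and read-ahead depth can be sketched as follows. The abstract gives no formula, so this is an assumed policy: efficiency is an exponentially smoothed ratio of used to prefetched blocks, and the depth is scaled by it and capped by the free cache blocks. The class and parameter names are hypothetical.

```python
class AdaptivePrefetcher:
    """Chooses how many blocks to read ahead from past prefetch outcomes."""

    def __init__(self, max_depth=8, smoothing=0.5):
        self.max_depth = max_depth
        self.smoothing = smoothing  # weight given to the newest observation
        self.efficiency = 1.0       # start optimistic: assume prefetches help

    def record_outcome(self, prefetched, used):
        """Update efficiency after a prefetch: `used` of `prefetched` were hit."""
        if prefetched <= 0:
            return
        observed = used / prefetched
        self.efficiency = (
            (1 - self.smoothing) * self.efficiency + self.smoothing * observed
        )

    def blocks_to_prefetch(self, free_cache_blocks):
        """Scale depth by efficiency, then cap by available resources."""
        depth = int(self.max_depth * self.efficiency)
        return max(0, min(depth, free_cache_blocks))
```

A wasted prefetch lowers the efficiency value and shrinks the next read-ahead, while useful prefetches let it grow back toward `max_depth`.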