Abstract:
A system and method for improving storage system performance by maintaining data integrity during bulk export to a cloud system is provided. A backup host reads a selected volume from the storage system via an I/O channel. The storage system remains online during bulk export and tracks I/O to the selected volume in a tracking log. The backup host compresses, encrypts, and calculates a checksum for each data block of the volume before writing a corresponding data object to export devices and sending a checksum data object to the cloud system. The devices are shipped to the cloud system, which imports the data objects and calculates a checksum for each. The storage system compares the imported checksums with the checksums in the checksum data object, and adds data blocks to the tracking log when errors are detected. An incremental backup is performed based on the contents of the tracking log.
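The checksum and tracking-log flow described above can be illustrated with a minimal sketch. It assumes SHA-256 as the checksum, zlib as the compressor, and a checksum computed over the written payload; none of these choices are stated in the abstract. Encryption is omitted, and the names `export_block` and `verify_import` are hypothetical.

```python
import hashlib
import zlib


def export_block(data: bytes) -> tuple[bytes, str]:
    """Compress one data block and return (payload, checksum of the payload).

    Assumption: the checksum covers the payload written to the export device.
    Encryption would be applied here as well; it is left out of this sketch.
    """
    payload = zlib.compress(data)
    return payload, hashlib.sha256(payload).hexdigest()


def verify_import(
    imported: dict[int, bytes],
    manifest: dict[int, str],
    tracking_log: set[int],
) -> set[int]:
    """Recompute checksums after import and log every block that fails.

    `manifest` stands in for the checksum data object sent ahead of the
    devices. `tracking_log` already holds blocks written during the export.
    """
    for block_id, expected in manifest.items():
        payload = imported.get(block_id)
        if payload is None or hashlib.sha256(payload).hexdigest() != expected:
            tracking_log.add(block_id)
    return tracking_log


def run_demo() -> set[int]:
    """Export three blocks, corrupt one in transit, and verify the import."""
    volume = {0: b"a" * 4096, 1: b"b" * 4096, 2: b"c" * 4096}
    exported, manifest = {}, {}
    for block_id, data in volume.items():
        exported[block_id], manifest[block_id] = export_block(data)
    tracking_log = {1}  # block 1 was written while the export was running
    imported = dict(exported)
    # Flip the last byte of block 2 to simulate damage during shipping.
    imported[2] = imported[2][:-1] + bytes([imported[2][-1] ^ 0xFF])
    return verify_import(imported, manifest, tracking_log)
```

In this sketch the incremental backup would then copy exactly the blocks left in the tracking log, covering both blocks changed during export and blocks that failed verification.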
Abstract:
A system, method, and computer program product for block-based backup of a storage device to an object storage service are provided. The method includes generating a data object that encapsulates the data of a data extent. The data extent covers a block address range of the storage device. The data object is named with a base name that represents a logical block address (LBA) of the data extent. An identifier that deterministically identifies the recovery point with which the data object is associated is appended to the base name. The base name combined with the identifier forms the data object name. The named data object is then transmitted to the object storage service for backup of the data extent. In an initial backup, the full storage device is copied. In subsequent incremental backups, only the data extents that have changed are backed up.
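The naming scheme can be sketched as follows. The concrete format (a 16-digit hexadecimal LBA, a dot, and an 8-digit recovery point number) is an assumption made for illustration, as are the function names `object_name` and `backup`; the abstract does not fix any of them.

```python
def object_name(lba: int, recovery_point: int) -> str:
    """Build a data object name: base name from the LBA plus a recovery point suffix.

    Assumed format: zero-padded hex LBA, '.', zero-padded decimal recovery point.
    """
    return f"{lba:016x}.{recovery_point:08d}"


def backup(
    extents: dict[int, bytes],
    changed: set[int],
    recovery_point: int,
) -> dict[str, bytes]:
    """Return the named data objects to transmit for one backup.

    `extents` maps the starting LBA of each data extent to its data.
    For the initial backup, pass every LBA in `changed`; for an
    incremental backup, pass only the LBAs of extents that changed.
    """
    return {object_name(lba, recovery_point): extents[lba] for lba in sorted(changed)}
```

Because the name is derived only from the LBA and the recovery point, the same extent at the same recovery point always maps to the same object name, so no separate index is needed to locate it.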
Abstract:
A system and method for recovering data backed up to an object store are provided. In some embodiments, the method includes identifying an address space of a data set to be recovered. A set of data objects stored by an object-based system is identified that corresponds to the address space and a selected recovery point. The identified set of data objects is retrieved, and data contained in the retrieved set of data objects is stored to at least one storage device at a block address determined by the retrieved set of data objects to recreate the address space. In some embodiments, the set of data objects is retrieved by providing an HTTP request and receiving the set of data objects as an HTTP response. In some embodiments, the set of data objects is retrieved based on the data objects being the target of a data transaction.
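A minimal sketch of the recovery steps follows. It assumes object names of the hypothetical form `<hex LBA>.<recovery point number>`, assumes that the newest object at or before the selected recovery point is the right one for each block address, and uses an in-memory dictionary in place of an object store reached over HTTP. None of these details come from the abstract.

```python
def select_objects(
    names, lba_start: int, lba_end: int, recovery_point: int
) -> dict[int, str]:
    """Pick, for each LBA in [lba_start, lba_end), the object to restore from.

    Assumption: the newest object whose recovery point number does not exceed
    the selected recovery point holds the data for that LBA.
    """
    chosen: dict[int, tuple[int, str]] = {}
    for name in names:
        lba_hex, rp_text = name.split(".")
        lba, rp = int(lba_hex, 16), int(rp_text)
        if lba_start <= lba < lba_end and rp <= recovery_point:
            if lba not in chosen or rp > chosen[lba][0]:
                chosen[lba] = (rp, name)
    return {lba: name for lba, (_, name) in chosen.items()}


def restore(
    store: dict[str, bytes],
    device: bytearray,
    lba_start: int,
    lba_end: int,
    recovery_point: int,
    block_size: int = 512,
) -> bytearray:
    """Write each retrieved object to the device at the address its name encodes."""
    selected = select_objects(store.keys(), lba_start, lba_end, recovery_point)
    for lba, name in selected.items():
        data = store[name]  # stands in for retrieving the object over HTTP
        offset = (lba - lba_start) * block_size
        device[offset:offset + len(data)] = data
    return device
```

The block address comes from the object name itself, which matches the abstract's statement that the write location is determined by the retrieved data objects.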
Abstract:
A system and method for managing distributed coherent datasets using a hierarchical change log is provided. In some embodiments, a distributed storage system is provided that includes a primary storage device containing a primary dataset and a mirror storage device containing a mirror dataset. The mirror dataset includes a coherent copy of the primary dataset. The distributed storage system further includes a hierarchical change log tracking a coherence state for the mirror dataset. The hierarchical change log includes a first sub-log and a second sub-log, and a block range of the first sub-log overlaps a block range of the second sub-log. The hierarchical change log may define a priority relationship between the first sub-log and the second sub-log governing the overlap. The first sub-log and the second sub-log may be independently configured and may be different in one of a representation and a block size.
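One way to picture overlapping sub-logs with a priority relationship is sketched below. The two representations (a coarse bitmap and a per-block set), the numeric `priority` field, and all class names are assumptions chosen for illustration and are not taken from the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class BitmapSubLog:
    """Coarse sub-log: one dirty bit per `granularity` blocks."""
    start: int
    end: int  # exclusive
    granularity: int
    priority: int
    bits: list = field(init=False)

    def __post_init__(self):
        regions = -(-(self.end - self.start) // self.granularity)  # ceiling division
        self.bits = [False] * regions

    def covers(self, block: int) -> bool:
        return self.start <= block < self.end

    def mark_dirty(self, block: int) -> None:
        self.bits[(block - self.start) // self.granularity] = True

    def is_dirty(self, block: int) -> bool:
        return self.bits[(block - self.start) // self.granularity]


@dataclass
class SetSubLog:
    """Fine sub-log: records individual dirty block numbers."""
    start: int
    end: int  # exclusive
    priority: int
    dirty: set = field(default_factory=set)

    def covers(self, block: int) -> bool:
        return self.start <= block < self.end

    def mark_dirty(self, block: int) -> None:
        self.dirty.add(block)

    def is_dirty(self, block: int) -> bool:
        return block in self.dirty


class HierarchicalChangeLog:
    """Routes each block to the highest-priority sub-log that covers it."""

    def __init__(self, sub_logs):
        self.sub_logs = sorted(sub_logs, key=lambda s: s.priority, reverse=True)

    def _owner(self, block: int):
        for sub_log in self.sub_logs:
            if sub_log.covers(block):
                return sub_log
        raise KeyError(f"block {block} is not tracked by any sub-log")

    def mark_dirty(self, block: int) -> None:
        self._owner(block).mark_dirty(block)

    def is_dirty(self, block: int) -> bool:
        return self._owner(block).is_dirty(block)
```

In this sketch a frequently written range can be tracked per block by the higher-priority sub-log, while the rest of the dataset is tracked cheaply in coarse regions.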