Abstract:
A method, a computing device, and a non-transitory machine-readable medium for allocating memory to data structures that map a first address space to a second address space are provided. In some embodiments, the method includes identifying, by a storage system, a pool of memory resources to allocate among a plurality of address maps. Each of the plurality of address maps includes at least one entry that maps an address in a first address space to an address in a second address space. An activity metric is determined for each of the plurality of address maps, and a portion of the pool of memory is allocated to each of the plurality of address maps based on the respective activity metric. The allocating of the portion of the memory pool to a first map may be performed in response to a merge operation being performed on the first map.
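As an illustration only, the allocation step can be sketched as a proportional split of the pool by activity metric. The function name `allocate_pool`, the use of integer byte counts, and the equal-split fallback for idle maps are assumptions made for this sketch; the abstract does not specify how the metric is converted into a share.

```python
def allocate_pool(pool_bytes, activity):
    """Split pool_bytes among address maps in proportion to their activity.

    activity maps a (hypothetical) map name to a non-negative activity metric.
    Integer division may leave a few bytes of the pool unallocated.
    """
    if not activity:
        return {}
    total = sum(activity.values())
    if total == 0:
        # Assumption: with no recorded activity, share the pool equally.
        share = pool_bytes // len(activity)
        return {name: share for name in activity}
    return {name: pool_bytes * metric // total
            for name, metric in activity.items()}


if __name__ == "__main__":
    print(allocate_pool(1000, {"map_a": 3, "map_b": 1}))
```

A system following the abstract could call such a function again whenever a merge operation completes on one of the maps.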
Abstract:
A system and method for recovering a dataset are provided that analyze the dataset as it currently exists in order to determine those portions that do not need to be recovered. In some embodiments, the method includes identifying a dataset stored on a set of storage devices and corresponding to a first point in time. A request to restore the dataset to a second point in time is received, and a subset of the dataset is identified that is different between the first point in time and the second point in time. Data that is associated with the subset and corresponds to the second point in time is selectively retrieved, and the retrieved data is merged with the dataset stored on the set of storage devices. The two points in time may have any relationship, and in various examples, the method performs a roll-back or a roll-forward of the dataset.
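The selective recovery described above can be sketched as a diff followed by a merge. In this minimal sketch, each point in time is modeled as a dictionary from a block identifier to its contents; that representation and the name `selective_restore` are assumptions for illustration, not details taken from the abstract.

```python
def selective_restore(current, target):
    """Restore `current` to `target`, touching only the blocks that differ.

    Returns the merged dataset and the set of block ids that had to change.
    Reading target[block_id] stands in for retrieving data from a backup.
    """
    changed = {block_id
               for block_id in current.keys() | target.keys()
               if current.get(block_id) != target.get(block_id)}
    merged = dict(current)
    for block_id in changed:
        if block_id in target:
            merged[block_id] = target[block_id]
        else:
            # The block does not exist at the target point in time.
            del merged[block_id]
    return merged, changed


if __name__ == "__main__":
    now = {0: b"a", 1: b"b", 2: b"c"}
    then = {0: b"a", 1: b"x"}
    print(selective_restore(now, then))
```

Because the diff is symmetric, the same routine covers both a roll-back and a roll-forward.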
Abstract:
A system and method are provided for backing up and recovering data that allow the data to be modified and backed up even while recovery is still in progress. In some embodiments, the method includes performing a data recovery procedure on a computing system. The data recovery procedure includes identifying a set of data objects stored on a recovery system; retrieving the set of data objects; and storing data of the retrieved set of data objects to at least one storage device. Data objects may be prioritized so that data that is in demand is retrieved first. Data that is modified during the data recovery procedure is tracked and backed up to an object-storage system during the data recovery procedure. In some embodiments, backing up the modified data is part of an incremental backup procedure that excludes data objects that contain only unmodified data.
Abstract:
An I/O processing stack includes a proxy that can provide processing services for access requests to initialized and uninitialized storage regions. For a write request, the proxy stores write information in a write metadata repository. If the write is requested for an address in an initialized storage region of the storage system, the proxy performs a write to the initialized region based on region information in the write I/O access request. If the write is requested for an address in an uninitialized storage region of the storage system, the proxy performs an on-demand initialization of the storage region and then performs a write to the storage region based on region information provided by the proxy.
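A minimal sketch of the write path follows, assuming fixed-size regions that are zero-filled on first use and an in-memory list standing in for the write metadata repository. The class name `WriteProxy` and its fields are hypothetical.

```python
class WriteProxy:
    """Toy proxy that initializes storage regions on demand before writing."""

    def __init__(self, region_size):
        self.region_size = region_size
        self.regions = {}        # region index -> bytearray (initialized only)
        self.write_metadata = [] # stand-in for the write metadata repository

    def write(self, address, data):
        index, offset = divmod(address, self.region_size)
        if offset + len(data) > self.region_size:
            raise ValueError("write crosses a region boundary")
        self.write_metadata.append((address, len(data)))
        if index not in self.regions:
            # Uninitialized region: initialize it on demand (zero-fill here).
            self.regions[index] = bytearray(self.region_size)
        self.regions[index][offset:offset + len(data)] = data


if __name__ == "__main__":
    proxy = WriteProxy(region_size=16)
    proxy.write(18, b"hi")
    print(sorted(proxy.regions), proxy.write_metadata)
```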
Abstract:
A method includes: storing a first data extent on a physical medium, wherein the physical medium is divided into a plurality of storage blocks, wherein each of the storage blocks has a size that is different than a size of the first data extent, further wherein the first data extent is stored to a first block of the plurality of storage blocks; generating a descriptor for the first data extent, wherein the descriptor indicates that the first data extent starts within the first block of the plurality of blocks and indicates an offset from the beginning of the first block at which the first data extent starts; and storing the descriptor within the first block.
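One way to picture the descriptor is as a small fixed header at the start of the block. The sketch below assumes that layout, a three-byte header (a start flag and a 16-bit offset), and an extent that fits inside one block; none of these details come from the abstract.

```python
import struct

# Hypothetical descriptor layout: 1-byte "extent starts here" flag,
# then a 16-bit little-endian offset from the beginning of the block.
DESCRIPTOR = struct.Struct("<?H")


def store_extent(block, extent, offset):
    """Write `extent` into `block` at `offset` and record a descriptor."""
    if offset < DESCRIPTOR.size or offset + len(extent) > len(block):
        raise ValueError("extent does not fit at this offset")
    DESCRIPTOR.pack_into(block, 0, True, offset)
    block[offset:offset + len(extent)] = extent


def read_descriptor(block):
    """Return (starts_here, offset) from the block's descriptor."""
    return DESCRIPTOR.unpack_from(block, 0)


if __name__ == "__main__":
    block = bytearray(512)
    store_extent(block, b"hello", 100)
    print(read_descriptor(block))
```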
Abstract:
A system is provided for tracking metadata changes and recovering from system interruptions. With host I/O, corresponding incremental metadata changes are aggregated and stored in a write-ahead log before being applied to their in-memory buffers. As those buffers are flushed, checkpoints are created and stored in the log. As the log wraps to the start, older entries are overwritten after they are freed from any remaining dependencies by newer checkpoints. If metadata entities have not created new checkpoints, they are instructed to do so in order to free up space for new aggregated batches and checkpoints. After an interruption, the wrap point is located in the log. From the wrap point, the log is scanned backwards to provide checkpoints to metadata entities. The log is then scanned forwards to perform changes specified by aggregated batches. The metadata entities' volatile memory states are recovered to what they were before the interruption.
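The two-pass recovery can be sketched as follows. This simplified model assumes the log has already been unwrapped into a list ordered from oldest to newest (ending at the wrap point), that a checkpoint record carries an entity's full state, and that a batch record carries a single numeric delta; the record shapes and the name `recover` are hypothetical.

```python
def recover(log):
    """Rebuild per-entity state from a write-ahead log.

    Records are either ("checkpoint", entity, state_dict) or
    ("batch", entity, key, delta).
    """
    state = {}
    checkpoint_at = {}
    # Backward scan: hand each entity its most recent checkpoint.
    for i in range(len(log) - 1, -1, -1):
        record = log[i]
        if record[0] == "checkpoint" and record[1] not in state:
            state[record[1]] = dict(record[2])
            checkpoint_at[record[1]] = i
    # Forward scan: replay batches logged after each entity's checkpoint.
    for i, record in enumerate(log):
        if record[0] == "batch":
            _, entity, key, delta = record
            if i > checkpoint_at.get(entity, -1):
                entity_state = state.setdefault(entity, {})
                entity_state[key] = entity_state.get(key, 0) + delta
    return state


if __name__ == "__main__":
    log = [
        ("batch", "m", "a", 1),
        ("checkpoint", "m", {"a": 1}),
        ("batch", "m", "a", 2),
        ("batch", "m", "b", 5),
    ]
    print(recover(log))
```

Batches that precede an entity's latest checkpoint are skipped because the checkpoint already reflects them.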
Abstract:
A system and method for improving storage system performance by maintaining data integrity during bulk export to a cloud system are provided. A backup host reads a selected volume from the storage system via an I/O channel. The storage system remains online during bulk export and tracks I/O to the selected volume in a tracking log. The backup host compresses, encrypts, and calculates a checksum for each data block of the volume before writing a corresponding data object to export devices and sending a checksum data object to the cloud system. The devices are shipped to the cloud system, which imports the data objects and calculates a checksum for each. The storage system compares the imported checksums with the checksums in the checksum data object, and adds data blocks to the tracking log when errors are detected. An incremental backup is performed based on the contents of the tracking log.
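The checksum comparison can be sketched as below. The sketch assumes SHA-256 as the checksum and zlib for compression, omits encryption entirely, and models the tracking log as a set of block ids; the abstract names none of these choices, and the function names are hypothetical.

```python
import hashlib
import zlib


def export_block(data):
    """Compress a block and return (payload, checksum of the payload)."""
    payload = zlib.compress(data)  # encryption step omitted in this sketch
    return payload, hashlib.sha256(payload).hexdigest()


def verify_import(imported, expected_checksums, tracking_log):
    """Add every missing or mismatching block id to the tracking log."""
    for block_id, expected in expected_checksums.items():
        payload = imported.get(block_id)
        if payload is None or hashlib.sha256(payload).hexdigest() != expected:
            tracking_log.add(block_id)
    return tracking_log


if __name__ == "__main__":
    payloads, checksums = {}, {}
    for block_id, data in enumerate([b"alpha", b"beta"]):
        payloads[block_id], checksums[block_id] = export_block(data)
    payloads[1] = payloads[1] + b"\x00"  # simulate corruption in transit
    print(verify_import(payloads, checksums, set()))
```

Blocks that end up in the tracking log, whether from live I/O or from a failed comparison, would then be covered by the incremental backup.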
Abstract:
Systems and techniques for recovering a storage array are disclosed. These systems and techniques include determining a size corresponding to a storage stripe of the storage array. Pieces assigned to the storage stripe are identified. A storage configuration corresponding to the pieces assigned to the storage stripe is detected. Ordinal information and parity information are determined corresponding to the pieces assigned to the storage stripe. The size determined corresponding to the storage stripe, identification of the pieces assigned to the storage stripe, the storage configuration, the ordinal information, and the parity information are stored in a data store to reconstruct lost or corrupted metadata corresponding to the storage array.
Abstract:
A method for mapping a first address space to a second address space is provided. In some embodiments, the method includes creating a first array of lookup entries and one or more second arrays of metadata entries for maintaining an ordering among the lookup entries using a tree structure. Each of the metadata entries includes one or more data index values identifying a corresponding one of the lookup entries by its position in the first array and one or more metadata index values identifying a corresponding one of the metadata entries by its position in one of the one or more second arrays. The method further includes receiving a request including a lookup value, traversing the tree structure to locate a lookup entry corresponding to the lookup value, and when the lookup value is located among the lookup entries, using the located lookup entry to process the request.
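The array-backed tree can be sketched with an unbalanced binary search tree. In this sketch the first array holds `(key, value)` lookup entries, and a single second array holds metadata entries of the form `[data_index, left_index, right_index]`, with `-1` marking a missing child. The class name and the choice of a plain binary search tree are assumptions; the abstract does not say which tree structure is used.

```python
class IndexedTree:
    """Binary search tree whose links are array positions, not pointers."""

    def __init__(self):
        self.lookup = []  # first array: (key, value) lookup entries
        self.meta = []    # second array: [data_index, left, right]

    def insert(self, key, value):
        self.lookup.append((key, value))
        self.meta.append([len(self.lookup) - 1, -1, -1])
        new_index = len(self.meta) - 1
        if new_index == 0:
            return  # first entry becomes the root
        current = 0
        while True:
            current_key = self.lookup[self.meta[current][0]][0]
            side = 1 if key < current_key else 2
            child = self.meta[current][side]
            if child == -1:
                self.meta[current][side] = new_index
                return
            current = child

    def find(self, key):
        current = 0 if self.meta else -1
        while current != -1:
            data_index, left, right = self.meta[current]
            entry_key, value = self.lookup[data_index]
            if key == entry_key:
                return value
            current = left if key < entry_key else right
        return None


if __name__ == "__main__":
    tree = IndexedTree()
    for k, v in [(50, "a"), (20, "b"), (70, "c")]:
        tree.insert(k, v)
    print(tree.find(20), tree.find(99))
```

Storing positions rather than pointers keeps both arrays relocatable, so they can be copied or persisted without rewriting links.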