Abstract:
A log-structured, content-addressable, deduplicated data storage system may be used to store deduplicated data. Data to be stored is partitioned into data segments. Each unique data segment is associated with a label. The storage system maintains a transaction log. Mutating storage operations are initiated by storing transaction records in the transaction log. Additional transaction records are stored in the log when storage operations are completed. Upon restarting an embodiment of the data storage system, the transaction records from the transaction log are replayed to recreate the state of the data storage system. The data storage system updates file system metadata with transaction information while a storage operation associated with the file is being processed. This transaction information serves as atomically updated transaction commit points, allowing fully internally consistent snapshots of deduplicated volumes to be taken at any time.
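The begin/complete record pattern and replay on restart described above can be sketched as follows. This is a minimal illustration, not the patented design: the record layout (`"begin"` and `"commit"` tuples), the SHA-256 labels, the fixed-size segmentation, and the class name `LogStructuredStore` are all assumptions made for the example.

```python
import hashlib


def label_of(segment: bytes) -> str:
    # Assumption: a label is the SHA-256 digest of the segment contents.
    return hashlib.sha256(segment).hexdigest()


class LogStructuredStore:
    """Hypothetical in-memory store whose state is rebuilt from its log."""

    def __init__(self, log=None):
        self.log = list(log) if log else []
        self.segments = {}   # label -> segment bytes (one copy per unique segment)
        self.files = {}      # file name -> ordered list of labels
        self._next_txn = 0
        self._replay()

    def _replay(self):
        # Re-apply only operations that have both a begin and a commit record.
        pending = {}
        for record in self.log:
            kind, txn = record[0], record[1]
            if kind == "begin":
                pending[txn] = (record[2], record[3])
            elif kind == "commit" and txn in pending:
                name, segs = pending.pop(txn)
                self._apply(name, segs)
            self._next_txn = max(self._next_txn, txn + 1)

    def _apply(self, name, segs):
        labels = []
        for seg in segs:
            lab = label_of(seg)
            self.segments.setdefault(lab, seg)  # duplicates are stored once
            labels.append(lab)
        self.files[name] = labels

    def write(self, name, data, segment_size=4):
        segs = [data[i:i + segment_size] for i in range(0, len(data), segment_size)]
        txn = self._next_txn
        self._next_txn += 1
        self.log.append(("begin", txn, name, segs))   # operation initiated
        self._apply(name, segs)
        self.log.append(("commit", txn))              # operation completed

    def read(self, name):
        return b"".join(self.segments[lab] for lab in self.files[name])
```

Constructing a new `LogStructuredStore` from an existing log models a restart: a write whose `"begin"` record has no matching `"commit"` is ignored during replay, so the rebuilt state only reflects completed operations.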
Abstract:
In a network including WAN accelerators and segment-oriented file servers, a method responds to a client request to manipulate a file via a network file protocol. The method comprises: receiving a first request at a first WAN accelerator, wherein the first request is a request to open a file located at a file server that is a segment-oriented file server; sending a local request for the file, corresponding to the first request, from the WAN accelerator to the file server using a segment-aware network request protocol; returning at least a portion of the requested file in the form of a representation of a data map corresponding to the at least a portion of the requested file stored on the file server; and using the data map to reconstruct the requested file.
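A rough sketch of the open-by-data-map exchange follows. It assumes that a data map is an ordered list of segment labels and that the WAN accelerator keeps its own segment cache; the class names, method names, and the SHA-256 labels are hypothetical and stand in for the segment-aware protocol, which the abstract does not specify.

```python
import hashlib


class SegmentFileServer:
    """Hypothetical segment-oriented file server."""

    def __init__(self):
        self.segments = {}   # label -> segment bytes
        self.maps = {}       # file name -> data map (ordered labels)

    def store(self, name, data, segment_size=4):
        data_map = []
        for i in range(0, len(data), segment_size):
            seg = data[i:i + segment_size]
            label = hashlib.sha256(seg).hexdigest()
            self.segments.setdefault(label, seg)
            data_map.append(label)
        self.maps[name] = data_map

    def open_as_map(self, name):
        # The server answers an open request with the data map, not the bytes.
        return list(self.maps[name])

    def fetch_segment(self, label):
        return self.segments[label]


class WanAccelerator:
    """Hypothetical accelerator that rebuilds files from data maps."""

    def __init__(self, server):
        self.server = server
        self.cache = {}      # segments already seen by this accelerator
        self.fetched = 0     # segments actually pulled from the server

    def open(self, name):
        parts = []
        for label in self.server.open_as_map(name):
            if label not in self.cache:
                self.cache[label] = self.server.fetch_segment(label)
                self.fetched += 1
            parts.append(self.cache[label])
        return b"".join(parts)
```

In this sketch, opening a second file that shares segments with an earlier one only fetches the segments the accelerator has not already seen, which is the benefit a data map representation is meant to provide.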
Abstract:
A spanning storage interface facilitates the use of cloud storage services by storage clients and may perform data deduplication. The spanning storage interface may include local storage for caching data from storage clients. A disaster recovery application includes at least first and second spanning storage interfaces at first and second network locations. The second spanning storage interface is provided for at least disaster recovery operations. The second spanning storage interface includes second local storage for improving data access performance. A copy of the local cache of the first spanning storage interface is transferred to the second local storage while the first network location is operating. In the event of a disaster affecting the first network location, the second spanning storage interface can provide data access to the first network location's data with improved performance from using the copy of local cache in the second local storage.
Abstract:
Synthetic backups are created without accessing previous backup data or retrieving backup data from a cloud storage service. A backup system provides two or more backup data sets to a cloud spanning storage interface for storage in deduplicated form as label maps and data segments in a cloud storage service. A specification defines portions of two or more previous backup data sets to be copied into the synthetic backup. Labels corresponding with the specified portions of previous backup data sets are identified and added to a new label map to create a deduplicated synthetic backup. The completed label map is transferred to the cloud storage service. To provide access to the synthetic backup, the cloud spanning storage interface reconstructs all or a portion of the synthetic backup from the new label map and the data segments created during deduplication of previous backup data sets.
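The label-map approach to synthetic backups can be illustrated with a short sketch. It assumes, for simplicity, that the specification refers to whole segments by index (`(backup_id, start, stop)` tuples) and that labels are SHA-256 digests; the function names and the specification format are hypothetical.

```python
import hashlib

SEGMENT_SIZE = 4  # assumed fixed segment size, chosen for illustration


def deduplicate(data, segment_store):
    """Store unique segments and return the label map for `data`."""
    label_map = []
    for i in range(0, len(data), SEGMENT_SIZE):
        seg = data[i:i + SEGMENT_SIZE]
        label = hashlib.sha256(seg).hexdigest()
        segment_store.setdefault(label, seg)
        label_map.append(label)
    return label_map


def build_synthetic_label_map(label_maps, spec):
    """Copy labels, not data, from previous backups into a new label map."""
    new_map = []
    for backup_id, start, stop in spec:
        new_map.extend(label_maps[backup_id][start:stop])
    return new_map


def reconstruct(label_map, segment_store):
    """Rebuild backup contents from a label map and the stored segments."""
    return b"".join(segment_store[label] for label in label_map)
```

Building the synthetic backup touches only the label maps. The segment store is read only later, when the synthetic backup is reconstructed for access.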
Abstract:
A spanning storage interface facilitates the use of cloud storage services by storage clients. The spanning storage interface presents one or more data interfaces to storage clients at a network location, such as file, object, data backup, archival, and storage block based interfaces. The data interfaces allow storage clients to store and retrieve data using non-cloud based protocols. The spanning storage interface may perform data deduplication on data received from storage clients. The spanning storage interface may transfer the deduplicated version of the data to the cloud storage service. The spanning storage interface may include local storage for storing a copy of all or a portion of the data from storage clients. The local storage may be used as a local cache of frequently accessed data, which may be stored in its deduplicated form.
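One way to picture the local cache of deduplicated data is the sketch below. The cloud storage service is stood in for by a plain dictionary, the cache uses least-recently-used eviction, and labels are SHA-256 digests; all of these, along with the class and attribute names, are assumptions for illustration rather than details from the abstract.

```python
import hashlib
from collections import OrderedDict


class SpanningStorageInterface:
    """Hypothetical interface: deduplicates, uploads, and caches segments."""

    def __init__(self, cloud, cache_capacity=2, segment_size=4):
        self.cloud = cloud              # dict standing in for the cloud service
        self.cache = OrderedDict()      # label -> segment, in LRU order
        self.capacity = cache_capacity
        self.segment_size = segment_size
        self.maps = {}                  # name -> ordered list of labels
        self.uploads = 0
        self.cache_hits = 0

    def _cache_put(self, label, seg):
        self.cache[label] = seg
        self.cache.move_to_end(label)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict least recently used

    def put(self, name, data):
        labels = []
        for i in range(0, len(data), self.segment_size):
            seg = data[i:i + self.segment_size]
            label = hashlib.sha256(seg).hexdigest()
            if label not in self.cloud:
                self.cloud[label] = seg      # only new segments are uploaded
                self.uploads += 1
            self._cache_put(label, seg)
            labels.append(label)
        self.maps[name] = labels

    def get(self, name):
        parts = []
        for label in self.maps[name]:
            if label in self.cache:
                self.cache.move_to_end(label)
                seg = self.cache[label]
                self.cache_hits += 1
            else:
                seg = self.cloud[label]      # cache miss: go to the cloud
                self._cache_put(label, seg)
            parts.append(seg)
        return b"".join(parts)
```

Because the cache holds segments rather than whole files, a segment shared by several files occupies one cache slot, which is the sense in which cached data is kept in deduplicated form here.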
Abstract:
A data virtualization storage appliance performs data deduplication transformations on the data. The original or non-deduplicated file system is used as a shell to hold the directory/file hierarchy and file metadata. The data of the file system is stored by a separate data store in a transformed and deduplicated form. The deduplicated data store may be implemented as one or more hidden files. The shell file system preserves the hierarchy structure and potentially the file metadata of the original, non-deduplicated file system in its original format, allowing clients to access file metadata and hierarchy information easily. The data of a file may be removed from the shell file system and replaced with a data layout that specifies the arrangement of deduplicated data segments needed to reconstruct the file data. The data layout associated with a file may be stored in a separate data stream in the shell file system.
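The split between a shell file system and a separate deduplicated data store might be modeled as in the following sketch. The `ShellEntry` fields, the flat path-keyed dictionary used as the shell, and the SHA-256 labels are assumptions chosen to keep the example small; a real shell would be an actual file system with the layout held in a separate data stream.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ShellEntry:
    """Hypothetical shell record: metadata plus a data layout, no file data."""
    size: int
    mode: int
    data_layout: list = field(default_factory=list)  # ordered segment labels


class DedupAppliance:
    def __init__(self, segment_size=4):
        self.shell = {}          # path -> ShellEntry (hierarchy and metadata)
        self.segment_store = {}  # label -> bytes (separate deduplicated store)
        self.segment_size = segment_size

    def write_file(self, path, data, mode=0o644):
        layout = []
        for i in range(0, len(data), self.segment_size):
            seg = data[i:i + self.segment_size]
            label = hashlib.sha256(seg).hexdigest()
            self.segment_store.setdefault(label, seg)
            layout.append(label)
        # The shell keeps metadata and the layout; the data lives elsewhere.
        self.shell[path] = ShellEntry(size=len(data), mode=mode, data_layout=layout)

    def stat(self, path):
        # Metadata is answered from the shell alone, without touching segments.
        entry = self.shell[path]
        return entry.size, entry.mode

    def list_dir(self, prefix):
        return sorted(p for p in self.shell if p.startswith(prefix))

    def read_file(self, path):
        layout = self.shell[path].data_layout
        return b"".join(self.segment_store[label] for label in layout)
```

Metadata and hierarchy queries (`stat`, `list_dir`) never consult the segment store in this model, which mirrors the claim that clients can access that information easily from the shell.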