Abstract:
Systems and methods for restoring a file system to a point-in-time without relying on a backup. One system includes an electronic processor configured to automatically restore a file system to a specified point-in-time by (a) automatically restoring, from a recycle bin, items deleted from the file system after the point-in-time, (b) automatically deleting, from the file system, items created within the file system after the point-in-time, (c) automatically moving items moved within the file system after the point-in-time to a location within the file system associated with the point-in-time, (d) automatically deleting, from the file system, items copied within the file system after the point-in-time, (e) automatically renaming items renamed within the file system after the point-in-time to a name associated with the point-in-time, and (f) automatically restoring, from a version history, a version associated with the point-in-time for items with content modified after the point-in-time.
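The six restore steps (a) through (f) can be read as inverses of logged file system operations. The following is a minimal sketch under that reading, not the disclosed implementation: the `Event` record, the `plan_restore` function, the action names, and the choice to undo events in reverse chronological order are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical change-log record; not a structure named in the abstract."""
    time: int                       # when the operation happened
    kind: str                       # "delete", "create", "move", "copy", "rename", "modify"
    item: str                       # path of the item after the operation
    previous: Optional[str] = None  # earlier path or name, for move and rename


def plan_restore(events: List[Event], point_in_time: int) -> List[Tuple]:
    """Return the inverse actions for every event after point_in_time.

    Assumption: events are undone newest first, so that each item is
    walked back through its later states before its earlier ones.
    """
    later = sorted(
        (e for e in events if e.time > point_in_time),
        key=lambda e: e.time,
        reverse=True,
    )
    actions: List[Tuple] = []
    for e in later:
        if e.kind == "delete":
            # (a) deleted items come back from the recycle bin
            actions.append(("restore_from_recycle_bin", e.item))
        elif e.kind in ("create", "copy"):
            # (b) and (d) items created or copied later are deleted
            actions.append(("delete", e.item))
        elif e.kind == "move":
            # (c) moved items return to their earlier location
            actions.append(("move", e.item, e.previous))
        elif e.kind == "rename":
            # (e) renamed items get their earlier name back
            actions.append(("rename", e.item, e.previous))
        elif e.kind == "modify":
            # (f) modified items are reverted via version history
            actions.append(("restore_version", e.item, point_in_time))
        else:
            raise ValueError(f"unknown event kind: {e.kind}")
    return actions
```

The function only builds a plan; applying each action to a real recycle bin or version history is left out, since the abstract does not describe those interfaces.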
Abstract:
The subject disclosure is directed towards a data deduplication technology in which a hash index service's index is partitioned into subspace indexes, so that less than the entire index needs to be cached, which saves memory. A subspace index is accessed to determine whether a data chunk already exists or needs to be indexed and stored. The index may be divided into subspaces based on criteria associated with the data to index, such as file type, data type, time of last usage, and so on. Also described is subspace reconciliation, in which duplicate entries in subspaces are detected so as to remove entries and chunks from the deduplication system. Subspace reconciliation may be performed at off-peak times, when more system resources are available, and may be interrupted if resources are needed. Subspaces to reconcile may be selected based on similarity, including similarity of signatures that each compactly represent a subspace's hashes.
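The partitioned index and its reconciliation step can be sketched as follows. This is an illustrative sketch only: the class name `SubspaceIndexService`, the least-recently-used cache policy, the use of SHA-256, the "smallest hashes" signature, and the Jaccard similarity measure are assumptions, not details taken from the abstract.

```python
import hashlib
from collections import OrderedDict
from typing import Set


class SubspaceIndexService:
    """Hash index split into subspaces, with only a few cached in memory."""

    def __init__(self, max_cached: int = 2) -> None:
        self._store = {}             # stands in for persistent storage: key -> set of hashes
        self._cache = OrderedDict()  # subspaces currently held in memory (assumed LRU)
        self.max_cached = max_cached

    def _load(self, key: str) -> Set[str]:
        if key in self._cache:
            self._cache.move_to_end(key)
        else:
            self._cache[key] = self._store.setdefault(key, set())
            if len(self._cache) > self.max_cached:
                self._cache.popitem(last=False)  # evict least recently used
        return self._cache[key]

    def add_chunk(self, key: str, chunk: bytes) -> bool:
        """Return True if the chunk is new to this subspace and was indexed."""
        digest = hashlib.sha256(chunk).hexdigest()
        index = self._load(key)
        if digest in index:
            return False  # duplicate: the chunk need not be stored again
        index.add(digest)
        return True

    def cached_subspaces(self):
        return list(self._cache)

    def signature(self, key: str, size: int = 4) -> Set[str]:
        """Compact stand-in for a subspace: its `size` smallest hashes."""
        return set(sorted(self._store.get(key, set()))[:size])

    def reconcile(self, keep: str, other: str) -> Set[str]:
        """Remove from `other` every hash that also appears in `keep`."""
        other_index = self._store.setdefault(other, set())
        duplicates = self._store.get(keep, set()) & other_index
        other_index -= duplicates  # in place, so a cached reference stays valid
        return duplicates


def similarity(sig_a: Set[str], sig_b: Set[str]) -> float:
    """Jaccard similarity of two signatures (0.0 when both are empty)."""
    union = sig_a | sig_b
    return len(sig_a & sig_b) / len(union) if union else 0.0
```

In this sketch the subspace key could be a file type such as "docs" or "images". A scheduler might compare signatures first and call `reconcile` only on pairs whose similarity is high, which keeps off-peak work focused on subspaces likely to share chunks.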
Abstract:
A variety of approaches to providing partial storage of large files in distinct storage systems are described. A storage service initiates operations to provide storage of large files by determining a rapid access portion and a slow access portion of a file. The rapid access portion of the file is stored in a rapid access storage system and the slow access portion of the file (or the entirety of the file) is stored in a slow access storage system. In response to an access request to the file, the rapid access portion of the file is provided from the rapid access storage system. While the rapid access portion is being provided, the slow access portion of the file is retrieved from the slow access storage system so that it can be provided next.
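The read path described above can be sketched with two in-memory dictionaries standing in for the two storage systems. The class `TieredStore`, the rule that the rapid access portion is simply the first `rapid_bytes` bytes, and the use of a background thread for the slow fetch are assumptions for illustration; this sketch also stores only the remainder in the slow tier, although the abstract allows storing the whole file there.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterator


class TieredStore:
    """Splits each file into a rapid access portion and a slow access portion."""

    def __init__(self, rapid_bytes: int) -> None:
        self.rapid_bytes = rapid_bytes
        self.rapid: Dict[str, bytes] = {}  # stands in for the rapid access storage system
        self.slow: Dict[str, bytes] = {}   # stands in for the slow access storage system
        self._pool = ThreadPoolExecutor(max_workers=1)

    def put(self, name: str, data: bytes) -> None:
        # Assumption: the leading bytes form the rapid access portion.
        self.rapid[name] = data[: self.rapid_bytes]
        self.slow[name] = data[self.rapid_bytes:]

    def _fetch_slow(self, name: str) -> bytes:
        return self.slow[name]

    def read(self, name: str) -> Iterator[bytes]:
        # Start retrieving the slow portion in the background...
        pending = self._pool.submit(self._fetch_slow, name)
        # ...while the rapid portion is handed to the caller immediately.
        yield self.rapid[name]
        # Then provide the slow portion once its retrieval has finished.
        yield pending.result()

    def close(self) -> None:
        self._pool.shutdown()
```

A caller iterates over `read(name)` and can begin consuming the first part, for example the opening of a media file, before the remainder has arrived from the slower tier.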