摘要:
Described is a technology by which access to a resource is determined by evaluating a resource label of the resource against a user claim of an access request, according to policy decoupled from the resource. The resource may be a file, and the resource label may be obtained by classifying the file into classification properties, such that a change to the file may change its resource label, thereby changing which users have access to the file. The resource label-based access evaluation may be logically combined with a conventional ACL-based access evaluation to determine whether to grant or deny access to the resource.
摘要:
The subject disclosure is directed towards partitioning a file into chunks that satisfy a chunk size restriction, such as maximum and minimum chunk sizes, using a sliding window. For file positions within the chunk size restriction, a signature representative of a window fingerprint is compared with a target pattern, with a chunk boundary candidate identified if matched. Other signatures and patterns are then checked to determine a highest ranking signature (corresponding to a lowest numbered Rule) to associate with that chunk boundary candidate, or set an actual boundary if the highest ranked signature is matched. If the maximum chunk size is reached without matching the highest ranked signature, the chunking mechanism regresses to set the boundary based on the candidate with the next highest ranked signature (if no candidates, the boundary is set at the maximum). Also described is setting chunk boundaries based upon pattern detection (e.g., runs of zeros).
摘要:
Described is a technology by which access to a resource is determined by evaluating a resource label of the resource against a user claim of an access request, according to policy decoupled from the resource. The resource may be a file, and the resource label may be obtained by classifying the file into classification properties, such that a change to the file may change its resource label, thereby changing which users have access to the file. The resource label-based access evaluation may be logically combined with a conventional ACL-based access evaluation to determine whether to grant or deny access to the resource.
摘要:
Described is a method and system by which storage reports are generated from a volume snapshot set, rather than from a live volume. A volume snapshot set includes a representation or copy of a volume at a single point in time. By scanning the snapshot, a consistent file system image is obtained. Scanning may take place by enumerating a volume's directories of files, or, when available, by accessing a file system metadata of file information (e.g., a master file table) separately maintained on the volume. With some (e.g., hardware-based) snapshot technologies, the snapshot can be transported to another computing system for scanning by that other computing system, thereby avoiding burdening a live system's resources when scanning. Accurate and consistent storage reports are thus obtained at a single point in time, independent of the number of volumes being scanned.
摘要:
Described is a storage reports duplicate file detector that operates by receiving file records during a first scan of file system metadata. The detector computes a hash based on attributes in the record, and maintains the hash value in association with information that indicates whether a hash value corresponds to more than one file. In one implementation, the information corresponds to the amount of space wasted by duplication. The information is used to determine which hash values correspond to groups of potentially duplicate files, and eliminate non-duplicates. A second scan locates file information for each of the potentially duplicate files, and the file information is then used to determine which groups of potentially duplicate files are actually duplicate files.
摘要:
Techniques for garbage collecting unused data chunks in storage are provided. According to one implementation, data chunks stored in a chunk container that are unused are identified based an analysis of one or more stream map chunks indicated as deleted. The identified data chunks are indicated as deleted. The storage space in the chunk container filled by the data chunks indicated as deleted may then be reclaimed. Techniques for selectively backing up data chunks are also provided. According to one implementation, a data chunk is received for storing in a chunk container. A backup copy of the received data chunk is stored in a backup container if the received data chunk is in a predetermined top percentage of most referenced data chunks in the chunk container and has a number of references greater than a predetermined reference threshold.
摘要:
A method and system for verifying the integrity of a storage volume. When volume verification is desired, a shadow copy of the volume is created. A verification tool operates on the shadow copy and provides a report that indicates if any errors are found in the shadow copy. If errors are found on the shadow copy, this indicates that the same errors likely still exist on the live volume. In the event of errors, a system administrator or the like may take the live volume off-line and fix the errors found.
摘要:
Hash values corresponding to a file are processed in windows to determine a minimum hash value for each window. Each window may begin at a minimum hash value determined for a previous window and end after a fixed number of hash values. If a hash value is less than a threshold hash value, it is added to a buffer that is used to store the hash values in sorted order for a current window. If a hash value is greater than the threshold, it is added to another buffer whose hash values are not stored in sorted order. At the end of the current window, the minimum hash value in the first buffer is selected as the landmark for the window. If the first buffer is empty, then the hash values in the other buffer are sorted and the minimum hash value is selected as the landmark for the window.
摘要:
Hash values corresponding to a file are processed in windows to determine a minimum hash value for each window. Each window may begin at a minimum hash value determined for a previous window and end after a fixed number of hash values. If a hash value is less than a threshold hash value, it is added to a buffer that is used to store the hash values in sorted order for a current window. If a hash value is greater than the threshold, it is added to another buffer whose hash values are not stored in sorted order. At the end of the current window, the minimum hash value in the first buffer is selected as the landmark for the window. If the first buffer is empty, then the hash values in the other buffer are sorted and the minimum hash value is selected as the landmark for the window.
摘要:
Aspects of the subject matter described herein relate to sharing volume data via shadow copies. In aspects, an active computer creates a shadow copy of a volume. The shadow copy is exposed to one or more passive computers that may read but not write to the volume. A passive computer may obtain data from the shadow copy by determining whether the data has been written to a differential area and, if so, reading it from the differential area. If the data has not been written to the differential area, the passive computer may obtain it by first reading it from the volume, then re-determining whether it has been written to the differential area, and if so, reading the data from the differential area. Otherwise, the data read from the volume corresponds to the data needed for the shadow copy.