摘要:
A system and method for performing garbage collection. A system includes a storage medium, a first table including entries which map a virtual address to locations in the storage medium, and a second table with entries which include a reverse mapping of a physical address in a data storage medium to one or more virtual addresses. A storage controller is configured to perform garbage collection. During garbage collection, the controller is configured to identify one or more entries in the second table which correspond to a segment to be garbage collected. In response to determining the first table includes a valid mapping for a virtual address included in an entry of the one of the one or more entries, the controller is configured to copy data from a first location identified in the entry to a second location in the data storage medium, and reclaim the first storage location.
摘要:
A system and method for efficiently removing duplicate data blocks at a fine-granularity from a storage array. A data storage subsystem supports multiple deduplication tables. Table entries in one deduplication table have the highest associated probability of being deduplicated. Table entries may move from one deduplication table to another as the probabilities change. Additionally, a table entry may be evicted from all deduplication tables if a corresponding estimated probability falls below a given threshold. The probabilities are based on attributes associated with a data component and attributes associated with a virtual address corresponding to a received storage access request. A strategy for searches of the multiple deduplication tables may also be determined by the attributes associated with a given storage access request.
摘要:
A system and method for efficiently removing duplicate data blocks at a fine-granularity from a storage array. A data storage subsystem supports multiple deduplication tables. Table entries in one deduplication table have the highest associated probability of being deduplicated. Table entries may move from one deduplication table to another as the probabilities change. Additionally, a table entry may be evicted from all deduplication tables if a corresponding estimated probability falls below a given threshold. The probabilities are based on attributes associated with a data component and attributes associated with a virtual address corresponding to a received storage access request. A strategy for searches of the multiple deduplication tables may also be determined by the attributes associated with a given storage access request.
摘要:
A system and method for managing multiple fingerprint tables in a deduplicating storage system. A computer system includes a storage medium, a first fingerprint table comprising a first plurality of entries, and a second fingerprint table comprising a second plurality of entries. Each of the first plurality of entries and the second plurality of entries are configured to store fingerprint related data corresponding to data stored in the storage medium. A storage controller is configured to select the first fingerprint table for storage of entries corresponding to data stored in the data storage medium that has been deemed more likely to be successfully deduplicated than other data stored in the data storage medium; and select the second fingerprint table for storage of entries corresponding to data stored in the data storage medium that has been deemed less likely to be successfully deduplicated than other data stored in the storage medium.
摘要:
A system and method for managing multiple fingerprint tables in a deduplicating storage system. A computer system includes a storage medium, a first fingerprint table comprising a first plurality of entries, and a second fingerprint table comprising a second plurality of entries. Each of the first plurality of entries and the second plurality of entries are configured to store fingerprint related data corresponding to data stored in the storage medium. A storage controller is configured to select the first fingerprint table for storage of entries corresponding to data stored in the data storage medium that has been deemed more likely to be successfully deduplicated than other data stored in the data storage medium; and select the second fingerprint table for storage of entries corresponding to data stored in the data storage medium that has been deemed less likely to be successfully deduplicated than other data stored in the storage medium.
摘要:
A system and method for maintaining a mapping table in a data storage subsystem. A data storage subsystem supports multiple mapping tables. Records within a mapping table are arranged in multiple levels which may be logically ordered by time. Each level stores pairs of a key value and a pointer value. New records are inserted in a created new (youngest) level. All levels other than the youngest may be read only. In response to detecting a flattening condition, a data storage controller is configured to identify a group of two or more adjacent levels of the plurality of levels for flattening which are logically adjacent in time. A new level is created and one or more records stored within the group are stored in the new level, in response to detecting each of the one or more records stores a unique key among keys stored within the group.
摘要:
A system, method, and computer-readable storage medium for mapping block numbers within a region to physical locations within a storage system. Block numbers are mapped within a region according to a fractal-based space-filling curve. If the region is not a 2k by 2k square, then the region is broken up into one or more 2k by 2k squares. Any remaining sub-region is centered within a 2k by 2k square, the 2k by 2k square is numbered using a fractal-based space-filling curve, and then the sub-region is renumbered by assigning numbers based on the order of the original block numbers of the sub-region.
摘要:
A system and method for maintaining a mapping table in a data storage subsystem. A data storage subsystem supports multiple mapping tables. Records within a mapping table are arranged in multiple levels which may be logically ordered by time. Each level stores pairs of a key value and a pointer value. New records are inserted in a created new (youngest) level. All levels other than the youngest may be read only. In response to detecting a flattening condition, a data storage controller is configured to identify a group of two or more adjacent levels of the plurality of levels for flattening which are logically adjacent in time. A new level is created and one or more records stored within the group are stored in the new level, in response to detecting each of the one or more records stores a unique key among keys stored within the group.