Abstract:
A rate matching technique may be configured to adjust a rate of cleaning of one or more selected segments of a storage array to accommodate a variable rate of incoming workload processed by a storage input/output (I/O) stack executing on one or more nodes of a cluster. An extent store layer of the storage I/O stack may clean a segment using a segment cleaning process. The rate matching technique may be implemented as a feedback control mechanism configured to adjust the segment cleaning process based on the incoming workload. Components of the feedback control mechanism may include one or more weight schedulers and various accounting data structures, e.g., counters, configured to track the progress of segment cleaning and free space usage. The counters may also be used to balance the rate of segment cleaning against the rate of the incoming I/O workload. When the incoming I/O rate changes, the rate of segment cleaning may be adjusted accordingly to ensure that the two rates are substantially balanced.
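The feedback loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the mechanism of the abstract itself: the `RateCounters` fields, the additive update rule, the `gain` parameter, and the weight limits are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class RateCounters:
    """Hypothetical accounting counters sampled over one control interval."""
    bytes_written: int  # free space consumed by the incoming I/O workload
    bytes_cleaned: int  # free space reclaimed by segment cleaning


def adjust_cleaner_weight(weight: float, counters: RateCounters,
                          gain: float = 0.1,
                          min_weight: float = 0.05,
                          max_weight: float = 0.95) -> float:
    """Return a new scheduler weight for the segment cleaner.

    The weight is assumed to be the cleaner's share of back-end I/O.
    It rises when the workload consumes space faster than cleaning
    frees it, and falls in the opposite case.
    """
    total = counters.bytes_written + counters.bytes_cleaned
    if total == 0:
        return weight  # idle interval: nothing to balance

    # Normalized imbalance in [-1, 1]; zero means the rates match.
    error = (counters.bytes_written - counters.bytes_cleaned) / total
    new_weight = weight + gain * error

    # Clamp so that neither the cleaner nor the workload is starved.
    return max(min_weight, min(max_weight, new_weight))
```

Calling `adjust_cleaner_weight` once per interval with fresh counters moves the weight toward the point where cleaning reclaims space about as fast as the workload consumes it.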
Abstract:
A technique reduces an amount of metadata stored in a memory of a node in a cluster. An extent store layer of a storage input/output (I/O) stack executing on the node stores key-value pairs in a plurality of data structures, e.g., cuckoo hash tables, resident in the memory. The cuckoo hash table embodies metadata that describes an extent and, as such, may be organized to associate a key with a value that identifies the location of the extent on disk. The value may be embodied as a locator that includes a reference count used to support deduplication functionality of the extent store layer with respect to the extent. The reference count is divided into two portions: a delta count portion stored in memory for each slot of the hash table and an overflow count portion stored on disk in a header of each extent. One bit of the delta count portion is reserved as an overflow bit that indicates whether the in-memory reference count has overflowed. Another bit of the delta count portion is reserved as a sign bit that indicates whether the value of the remaining delta count portion, which stores the “delta” of the reference count, is positive or negative. Overflow updates to the overflow count portion on disk are postponed until all of the bits of the delta count portion are consumed as negative/positive transitions.
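The split reference count can be illustrated with a small sketch. The field width (`MAG_BITS`), the bit positions, the function names, and the rule of folding the whole delta into the on-disk count when the magnitude bits are exhausted are assumptions made for this example; the abstract does not specify them.

```python
MAG_BITS = 6                         # assumed width of the delta magnitude
MAG_MAX = (1 << MAG_BITS) - 1        # largest magnitude held in memory (63)
SIGN_BIT = 1 << MAG_BITS             # set when the delta is negative
OVERFLOW_BIT = 1 << (MAG_BITS + 1)   # set once a delta has spilled to disk


def pack(delta: int, overflowed: bool) -> int:
    """Encode a signed delta and the overflow flag into one slot field."""
    assert -MAG_MAX <= delta <= MAG_MAX
    field = abs(delta)
    if delta < 0:
        field |= SIGN_BIT
    if overflowed:
        field |= OVERFLOW_BIT
    return field


def unpack(field: int) -> tuple[int, bool]:
    """Decode a slot field into (signed delta, overflow flag)."""
    magnitude = field & MAG_MAX
    delta = -magnitude if field & SIGN_BIT else magnitude
    return delta, bool(field & OVERFLOW_BIT)


def apply_change(field: int, change: int, disk_count: int) -> tuple[int, int]:
    """Apply a +1 or -1 reference change; return (new field, new disk count).

    The on-disk overflow count is touched only when the in-memory
    magnitude bits can no longer hold the accumulated delta.
    """
    delta, overflowed = unpack(field)
    delta += change
    if abs(delta) > MAG_MAX:
        disk_count += delta          # postponed update to the extent header
        delta, overflowed = 0, True
    return pack(delta, overflowed), disk_count


def total_refs(field: int, disk_count: int) -> int:
    """Full reference count: on-disk portion plus the in-memory delta."""
    delta, _ = unpack(field)
    return disk_count + delta
```

In this sketch, increments and decrements that cancel each other out never reach the disk, which is the effect the postponed overflow update is meant to achieve.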
Abstract:
An optimized segment cleaning technique is configured to efficiently clean one or more selected portions or segments of a storage array coupled to one or more nodes of a cluster. A bottom-up approach of the segment cleaning technique is configured to read all blocks of a segment to be cleaned (i.e., an “old” segment) to locate extents stored on the SSDs of the old segment and examine extent metadata to determine whether the extents are valid and, if so, relocate the valid extents to a segment being written (i.e., a “new” segment). A top-down approach of the segment cleaning technique obviates reading of the blocks of the old segment to locate the extents and, instead, examines the extent metadata to determine the valid extents of the old segment. A hybrid approach may extend the top-down approach to include only full stripe read operations needed for relocation and reconstruction of blocks as well as retrieval of valid extents from the stripes, while also avoiding any unnecessary read operations of the bottom-up approach.
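The difference in read cost between the bottom-up and top-down approaches can be sketched with a toy model. The data layout here is an assumption made for illustration: a segment is a list of blocks, each block a list of extent keys, and the extent metadata is a dictionary mapping each live key to a hypothetical `(segment_id, block_no)` location. The hybrid approach and stripe reconstruction are not modeled.

```python
def clean_bottom_up(segment_id, segment_blocks, extent_map):
    """Read every block of the old segment, then check each extent found
    against the metadata. Returns (valid extent keys, blocks read)."""
    reads = 0
    valid = []
    for block_no, keys in enumerate(segment_blocks):
        reads += 1  # every block is read, even if it holds no valid extent
        for key in keys:
            if extent_map.get(key) == (segment_id, block_no):
                valid.append(key)
    return valid, reads


def clean_top_down(segment_id, extent_map):
    """Consult the metadata first, then read only the blocks that hold
    valid extents. Returns (valid extent keys, blocks read)."""
    valid = [key for key, (seg, _) in extent_map.items() if seg == segment_id]
    blocks_to_read = {extent_map[key][1] for key in valid}
    return valid, len(blocks_to_read)


# Old segment 7 has four blocks; only extent "a" is still valid there.
old_segment = [["a", "b"], ["c"], ["d"], []]
extent_map = {"a": (7, 0), "c": (9, 2), "e": (9, 0)}

bottom_up_valid, bottom_up_reads = clean_bottom_up(7, old_segment, extent_map)
top_down_valid, top_down_reads = clean_top_down(7, extent_map)
```

Both functions identify the same valid extents for relocation, but in this example the bottom-up pass reads all four blocks while the top-down pass reads only the one block that still holds a valid extent.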