Abstract:
Collaborative management of shared resources is implemented by a storage server that receives, from a first resource manager, notification of a violation for a service provided by the storage server or by a device coupled to the storage server. The storage server further receives, from each of a plurality of resource managers, an estimated cost of taking a corrective action to mitigate the violation, and selects a corrective action proposed by one of the plurality of resource managers based upon the estimated cost. The storage server then directs the resource manager that proposed the selected corrective action to perform it.
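The selection step can be illustrated with a minimal sketch. It assumes that "based upon the estimated cost" means choosing the lowest cost, which the abstract does not state; the `Proposal` type, the manager names, and the action labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    manager: str           # hypothetical resource-manager name
    action: str            # hypothetical corrective-action label
    estimated_cost: float  # cost estimate reported by that manager


def select_corrective_action(proposals):
    """Return the proposal with the lowest estimated cost.

    Lowest-cost selection is an assumption; the abstract only says the
    choice is based upon the estimated cost.
    """
    if not proposals:
        raise ValueError("no proposals received")
    return min(proposals, key=lambda p: p.estimated_cost)


# Each resource manager answers the violation notice with one proposal.
proposals = [
    Proposal("cache_manager", "grow_cache", 4.0),
    Proposal("migration_manager", "move_volume", 9.5),
    Proposal("qos_manager", "throttle_neighbor", 2.5),
]
chosen = select_corrective_action(proposals)
print(f"direct {chosen.manager} to perform {chosen.action}")
```

A real selection policy might also weigh expected benefit or risk; the sketch keeps cost as the only criterion because it is the only one the abstract names.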
Abstract:
Embodiments of the systems and techniques described here can leverage several insights into the nature of workload access patterns and working-set behavior to reduce memory overheads. As a result, various embodiments make it feasible to maintain running estimates of a workload's cacheability in current storage systems with limited resources. For example, some embodiments provide for a method comprising estimating cacheability of a workload based on a first working-set size estimate generated from the workload over a first monitoring interval. Then, based on the cacheability of the workload, a workload cache size can be determined. A cache can then be dynamically allocated, within a storage system for example, in accordance with the workload cache size (e.g., the cache allocation for the workload is changed, possibly frequently, whenever the current allocation and the desired workload cache size differ).
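The flow from working-set estimate to cache allocation can be sketched as follows. The abstract gives no formulas, so everything here is an assumption made for illustration: working-set size is taken as the number of distinct blocks seen in one monitoring interval, cacheability as the fraction of accesses that are re-references, and the `min_reuse` threshold and function names are hypothetical.

```python
def working_set_size(accesses):
    """Distinct blocks touched in one monitoring interval (assumed definition)."""
    return len(set(accesses))


def estimate_cacheability(accesses):
    """Fraction of accesses that re-reference an already seen block.

    This reuse ratio is an illustrative stand-in, not the method of the
    described embodiments.
    """
    if not accesses:
        return 0.0
    return 1.0 - working_set_size(accesses) / len(accesses)


def desired_cache_size(accesses, min_reuse=0.5, floor_blocks=0):
    """Size the cache to the working set only if the workload looks cacheable."""
    if estimate_cacheability(accesses) >= min_reuse:
        return working_set_size(accesses)
    return floor_blocks


def reallocate(current_blocks, accesses):
    """Return (new_allocation, changed); change only when the sizes differ."""
    desired = desired_cache_size(accesses)
    return desired, desired != current_blocks


hot_interval = [1, 2, 1, 3, 2, 1, 1, 2]  # heavy reuse of three blocks
scan_interval = list(range(8))           # sequential scan, no reuse

print(reallocate(0, hot_interval))
print(reallocate(3, scan_interval))
```

Recomputing the estimate once per interval is what lets the allocation follow the workload as its access pattern shifts.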
Abstract:
A dynamic caching technique adaptively controls copies of data blocks stored within caches (“cached copies”) of a caching layer distributed among servers of a distributed data processing system. A cache coordinator of the distributed system implements the dynamic caching technique to increase the cached copies of the data blocks to improve processing performance of the servers. Alternatively, the technique may decrease the cached copies to reduce the storage capacity consumed at the servers. The technique may increase the cached copies when it detects local and/or remote cache bottleneck conditions at the servers, a data popularity condition at the servers, or a shared storage bottleneck condition at the storage system. Otherwise, the technique may decrease the cached copies at the servers.
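The increase-or-decrease rule can be written as a short decision function. This is a sketch under stated assumptions: the `Conditions` fields mirror the four conditions named in the abstract, while the step of one copy per decision and the `min_copies`/`max_copies` bounds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Conditions:
    # Flags the cache coordinator is assumed to have already detected.
    local_cache_bottleneck: bool = False
    remote_cache_bottleneck: bool = False
    data_popular: bool = False
    shared_storage_bottleneck: bool = False


def adjust_cached_copies(current, cond, min_copies=1, max_copies=4):
    """Return the new number of cached copies for a data block.

    Any detected condition increases the count; otherwise it decreases.
    The step size and the bounds are illustrative assumptions.
    """
    should_increase = (
        cond.local_cache_bottleneck
        or cond.remote_cache_bottleneck
        or cond.data_popular
        or cond.shared_storage_bottleneck
    )
    if should_increase:
        return min(current + 1, max_copies)
    return max(current - 1, min_copies)


print(adjust_cached_copies(2, Conditions(data_popular=True)))
print(adjust_cached_copies(2, Conditions()))
```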
Abstract:
Described herein is a system and method for dynamically managing service-level objectives (SLOs) for workloads of a cluster storage system. Proposed states/solutions of the cluster may be produced and evaluated to select one that achieves the SLOs for each workload. A planner engine may produce a state tree comprising nodes, each node representing a proposed state/solution. New nodes may be added to the state tree based on new solution types that are permitted, or nodes may be removed based on a received time constraint for executing a proposed solution or a client certification of a solution. The planner engine may call an evaluation engine to evaluate proposed states, the evaluation engine using an evaluation function that considers SLO, cost, and optimization goal characteristics to produce a single evaluation value for each proposed state. The planner engine may call a modeler engine that is trained using machine learning techniques.
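The state tree and the single-value evaluation can be sketched as below. The abstract does not give the evaluation function, so the weighted sum, the weights, the node fields, and the example solutions are all assumptions made for illustration; the modeler engine is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class StateNode:
    name: str          # hypothetical label for the proposed state/solution
    slo_score: float   # 0..1, how well the state meets the SLOs (assumed scale)
    cost: float        # 0..1, normalized cost of reaching the state
    goal_score: float  # 0..1, fit with the optimization goal
    exec_time: float   # estimated seconds to execute the solution
    children: list = field(default_factory=list)


def evaluate(node, w_slo=0.6, w_cost=0.3, w_goal=0.1):
    """Collapse SLO, cost, and goal characteristics into one value.

    A weighted sum is an assumption; higher values are better.
    """
    return w_slo * node.slo_score - w_cost * node.cost + w_goal * node.goal_score


def prune(node, time_limit):
    """Remove proposed states that cannot execute within the time constraint."""
    node.children = [c for c in node.children if c.exec_time <= time_limit]
    for child in node.children:
        prune(child, time_limit)


def best_state(root):
    """Walk the tree and return the node with the highest evaluation value."""
    best, stack = root, [root]
    while stack:
        node = stack.pop()
        if evaluate(node) > evaluate(best):
            best = node
        stack.extend(node.children)
    return best


def build_tree():
    root = StateNode("current", 0.4, 0.0, 0.5, 0.0)
    root.children = [
        StateNode("migrate_volume", 1.0, 0.8, 0.6, 120.0),
        StateNode("add_cache", 0.9, 0.2, 0.7, 10.0),
        StateNode("throttle", 0.7, 0.1, 0.4, 1.0),
    ]
    return root


tree = build_tree()
print(best_state(tree).name)   # best state with no time constraint
prune(tree, time_limit=5.0)    # a 5-second constraint removes slower solutions
print(best_state(tree).name)
```

Pruning before evaluation keeps the planner from scoring states that the time constraint already rules out.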
Abstract:
It is detected that a metric associated with a first workload has breached a first threshold. It is determined that the first workload and a second workload access the same storage resources, wherein the storage resources are associated with a storage server. It is determined that the metric is impacted by the first workload and the second workload accessing the same storage resources. A candidate solution is identified. An estimated impact of a residual workload is determined based, at least in part, on the candidate solution. A level of caching of at least one of the first workload or the second workload is adjusted based, at least in part, on the estimated impact of the residual workload.
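One way to picture the sequence is the sketch below. It rests on several assumptions the abstract does not make: the metric is latency, the candidate solution is a cache hit ratio for the second workload, the residual workload is the I/O that still misses the cache, and latency grows linearly with total I/O on the shared resources. All names and constants are hypothetical.

```python
def shares_storage(first_resources, second_resources):
    """True if the two workloads access at least one common storage resource."""
    return bool(set(first_resources) & set(second_resources))


def residual_iops(iops, hit_ratio):
    """I/O that still reaches shared storage after caching (assumed model)."""
    return iops * (1.0 - hit_ratio)


def estimated_latency_ms(total_iops, base_ms=1.0, ms_per_kiops=2.0):
    """Toy linear latency model; the constants are illustrative only."""
    return base_ms + ms_per_kiops * total_iops / 1000.0


def choose_cache_level(first_iops, second_iops, threshold_ms,
                       levels=(0.0, 0.25, 0.5, 0.75)):
    """Return the lowest hit ratio for the second workload whose residual
    load brings the first workload's estimated latency under the threshold,
    or None if no candidate level is sufficient."""
    for hit_ratio in levels:
        total = first_iops + residual_iops(second_iops, hit_ratio)
        if estimated_latency_ms(total) <= threshold_ms:
            return hit_ratio
    return None


first = {"iops": 2000, "resources": ["aggr1"]}
second = {"iops": 4000, "resources": ["aggr1", "aggr2"]}
threshold_ms = 10.0

observed = estimated_latency_ms(first["iops"] + second["iops"])
if observed > threshold_ms and shares_storage(first["resources"], second["resources"]):
    level = choose_cache_level(first["iops"], second["iops"], threshold_ms)
    print(f"adjust caching of second workload to hit ratio {level}")
```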