Abstract:
Embodiments of the present invention provide approaches (e.g., online methods) to analyze end-to-end performance issues in a multi-tier enterprise storage system (ESS), such as a storage cloud, where data may be distributed across multiple storage components. Specifically, performance and configuration data from different storage components (e.g., nodes) is collected and analyzed to identify nodes that are becoming (or may become) performance bottlenecks. In a typical embodiment, a set of components distributed among a set of tiers of an ESS is identified. For each component, a total capacity and a current load are determined. Based on these values, a utilization of each component is determined. Comparison of the utilization with a predetermined threshold and/or analysis of historical data allows one or more components causing a bottleneck to be identified.
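The utilization check described above can be sketched as follows. This is only a minimal illustration, not the patented method: the `Component` fields, the use of load divided by capacity as the utilization measure, and the 0.8 default threshold are all assumptions made for the example, and the historical-data analysis is omitted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Component:
    """Hypothetical record for one storage component (node)."""
    name: str
    tier: str
    capacity: float  # total capacity, e.g. maximum sustainable IOPS (assumed unit)
    load: float      # current load, in the same unit as capacity


def utilization(component: Component) -> float:
    """Utilization as the ratio of current load to total capacity (assumed definition)."""
    if component.capacity <= 0:
        raise ValueError(f"{component.name}: capacity must be positive")
    return component.load / component.capacity


def find_bottlenecks(components: List[Component], threshold: float = 0.8) -> List[str]:
    """Return names of components whose utilization meets or exceeds the threshold."""
    return [c.name for c in components if utilization(c) >= threshold]


components = [
    Component("io-server-1", "server", capacity=1000.0, load=950.0),
    Component("switch-1", "network", capacity=4000.0, load=1200.0),
    Component("raid-ctrl-1", "storage", capacity=2000.0, load=1600.0),
]
bottlenecks = find_bottlenecks(components)
```

With these sample values, `io-server-1` (0.95) and `raid-ctrl-1` (0.80) are flagged, while `switch-1` (0.30) is not.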
Abstract:
Embodiments of the present invention provide performance isolation for storage clouds. Under one embodiment, workloads across a storage cloud architecture are grouped into clusters based on administrator or system input. A performance isolation domain is then created for each of the clusters, with each of the performance isolation domains comprising a set of data stores associated with a set of storage subsystems and a set of data paths that connect the set of data stores to a set of clients. Thereafter, performance isolation is provided among a set of layers of the performance isolation domains. Such performance isolation is provided by (among other things): pooling data stores from separate performance isolation domains into separate pools; assigning the pools to device adapters, RAID controller, and the set of storage subsystems; preventing workloads on the device adapters from exceeding capacities of the device adapters; mapping the set of data stores to a set of Input/Output (I/O) servers based on an I/O capacity and I/O load of the set of I/O servers; and/or pairing ports of the set of I/O servers with ports of the set of storage subsystems, the pairing being based upon availability, connectivity, I/O load, and I/O capacity.
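One of the steps above, mapping data stores to I/O servers based on I/O capacity and I/O load, might be sketched as a greedy placement. The abstract does not specify the mapping algorithm, so the largest-headroom heuristic, the function name `map_data_stores`, and the dictionary shapes below are assumptions chosen only for illustration.

```python
from typing import Dict, Tuple


def map_data_stores(
    stores: Dict[str, float],
    servers: Dict[str, Tuple[float, float]],
) -> Dict[str, str]:
    """Assign each data store to the I/O server with the most spare I/O capacity.

    stores:  data store name -> expected I/O demand (hypothetical unit)
    servers: I/O server name -> (I/O capacity, current I/O load)
    """
    # Spare capacity per server; updated as data stores are placed.
    headroom = {name: cap - load for name, (cap, load) in servers.items()}
    mapping: Dict[str, str] = {}
    # Place the most demanding data stores first (assumed heuristic).
    for store, demand in sorted(stores.items(), key=lambda kv: kv[1], reverse=True):
        best = max(headroom, key=headroom.get)
        if headroom[best] < demand:
            raise ValueError(f"no I/O server has capacity for data store {store}")
        mapping[store] = best
        headroom[best] -= demand
    return mapping


servers = {"io1": (100.0, 20.0), "io2": (100.0, 60.0)}
stores = {"a": 50.0, "b": 30.0}
mapping = map_data_stores(stores, servers)
```

Here data store `a` goes to `io1` (80 units spare), after which `io2` has the larger headroom and receives `b`.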
Abstract:
Embodiments for efficiently computing complex statistics from historical time series data are provided. A hierarchical summarization method includes receiving at least one stream of data and creating data blocks from the at least one stream of data. In another embodiment, a method for computing statistics for historical data includes accessing at least one online stream of historical data, the online stream of historical data including metadata, and creating data blocks from the at least one online stream of historical data. Each data block includes a pair of timestamps indicating a sampling start time and a sampling end time, a number of data samples spanned by the data block, a SUM(X) statistic, a SUM(XX) statistic, and a SUM(XY) statistic computed for the data samples spanned by the data block. Other methods are also presented, such as methods for efficiently and accurately calculating statistical queries regarding historical data for arbitrary time ranges, among others.
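The data block layout described above lends itself to a small sketch. Because the count and the three sums are additive, adjacent blocks can be merged without revisiting raw samples, which is what makes a hierarchy of summaries possible. The class and function names are hypothetical, and treating Y as a second series paired with X is an assumption; the abstract does not say what Y denotes.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DataBlock:
    start: float    # sampling start time
    end: float      # sampling end time
    n: int          # number of data samples spanned by the block
    sum_x: float    # SUM(X)
    sum_xx: float   # SUM(XX)
    sum_xy: float   # SUM(XY); Y is assumed to be a paired second series


def make_block(samples: List[Tuple[float, float, float]]) -> DataBlock:
    """Summarize a non-empty list of (timestamp, x, y) samples into one block."""
    if not samples:
        raise ValueError("a data block needs at least one sample")
    times = [t for t, _, _ in samples]
    return DataBlock(
        start=min(times),
        end=max(times),
        n=len(samples),
        sum_x=sum(x for _, x, _ in samples),
        sum_xx=sum(x * x for _, x, _ in samples),
        sum_xy=sum(x * y for _, x, y in samples),
    )


def merge(a: DataBlock, b: DataBlock) -> DataBlock:
    """Combine two blocks into a coarser one; every statistic is additive."""
    return DataBlock(
        start=min(a.start, b.start),
        end=max(a.end, b.end),
        n=a.n + b.n,
        sum_x=a.sum_x + b.sum_x,
        sum_xx=a.sum_xx + b.sum_xx,
        sum_xy=a.sum_xy + b.sum_xy,
    )


def mean_x(block: DataBlock) -> float:
    return block.sum_x / block.n


def variance_x(block: DataBlock) -> float:
    """Population variance of X, recovered from SUM(X), SUM(XX), and the count."""
    m = mean_x(block)
    return block.sum_xx / block.n - m * m


b1 = make_block([(0.0, 1.0, 2.0), (1.0, 2.0, 4.0)])
b2 = make_block([(2.0, 3.0, 6.0), (3.0, 4.0, 8.0)])
merged = merge(b1, b2)
```

A query over an arbitrary time range could then be answered by merging the blocks that cover the range and reading the statistics off the merged block, rather than rescanning the underlying samples.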
Abstract:
Embodiments discussed in this disclosure provide an integrated provisioning framework that automates the end-to-end process of provisioning storage resources for an enterprise storage cloud environment. Such embodiments configure and orchestrate the deployment of a user's workload and, at the same time, provide optimization across a multitude of storage cloud resources. Along these lines, input is received in the form of workload requirements and configuration information for available system resources. Based on the input, a set (at least one) of storage cloud configuration plans that satisfy the workload requirements is developed. A set of scripts is then generated that orchestrates the deployment and configuration of different software and hardware components based on the plans.
Abstract:
Embodiments of the present invention provide an approach to provision storage resources (e.g., across an enterprise storage system (ESS) such as a general parallel file system (GPFS) or the like) for different workloads in an energy efficient manner. The system evaluates different energy profiles/workloads' energy consumption characteristics of storage devices to determine an allocation plan that reduces the energy cost (e.g., results in the lowest cost/energy consumption for handling a storage workload). In a typical embodiment, energy consumption characteristics for handling a particular storage workload will be determined. Thereafter, a type of storage device capable of handling the workload will be determined. Then, an allocation plan that results in the most efficient energy consumption for handling the workload will be developed. In general, the allocation plan is based upon the energy consumption characteristics and an energy efficiency algorithm. The energy efficiency algorithm serves to identify storage device(s) that can handle the workload in such a way as to reduce total energy consumption and, accordingly, costs. Along these lines, the energy efficiency algorithm may also consider other factors such as capacity and load of storage devices and service level agreement (SLA) terms in addition to energy costs (e.g., over times of day and/or days of week). In any event, at least one storage device can then be selected for handling the storage workload according to the allocation plan.
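The selection step described above might look like the following sketch. The abstract does not disclose the energy efficiency algorithm itself, so this stands in a simple rule: keep only devices with enough spare capacity for the workload, then pick the one with the lowest estimated energy cost. The `Device` fields, the per-unit energy figure, and the sample values are hypothetical; SLA terms and time-of-day pricing are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Device:
    """Hypothetical storage device description."""
    name: str
    capacity: float         # total capacity (assumed unit, e.g. IOPS)
    load: float             # current load, same unit as capacity
    energy_per_unit: float  # assumed energy cost per unit of workload served


def select_device(devices: List[Device], workload: float) -> Optional[Device]:
    """Pick the device that can absorb the workload at the lowest estimated energy cost."""
    candidates = [d for d in devices if d.capacity - d.load >= workload]
    if not candidates:
        return None  # no single device can handle the workload
    return min(candidates, key=lambda d: d.energy_per_unit * workload)


devices = [
    Device("ssd-pool", capacity=100.0, load=90.0, energy_per_unit=0.2),
    Device("hdd-pool", capacity=500.0, load=100.0, energy_per_unit=0.5),
]
small = select_device(devices, workload=5.0)
large = select_device(devices, workload=50.0)
```

With these sample values the small workload lands on `ssd-pool`, the cheaper device, while the larger workload falls back to `hdd-pool` because `ssd-pool` has only 10 units of spare capacity.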
Abstract:
An approach is provided for minimizing cost chargeback in an information technology (IT) computing environment that includes multiple resources. One implementation involves determining time-based usage patterns and allocation statistics for a plurality of resources and associated resource workloads. A regression function is used to determine a correlation of response time with resource usages and outstanding input/output instructions for the plurality of resources. Based on the time-based usage patterns, the allocation statistics, and the correlation, an interpolation is derived using positive and negative integrals to minimize a difference between allocated resource values and average allocation values. Service level objectives (SLOs) and resource allocation that minimize cost chargeback for the resource workloads are then determined based on the derived interpolation.