Abstract:
A classifier is calibrated to produce a calibration map and a threshold is derived from the calibration map. A probability assignment produced by the classifier for input data is then compared to the threshold.
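The calibrate, derive, and compare flow can be illustrated with a minimal sketch. It assumes histogram binning as the calibration method and a target calibrated probability as the rule for deriving the threshold; the abstract specifies neither, and the names `build_calibration_map`, `derive_threshold`, and `classify` are hypothetical.

```python
def build_calibration_map(scores, labels, n_bins=5):
    """Histogram binning (an assumed method): map each score bin to the
    fraction of positive labels observed in that bin."""
    counts = [0] * n_bins
    positives = [0] * n_bins
    for score, label in zip(scores, labels):
        b = min(int(score * n_bins), n_bins - 1)  # scores assumed in [0, 1]
        counts[b] += 1
        positives[b] += label
    # Empty bins have no estimate and are marked None.
    return [positives[b] / counts[b] if counts[b] else None
            for b in range(n_bins)]


def derive_threshold(cal_map, target=0.5):
    """Return the lower edge of the first bin whose calibrated probability
    reaches the target; 1.0 if no bin does."""
    n_bins = len(cal_map)
    for b, p in enumerate(cal_map):
        if p is not None and p >= target:
            return b / n_bins
    return 1.0


def classify(score, threshold):
    """Compare the classifier's raw probability assignment to the threshold."""
    return score >= threshold


scores = [0.1, 0.15, 0.3, 0.35, 0.5, 0.55, 0.7, 0.75, 0.9, 0.95]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
cal_map = build_calibration_map(scores, labels)
threshold = derive_threshold(cal_map, target=0.5)
```

Taking the first qualifying bin is reasonable only when the calibration map is roughly monotonic; a noisy map would call for smoothing or a different derivation rule.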
Abstract:
Metrics for a computer resource are collected. A signature representing a state of the computer resource is determined from the metrics by determining raw values for each of the metrics and generating a vector from at least some of the raw values. Generating the vector further comprises generating models for possible system states of the computer resource, determining a model that closely matches a state of the computer resource, determining key metrics for the model, and determining a vector of values from the key metrics. An annotation that describes the state of the computer resource is received and associated with the signature. The signature and the associated annotation are stored such that they are searchable.
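As a rough illustration of the final steps, the sketch below builds a signature vector from a set of key metrics and stores it with an annotation so that similar states can be looked up later. The model-selection step is assumed to have already produced the list of key metrics. `make_signature` and `SignatureStore` are hypothetical names, and Euclidean distance is an assumed similarity measure.

```python
import math


def make_signature(raw_metrics, key_metrics):
    """Build the signature vector from the raw values of the key metrics."""
    return tuple(raw_metrics[name] for name in key_metrics)


class SignatureStore:
    """In-memory store of (signature, annotation) pairs, searchable by
    Euclidean distance (an assumption; the abstract names no measure)."""

    def __init__(self):
        self._entries = []

    def add(self, signature, annotation):
        self._entries.append((signature, annotation))

    def search(self, query, k=1):
        """Return the k stored entries closest to the query signature."""
        ranked = sorted(self._entries, key=lambda e: math.dist(e[0], query))
        return ranked[:k]


store = SignatureStore()
store.add((0.9, 0.1), "CPU saturation")
store.add((0.1, 0.8), "disk bottleneck")

observed = {"cpu": 0.85, "disk": 0.2, "net": 0.5}
query = make_signature(observed, ["cpu", "disk"])
nearest = store.search(query, k=1)
```

Here the lookup returns the entry annotated "CPU saturation", since its signature is the closest to the observed state.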
Abstract:
Power consumption of computing devices is monitored with performance counters and used to generate a power model for each computing device. The power models are used to estimate the power consumption of each computing device based on the performance counters. Each computing device is assigned a power cap, and a software-based power control at each computing device monitors the performance counters, estimates the power consumption using the performance counters and the model, and compares the estimated power consumption with the power cap. Depending on whether the estimated power consumption violates the power cap, the power control may transition the computing device to a lower power state to prevent a violation of the power cap, or to a higher power state if the computing device is below the power cap.
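One control step of such a scheme might look like the following sketch. It assumes a linear power model over the counters and a hypothetical `headroom` factor that prevents oscillation near the cap; neither detail comes from the abstract, and the counter names and coefficients are made up for illustration.

```python
def estimate_power(counters, weights, idle_watts):
    """Linear power model (an assumption): idle power plus a weighted sum
    of performance-counter readings."""
    return idle_watts + sum(weights[name] * counters[name] for name in weights)


def next_power_state(state, estimate, cap, max_state, headroom=0.9):
    """Pick the next power state. State 0 is the highest-power state and
    larger numbers draw less power. `headroom` is a hypothetical hysteresis
    factor: the state is raised only when the estimate is well under the cap."""
    if estimate > cap and state < max_state:
        return state + 1  # step down to a lower power state
    if estimate < headroom * cap and state > 0:
        return state - 1  # step up to a higher power state
    return state


weights = {"cpu_util": 1.0, "disk_io": 0.5}  # illustrative coefficients
estimate = estimate_power({"cpu_util": 80, "disk_io": 20}, weights, idle_watts=50)
state_after_violation = next_power_state(0, estimate, cap=130, max_state=3)
```

In practice the coefficients would be fitted per device from measured power and counter traces, which this sketch leaves out.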
Abstract:
In a distributed system, a plurality of devices (including computing units, storage units, and communication units) is monitored by an automated repair service that uses sensors and, according to repair policies, performs one or more repair actions on computing devices found to have failed. The repair actions include automated and non-automated repair actions. The health of the computing devices is recorded in the form of states, along with the repair actions performed on the computing devices, the times at which those repair actions were performed, and events generated by both the sensors and the devices themselves. After some period of time, the history of states of each device, the events, and the repair actions performed on the computing devices are analyzed to determine the effectiveness of the repair actions. A statistical analysis is performed based on the cost of each repair action and the determined effectiveness of each repair action, and one or more of the policies may be adjusted. The signals and events from the sensors are also used to determine whether the sensors themselves require adjustment.
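The cost and effectiveness analysis can be sketched as follows. The sketch assumes effectiveness is the fraction of attempts after which the device returned to a healthy state, and that actions are ranked by expected cost per successful repair; both definitions, the function names, and the sample costs are illustrative assumptions rather than the method described in the abstract.

```python
from collections import defaultdict


def action_effectiveness(history):
    """history: iterable of (action, recovered) pairs taken from the recorded
    device states. Returns the recovery rate of each repair action."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for action, recovered in history:
        totals[action] += 1
        if recovered:
            successes[action] += 1
    return {action: successes[action] / totals[action] for action in totals}


def rank_actions(effectiveness, costs):
    """Order actions by expected cost per successful repair (cost / rate).
    Actions that never succeeded sort last."""
    def expected_cost(action):
        rate = effectiveness[action]
        return costs[action] / rate if rate > 0 else float("inf")
    return sorted(effectiveness, key=expected_cost)


history = [
    ("reboot", True), ("reboot", False), ("reboot", True), ("reboot", False),
    ("reimage", True), ("reimage", True),
]
rates = action_effectiveness(history)
order = rank_actions(rates, costs={"reboot": 1.0, "reimage": 10.0})
```

A repair policy could then be adjusted to try actions in the ranked order, escalating to the costlier action only when the cheaper one fails.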
Abstract:
Mechanisms are disclosed for incorporating prototype information into probabilistic models for automated information processing, mining, and knowledge discovery. Examples of these models include Hidden Markov Models (HMMs), Latent Dirichlet Allocation (LDA) models, and the like. The prototype information injects prior knowledge into such models, thereby rendering them more accurate, effective, and efficient. For instance, in the context of automated word labeling, additional knowledge is encoded into the models by providing a small set of prototypical words for each possible label. The net result is that words in a given corpus are labeled and are therefore in condition to be summarized, identified, classified, clustered, and the like.
Abstract:
An embodiment of a method of predicting response time for a storage request begins with a first step of a computing entity storing a training data set. The training data set comprises past performance observations for past storage requests of a storage array. Each past performance observation comprises an observed response time and a feature vector for a particular past storage request. The feature vector includes characteristics that are available external to the storage array. In a second step, the computing entity forms a response time forecaster from the training data set. In a third step, the computing entity applies the response time forecaster to a pending feature vector for a pending storage request to obtain a predicted response time for the pending storage request.
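The three steps can be sketched with a simple forecaster. The abstract does not say what kind of forecaster is formed; the example below assumes k-nearest-neighbor regression purely for illustration, and the feature vectors (request size in KB, queue depth) are hypothetical.

```python
import math


def form_forecaster(training_set, k=2):
    """training_set: list of (feature_vector, observed_response_time) pairs.
    Returns a function that predicts response time for a pending feature
    vector by averaging the k nearest past observations (assumed method)."""
    def forecast(pending_features):
        nearest = sorted(
            training_set,
            key=lambda obs: math.dist(obs[0], pending_features),
        )[:k]
        return sum(rt for _, rt in nearest) / len(nearest)
    return forecast


# Step 1: store past observations as (features, response time in ms).
training_set = [
    ((4.0, 1.0), 10.0),
    ((8.0, 1.0), 20.0),
    ((64.0, 8.0), 100.0),
]

# Step 2: form the forecaster from the training data set.
forecaster = form_forecaster(training_set, k=2)

# Step 3: apply it to the feature vector of a pending request.
predicted_ms = forecaster((6.0, 1.0))
```

The pending request sits midway between the two small past requests, so the prediction is the mean of their response times.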
Abstract:
Systems, methods, and software are described for performing automated diagnosis and identification or forecasting of service level objective (SLO) states. Some embodiments include building classifier models based on collected metric data to detect and forecast SLO violations. Some such systems, methods, and software further include automated detecting and forecasting of SLO violations along with providing alarms, messages, or commands to administrators or system components. Some such messages include diagnostic information with regard to a cause of an SLO violation. Some embodiments further include storing data representative of system performance and of detected and forecast system SLO states. This data can then be used to generate reports of system performance, including representations of system SLO states.
Abstract:
A method of determining behavior of an information system application is provided. The information system application's behavior for user content requests and load conditions is determined, as are a user's quality of service objectives. The information system application's capacity allocation is then prioritized. Changes in the information system application's behavior are detected. The behavior of the information system application is then updated in response to detecting changes that affect the user's quality of service objectives.
Abstract:
A computer system includes a signature creation engine operable to determine signatures representing states of a computer resource from metrics for the computer resource. The computer system also includes a database operable to store the signatures along with an annotation for each signature including information relating to a state of the computer resource. The computer system is operable to determine a recurrent problem of the computer resource from stored signatures.