Abstract:
The current document is directed to methods and systems for processing, classifying, and efficiently storing large volumes of event messages generated in modern computing systems. In a disclosed implementation, received event messages are assigned to event-message clusters based on non-parameter tokens identified within the event messages. A parsing function is generated for each cluster and is used to extract data from incoming event messages and to prepare event records that store event information more efficiently and accessibly. The parsing functions also provide an alternative basis for assignment of event messages to clusters.
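As a rough illustration of clustering on non-parameter tokens, the sketch below groups messages by the tokens that remain once parameter-like tokens are set aside. The abstract does not say how parameters are recognized; the regular expression `PARAM_RE` and the function names `parse` and `cluster` are assumptions made for this example only.

```python
import re
from collections import defaultdict

# Assumed heuristic (not from the abstract): a token is a parameter if it is
# a decimal number, a dotted number such as an IPv4 address, or a hex literal.
PARAM_RE = re.compile(r"^(\d+(\.\d+)*|0x[0-9a-fA-F]+)$")

def parse(message):
    """Split a message into (non-parameter tokens, parameter values)."""
    tokens = message.split()
    params = [t for t in tokens if PARAM_RE.match(t)]
    key = tuple(t for t in tokens if not PARAM_RE.match(t))
    return key, params

def cluster(messages):
    """Group messages by their non-parameter tokens.

    Returns a dict mapping each cluster key to the list of parameter
    lists extracted from the messages assigned to that cluster.
    """
    clusters = defaultdict(list)
    for message in messages:
        key, params = parse(message)
        clusters[key].append(params)
    return dict(clusters)
```

In this sketch, `parse` plays the role of a single generic parsing function; a per-cluster parsing function, as the abstract describes, would additionally name and type the extracted fields.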
Abstract:
Methods and systems that detect computer system anomalies based on log file sampling are described. Computer systems generate log files that record various types of operating system and software run events in event messages. For each computer system, a sample of event messages is collected in a first time interval and a sample of event messages is collected in a recent second time interval. Methods calculate a difference between the event messages collected in the first and second time intervals. When the difference is greater than a threshold, an alert is generated. The process of repeatedly collecting a sample of event messages in a recent time interval, calculating a difference between the event messages collected in the recent and previous time intervals, comparing the difference to the threshold, and generating an alert when the threshold is violated may be executed for each computer system of a cluster of computer systems.
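The comparison step can be sketched as follows. The abstract does not name the difference measure, so this example assumes each sampled event message has already been reduced to an event-type label and uses the total variation distance between the two event-type distributions. The function names and the default threshold of 0.3 are hypothetical.

```python
from collections import Counter

def type_distribution(event_types):
    """Relative frequency of each event type in a sample."""
    counts = Counter(event_types)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}

def sample_difference(first_sample, recent_sample):
    """Total variation distance between two event-type distributions.

    Ranges from 0.0 (identical distributions) to 1.0 (no types in common).
    This measure is an assumption; the abstract only says "a difference".
    """
    p = type_distribution(first_sample)
    q = type_distribution(recent_sample)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in set(p) | set(q))

def check_for_anomaly(first_sample, recent_sample, threshold=0.3):
    """Return True (raise an alert) when the difference exceeds the threshold."""
    return sample_difference(first_sample, recent_sample) > threshold
```

Running `check_for_anomaly` once per computer system, with the recent sample replacing the earlier one after each pass, would approximate the repeated per-system process the abstract describes.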
Abstract:
The current document is directed to methods and systems that process, classify, efficiently store, and display large volumes of event messages generated in modern computing systems. In a disclosed implementation, event messages are assigned types and transformed into event records with well-defined fields that contain field values. Recurring patterns of event messages, referred to as “transactions,” are identified within streams or sequences of time-associated event messages and streams or sequences of time-associated event records.
Abstract:
The current document is directed to methods and systems that process, classify, efficiently store, and display large volumes of event messages generated in modern computing systems. In a disclosed implementation, received event messages are assigned to event-message clusters based on non-parameter tokens identified within the event messages. A parsing function is generated for each cluster and is used to extract data from incoming event messages and to prepare event records that store event information more efficiently and accessibly. The parsing functions also provide an alternative basis for assignment of event messages to clusters. Event types associated with the clusters are used for gathering information from various information sources with which to automatically annotate event messages displayed to system administrators, maintenance personnel, and other users of event messages.
Abstract:
Various examples are disclosed for transitioning usage forecasting in a computing environment. Usage of computing resources of a computing environment is forecasted using a first forecasting data model and usage measurements obtained from the computing resources. Use of the first forecasting data model in forecasting the usage is transitioned to a second forecasting data model without incurring downtime in the computing environment. After the transition, the usage of the computing resources of the computing environment is forecasted using the second forecasting data model and the usage measurements obtained from the computing resources. The second forecasting data model exponentially decays the usage measurements based on a respective time period at which the usage measurements were obtained.
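One way to picture the exponential decay of usage measurements is a weighted average in which a measurement's weight halves for every fixed span of elapsed time. The sketch below is an assumption-laden illustration, not the disclosed model: the half-life parameterization, the weighted-average form, and the name `decayed_forecast` are all hypothetical.

```python
def decayed_forecast(measurements, now, half_life):
    """Exponentially decayed average of usage measurements.

    measurements: iterable of (timestamp, value) pairs.
    now:          time at which the forecast is made (same units as timestamps).
    half_life:    age at which a measurement's weight drops to one half
                  (an assumed parameterization of the decay).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for timestamp, value in measurements:
        age = now - timestamp
        weight = 0.5 ** (age / half_life)  # older measurements count less
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no measurements to forecast from")
    return weighted_sum / weight_total
```

Because older measurements fade smoothly instead of being dropped, a model of this kind can start from the same measurement history the first model used, which is compatible with switching models without downtime.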
Abstract:
Computational methods and systems to detect anomalously behaving resources and objects of a distributed computing system are described. Multiple streams of metric data representing usage of various resources of the distributed computing system are sent to a management system of the distributed computing system. The management system updates a performance model based on newly received metric values of the streams of metric data. The updated performance model is used to detect changes in one or more of the streams of metric data. The changes may be an indication of anomalous behavior at resources and objects associated with the streams of metric data. An anomaly listener is notified of anomalous behavior by the resource or object when a change in one or more of the streams of metric data is detected.
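The update-then-detect loop can be sketched with a very simple stand-in for the performance model: a running mean and variance per metric stream, updated with Welford's online algorithm, with a z-score test for change. The abstract does not specify the model, so the class `StreamModel`, the function `monitor`, the z-score rule, and the default thresholds are assumptions for illustration.

```python
import math

class StreamModel:
    """Running mean/variance of one metric stream (Welford's algorithm)."""

    def __init__(self, z_threshold=3.0, min_samples=5):
        self.z_threshold = z_threshold
        self.min_samples = min_samples
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the mean

    def update(self, value):
        """Test the new value against the model, then fold it in.

        Returns True when the value deviates from the running mean by more
        than z_threshold sample standard deviations.
        """
        flagged = False
        if self.n >= self.min_samples:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(value - self.mean) / std > self.z_threshold:
                flagged = True
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)
        return flagged

def monitor(values, listener, **model_options):
    """Feed metric values to a model and notify the listener of each change."""
    model = StreamModel(**model_options)
    for index, value in enumerate(values):
        if model.update(value):
            listener(index, value)
```

For example, `monitor(stream, lambda i, v: print("change at", i, v))` would report each flagged value; a management system would keep one such model per stream and route notifications to the anomaly listener.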
Abstract:
Computational methods and systems for detecting and troubleshooting anomalous behavior in distributed applications executing in a distributed computing system are described herein. Methods and systems discover nodes comprising the application. Anomaly detection monitors the metrics associated with the nodes for anomalous behavior in order to identify an approximate point in time when anomalous behavior begins to adversely impact performance of the application. Anomaly detection also monitors log messages associated with the nodes to detect anomalous behavior recorded in the log messages. When anomalous behavior is detected in the metrics, the log messages, or both, an alert identifying the anomalous behavior is generated. Troubleshooting guides an administrator and/or application owner to investigate the root cause of the anomalous behavior. Appropriate remedial measures may be determined based on the root cause and automatically or manually executed to correct the problem.
Abstract:
Computational methods and systems that proactively manage usage of computational resources of a distributed computing system are described. A sequence of metric data representing usage of a resource is detrended to obtain a sequence of non-trendy metric data. Stochastic process models, a pulse wave model, and a seasonal model of the sequence of non-trendy metric data are computed. When a forecast request is received, a sequence of forecasted metric data is computed over a forecast interval based on the estimated trend and whichever of the pulse wave model or the seasonal model matches the periodicity of the sequence of non-trendy metric data. Alternatively, the sequence of forecasted metric data is computed based on the estimated trend and the stochastic process model with the smallest accumulated residual error. Usage of the resource by virtual objects of the distributed computing system may be adjusted based on the sequence of forecasted metric data.
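The detrending step can be illustrated with an ordinary least-squares linear trend over equally spaced samples. The abstract does not state which trend estimate is used, so the linear form and the function names `fit_linear_trend` and `detrend` are assumptions.

```python
def fit_linear_trend(values):
    """Least-squares line through (0, values[0]), (1, values[1]), ...

    Returns (slope, intercept). Assumes equally spaced samples.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to estimate a trend")
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    slope = sxy / sxx
    intercept = v_mean - slope * t_mean
    return slope, intercept

def detrend(values):
    """Subtract the estimated linear trend, leaving the non-trendy sequence."""
    slope, intercept = fit_linear_trend(values)
    return [v - (slope * t + intercept) for t, v in enumerate(values)]
```

A forecast over a future interval would then add the extrapolated trend `slope * t + intercept` back onto whatever the selected periodic or stochastic model predicts for the non-trendy component.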