Abstract:
A system for automatically instrumenting and tracing an application program and related software components achieves correlated tracing of the program execution. The system traces endpoints, which are the functions in the program execution path that developers are interested in. The tracing endpoints and their related events together form the total set of functions to be traced in the program (called instrumentation points). This invention automatically analyzes the program and generates these instrumentation points to enable correlated tracing. The generated set of instrumentation points addresses common questions that developers ask when they use monitoring tools.
Abstract:
Methods and systems for system maintenance include identifying patterns in heterogeneous logs. Predictive features are extracted from a set of input logs based on the identified patterns. It is determined that the predictive features indicate a future system failure using a first model. A second model is trained, based on a target sample from the predictive features and based on weights associated with a distance between the target sample and a set of samples from the predictive features, to identify one or more parameters of the second model associated with the future system failure. A system maintenance action is performed in accordance with the identified one or more parameters.
Abstract:
A computer-implemented method for automatically analyzing log contents received via a network and detecting content-level anomalies is presented. The computer-implemented method includes building a statistical model based on contents of a set of training logs and detecting, based on the set of training logs, content-level anomalies for a set of testing logs. The method further includes maintaining an index and metadata, generating attributes for fields, editing the model to incorporate user domain knowledge, detecting anomalies using field attributes, and improving anomaly quality by using user feedback.
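As a rough illustration of detecting anomalies from field attributes, the sketch below learns per-field attributes from training logs and checks testing logs against them. It assumes logs have already been parsed into dictionaries of field names and values. The attribute choices (a set of seen values, a numeric range) and the names build_model and detect_anomalies are hypothetical, not taken from the abstract.

```python
def build_model(training_logs):
    """Learn simple per-field attributes: seen values and numeric range."""
    model = {}
    for record in training_logs:
        for field, value in record.items():
            attrs = model.setdefault(
                field, {"values": set(), "min": None, "max": None}
            )
            if isinstance(value, (int, float)):
                # Track the numeric range observed during training.
                attrs["min"] = value if attrs["min"] is None else min(attrs["min"], value)
                attrs["max"] = value if attrs["max"] is None else max(attrs["max"], value)
            else:
                # Track the set of categorical values observed during training.
                attrs["values"].add(value)
    return model


def detect_anomalies(model, testing_logs):
    """Return (record index, field, reason) for each content-level anomaly."""
    anomalies = []
    for index, record in enumerate(testing_logs):
        for field, value in record.items():
            attrs = model.get(field)
            if attrs is None:
                anomalies.append((index, field, "unknown field"))
            elif isinstance(value, (int, float)):
                if attrs["min"] is None or not attrs["min"] <= value <= attrs["max"]:
                    anomalies.append((index, field, "out of range"))
            elif value not in attrs["values"]:
                anomalies.append((index, field, "unseen value"))
    return anomalies


training = [
    {"status": "OK", "latency_ms": 12},
    {"status": "WARN", "latency_ms": 30},
]
testing = [
    {"status": "OK", "latency_ms": 20},
    {"status": "FATAL", "latency_ms": 500},
]
model = build_model(training)
print(detect_anomalies(model, testing))
```

In this sketch, user domain knowledge could be incorporated by editing the model dictionary directly, for example by widening a numeric range or adding an allowed value before detection runs.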
Abstract:
Systems and methods for implementing content-level anomaly detection for devices having limited memory are provided. At least one log content model is generated based on training log content of training logs obtained from one or more sources associated with the computer system. The at least one log content model is transformed into at least one modified log content model to limit memory usage. Anomaly detection is performed for testing log content of testing logs obtained from one or more sources associated with the computer system based on the at least one modified log content model. In response to the anomaly detection identifying one or more anomalies associated with the testing log content, the one or more anomalies are output.
Abstract:
A computer-implemented method, computer program product, and computer processing system are provided. The method includes preprocessing, by a processor, a set of heterogeneous logs by splitting each of the logs into tokens to obtain preprocessed logs. Each of the logs in the set is associated with a timestamp and textual content in one or more fields. The method further includes generating, by the processor, a set of regular expressions from the preprocessed logs. The method also includes performing, by the processor, an unsupervised parsing operation by applying the regular expressions to the preprocessed logs to obtain a set of parsed logs and a set of unparsed logs, if any. The method additionally includes storing, by the processor, the set of parsed logs in a log analytics database and the set of unparsed logs in a debugging database.
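The tokenize, generate, and parse steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed method: tokens are split on whitespace, a token is generalized only when it consists entirely of digits, timestamps are ignored, and plain lists stand in for the log analytics and debugging databases. All function names are hypothetical.

```python
import re


def tokenize(log_line):
    """Preprocess one log by splitting it into whitespace-delimited tokens."""
    return log_line.split()


def token_to_regex(token):
    """Generalize digit-only tokens to \\d+; match every other token literally."""
    if re.fullmatch(r"\d+", token):
        return r"\d+"
    return re.escape(token)


def generate_regexes(preprocessed_logs):
    """Build one compiled regex per distinct token pattern."""
    patterns = []
    for tokens in preprocessed_logs:
        pattern = r"\s+".join(token_to_regex(t) for t in tokens)
        if pattern not in patterns:
            patterns.append(pattern)
    return [re.compile(p) for p in patterns]


def parse(preprocessed_logs, regexes):
    """Split logs into those matched by some regex and those matched by none."""
    parsed, unparsed = [], []
    for tokens in preprocessed_logs:
        line = " ".join(tokens)
        if any(rx.fullmatch(line) for rx in regexes):
            parsed.append(line)   # would go to the log analytics database
        else:
            unparsed.append(line)  # would go to the debugging database
    return parsed, unparsed


sample = [tokenize(l) for l in ["login user 42", "login user 7", "disk full on 3"]]
regexes = generate_regexes(sample)
incoming = [tokenize(l) for l in ["login user 99", "kernel panic"]]
print(parse(incoming, regexes))
```

Here the regular expressions are generated from one sample and applied to a different batch so that the unparsed path is exercised; applying them to the logs they were generated from would leave the unparsed list empty.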
Abstract:
Systems and methods are disclosed for detecting periodic event behaviors from machine generated logging by: capturing heterogeneous log messages, each log message including a time stamp and text content with one or more fields; recognizing log formats from log messages; transforming the text content into a set of time series data, one time series for each log format; during a training phase, analyzing the set of time series data and building a category model for each periodic event type in heterogeneous logs; and during live operation, applying the category model to a stream of time series data from live heterogeneous log messages and generating a flag on a time series data point violating the category model and generating an alarm report for the corresponding log message.
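A minimal sketch of the training and live phases for a single log format is shown below. It assumes the time series is a list of event timestamps in seconds and models periodicity with the mean gap between events plus a tolerance. The abstract does not specify the form of the category model, so this statistic, the slack and min_tolerance parameters, and the function names are illustrative assumptions.

```python
from statistics import mean, pstdev


def train_period_model(timestamps, slack=3.0, min_tolerance=1.0):
    """Training phase: estimate the period and an allowed deviation."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return {
        "period": mean(gaps),
        # The floor keeps a perfectly regular training set from
        # producing a zero tolerance.
        "tolerance": max(slack * pstdev(gaps), min_tolerance),
    }


def flag_violations(model, timestamps):
    """Live phase: flag each timestamp whose gap from the previous event
    deviates from the learned period by more than the tolerance."""
    flags = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        if abs((cur - prev) - model["period"]) > model["tolerance"]:
            flags.append(cur)
    return flags


model = train_period_model([0, 60, 120, 181, 240])
print(flag_violations(model, [300, 360, 420, 510, 570]))  # prints [510]
```

Each flagged timestamp would then be mapped back to its log message to build the alarm report.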
Abstract:
Systems and methods for automatically generating a set of meta-parameters used to train invariant-based anomaly detectors are provided. Data is transformed into a first set of time series data and a second set of time series data. A fitness threshold search is performed on the first set of time series data to automatically generate a fitness threshold, and a time resolution search is performed on the second set of time series data to automatically generate a time resolution. A set of meta-parameters including the fitness threshold and the time resolution is sent to one or more user devices across a network to govern the training of an invariant-based anomaly detector.
Abstract:
Methods and systems for system failure diagnosis and correction include extracting syntactic patterns from a plurality of logs with heterogeneous formats. The syntactic patterns are clustered according to categories of system failure. A single semantically unique pattern is extracted for each category of system failure. The semantically unique patterns are matched to recent log information to detect a corresponding system failure. A corrective action is performed responsive to the detected system failure.
Abstract:
A method and system are provided. The method includes performing, by a logs-to-time-series converter, a logs-to-time-series conversion by transforming a plurality of heterogeneous logs into a set of time series. Each of the heterogeneous logs includes a time stamp and text portion with one or more fields. The method further includes performing, by a time-series-to-sequential-pattern converter, a time-series-to-sequential-pattern conversion by mining invariant relationships between the set of time series, and discovering sequential message patterns and association rules in the plurality of heterogeneous logs using the invariant relationships. The method also includes executing, by a processor, a set of log management applications, based on the sequential message patterns and the association rules.
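The logs-to-time-series step alone can be sketched as below; invariant mining and sequential pattern discovery are not shown. The sketch assumes each log has already been reduced to a (timestamp in seconds, pattern identifier) pair and that a time series is the per-bucket message count for one pattern. The bucket size and the function name are hypothetical.

```python
from collections import Counter, defaultdict


def logs_to_time_series(logs, bucket_seconds=60):
    """Convert (timestamp, pattern_id) pairs into one count series per pattern.

    All series share the same time axis, running from the earliest to the
    latest bucket seen, with zeros where a pattern did not occur.
    """
    counts = defaultdict(Counter)
    for timestamp, pattern_id in logs:
        counts[pattern_id][int(timestamp // bucket_seconds)] += 1
    if not counts:
        return {}
    first = min(min(c) for c in counts.values())
    last = max(max(c) for c in counts.values())
    return {
        pattern_id: [c[bucket] for bucket in range(first, last + 1)]
        for pattern_id, c in counts.items()
    }


logs = [(0, "A"), (10, "A"), (65, "B"), (130, "A")]
print(logs_to_time_series(logs))  # prints {'A': [2, 0, 1], 'B': [0, 1, 0]}
```

Aligning every series on a shared time axis is a design choice made here so that relationships between pairs of series can be evaluated bucket by bucket.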
Abstract:
Methods for system failure prediction include clustering log files according to structural log patterns. Feature representations of the log files are determined based on the log clusters. A likelihood of a system failure is determined based on the feature representations using a neural network. An automatic system control action is performed if the likelihood of system failure exceeds a threshold.
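The pipeline from structural patterns to a control decision can be sketched roughly as follows. Several pieces are stand-ins: logs are grouped by replacing digit runs with a placeholder, the feature representation is a vector of per-pattern counts, and a single logistic unit with hand-picked weights takes the place of the trained neural network. The weights, bias, threshold, and function names are all hypothetical.

```python
import math
import re
from collections import Counter


def structural_pattern(line):
    """Reduce a log line to its structure by masking digit runs."""
    return re.sub(r"\d+", "<NUM>", line.strip())


def featurize(lines, vocabulary):
    """Count how many lines fall into each known structural pattern."""
    counts = Counter(structural_pattern(line) for line in lines)
    return [counts[pattern] for pattern in vocabulary]


def failure_likelihood(features, weights, bias):
    """Stand-in for the neural network: one logistic unit."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


vocabulary = ["disk error on sda<NUM>", "heartbeat ok"]
window = ["disk error on sda1", "disk error on sda2", "heartbeat ok"]
features = featurize(window, vocabulary)

# Placeholder parameters; a real system would learn these from labeled data.
likelihood = failure_likelihood(features, weights=[1.5, -0.5], bias=-1.0)

THRESHOLD = 0.8  # hypothetical
if likelihood > THRESHOLD:
    print("trigger automatic system control action")
```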