Abstract:
A system and method for the distributed analysis of high frequency transaction trace data to constantly categorize incoming transaction data, identify relevant transaction categories, create per-category statistical reference and current data and perform statistical tests to identify transaction categories showing overall statistically relevant performance anomalies. The relevant transaction category detection considers both the relative transaction frequency of categories compared to the overall transaction frequency and the temporal stability of a transaction category over an observation duration. The statistical data generated for the anomaly tests contains next to data describing the overall performance of transactions of a category also data describing the transaction execution context, like the number of concurrently executed transactions or transaction load during an observation period. Anomaly tests consider current and reference execution context data in addition to statistic performance data to determine if detected statistical performance anomalies should be reported.
Abstract:
A method is disclosed that estimates causal relationships between events based on heterogeneous monitoring data. The monitoring data consists in transaction tracing data, describing the execution performance of individual transactions, resource utilization measurements of infrastructure entities like processes or operating systems and network utilization measurement data. A topology model of the monitored environment describing its entities and the communication activities of these entities is incrementally created. The location of occurred events in the topology model is determined. The topology model is used in conjunction with a domain specific causality propagation knowledge base to calculate the possibility of causal relationships between events. Different causality determination mechanisms, based on the type of involved events are used to create graphs of causal related events. A set of root cause events, representing those events with greatest global impact on all other events in an event graph is calculated for each identified event graph.
Abstract:
A system and method for the distributed analysis of high frequency transaction trace data to constantly categorize incoming transaction data, identify relevant transaction categories, create per-category statistical reference and current data and perform statistical tests to identify transaction categories showing overall statistically relevant performance anomalies. The relevant transaction category detection considers both the relative transaction frequency of categories compared to the overall transaction frequency and the temporal stability of a transaction category over an observation duration. The statistical data generated for the anomaly tests contains next to data describing the overall performance of transactions of a category also data describing the transaction execution context, like the number of concurrently executed transactions or transaction load during an observation period. Anomaly tests consider current and reference execution context data in addition to statistic performance data to determine if detected statistical performance anomalies should be reported.
Abstract:
A system and method for real-time discovery and monitoring of multidimensional topology models describing structural aspects of applications and of computing infrastructure used to execute those applications is disclosed. Different types of agents are deployed to the monitored application execution infrastructure dedicated to capture specific topological aspects of the monitored system. Virtualization agents detect and monitor the virtualization structure of virtualized hardware used in the execution infrastructure, operating system agents deployed to individual operating systems monitor resource utilization, performance and communication of processes executed by the operating system and transaction agents deployed to processes participating in the execution of transactions, providing end-to-end transaction trace and monitoring data describing individual transaction executions. The monitoring and tracing data of the deployed agents contains correlation data that allows to create a topology model of the monitored system that integrates transaction execution, process execution and communication and virtualization related aspects.
Abstract:
A system and method for real-time discovery and monitoring of multidimensional topology models describing structural aspects of applications and of computing infrastructure used to execute those applications is disclosed. Different types of agents are deployed to the monitored application execution infrastructure dedicated to capture specific topological aspects of the monitored system. Virtualization agents detect and monitor the virtualization structure of virtualized hardware used in the execution infrastructure, operating system agents deployed to individual operating systems monitor resource utilization, performance and communication of processes executed by the operating system and transaction agents deployed to processes participating in the execution of transactions, providing end-to-end transaction trace and monitoring data describing individual transaction executions. The monitoring and tracing data of the deployed agents contains correlation data that allows to create a topology model of the monitored system that integrates transaction execution, process execution and communication and virtualization related aspects.
Abstract:
A system and method is disclosed for the automated identification of causal relationships between a selected set of trigger events and observed abnormal conditions in a monitored computer system. On the detection of a trigger event, a focused, recursive search for recorded abnormalities in reported measurement data, topological changes or transaction load is started to identify operating conditions that explain the trigger event. The system also receives topology data from deployed agents which is used to create and maintain a topological model of the monitored system. The topological model is used to restrict the search for causal explanations of the trigger event to elements of that have a connection or interact with the element on which the trigger event occurred. This assures that only monitoring data of elements is considered that are potentially involved in the causal chain of events that led to the trigger event.
Abstract:
A system and method is disclosed for the combined analysis of transaction execution monitoring data and a topology model created from infrastructure monitoring data of computing systems involved in the execution of the monitored transactions. Monitored communication activities of transactions are analyzed to identify intermediate processing nodes between sender and receiver side and to enrich transaction monitoring data with data describing those intermediate processing nodes. The topology model may also be improved by the combined analysis, as functionality and services provided by elements of the topology model may be derived by the involvement of those elements in the execution of monitored transactions. The result of the combined analysis is used by an automated anomaly detection and causality estimation system. The combined analysis may also reveal entities of a monitored environment that are used by transaction executions but which are not monitored.
Abstract:
A system and method is disclosed that analyzes a set of historic transaction traces to identify an optimized set of transaction clusters with the highest transaction frequency. The transaction clusters are defined according to multiple parameters describing the execution context of the analyzed transactions. The transaction clusters are described by coordinates in a multidimensional, hierarchical classification space. Descriptive statistical data is extracted from historic transactions corresponding to previously identified transaction clusters and stored as reference data. Transaction trace data from currently executed transactions is analyzed to find a best matching historic transaction cluster. The current transaction traces are grouped according to their corresponding historic transaction cluster. Statistical data is extracted from those groups of current transaction trace and statistical test are performed that compare current and historic data on a per historic transaction cluster basis to identify deviations in performance and functional behavior of current and historic transactions.
Abstract:
A system and method is disclosed for the combined analysis of transaction execution monitoring data and a topology model created from infrastructure monitoring data of computing systems involved in the execution of the monitored transactions. Monitored communication activities of transactions are analyzed to identify intermediate processing nodes between sender and receiver side and to enrich transaction monitoring data with data describing those intermediate processing nodes. The topology model may also be improved by the combined analysis, as functionality and services provided by elements of the topology model may be derived by the involvement of those elements in the execution of monitored transactions. The result of the combined analysis is used by an automated anomaly detection and causality estimation system. The combined analysis may also reveal entities of a monitored environment that are used by transaction executions but which are not monitored.
Abstract:
A system and method for the distributed analysis of high frequency transaction trace data to constantly categorize incoming transaction data, identify relevant transaction categories, create per-category statistical reference and current data and perform statistical tests to identify transaction categories showing overall statistically relevant performance anomalies. The relevant transaction category detection considers both the relative transaction frequency of categories compared to the overall transaction frequency and the temporal stability of a transaction category over an observation duration. The statistical data generated for the anomaly tests contains next to data describing the overall performance of transactions of a category also data describing the transaction execution context, like the number of concurrently executed transactions or transaction load during an observation period. Anomaly tests consider current and reference execution context data in addition to statistic performance data to determine if detected statistical performance anomalies should be reported.