Abstract:
A system and method are proposed for estimating the contribution of components of a distributed computing environment to the generation of economically relevant values, such as revenue figures. Agents are deployed to the computing environment to trace executed transactions and to monitor the components used to execute those transactions. The transaction trace data also contains data about the origin/user of transactions, which may be used to group transactions corresponding to particular interactions of individual users with the monitored application into visit data. Data describing economically relevant activities of transactions, such as the purchase of goods, is also observed by agents and reported in trace data. Functional dependencies described in transaction trace data and resource-related dependencies derived from component monitoring data are used to identify functionality and components that contributed to the generation of business value. The generated business value is assigned to contributing components to incrementally create data describing the economic value of those components. The data generated in this way can be used for various business-related analyses.
Abstract:
A technology for the optimized capturing of resource file content for resources referenced in recorded user interaction sequences is disclosed. Individual resource files are typically referenced in multiple recorded sessions; it is therefore desirable to capture each resource only once and reuse it for all recorded sessions that reference it. As user interaction sequences are executed and captured in independently operating web browsers, direct coordination between recording web browsers to avoid multiple captures of the same resource is not possible. Instead, data about the global resource capturing and demand situation is generated on a monitoring server that receives all session recording data. This data is transferred to the session recording browsers in the form of two lists: one identifying resources that are referenced in sessions but are still unresolved and should therefore be captured, and one identifying resources that should not be captured because they are already available.
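The server-side bookkeeping described above can be sketched as follows. This is a minimal illustration only, assuming that resources are identified by a string key; the class `ResourceRegistry` and its method names are hypothetical and do not come from the disclosure.

```python
class ResourceRegistry:
    """Hypothetical server-side view of resource demand and availability."""

    def __init__(self):
        self.referenced = set()  # resource ids seen in any recorded session
        self.captured = {}       # resource id -> captured content

    def on_session_data(self, referenced_ids, captured=None):
        """Process one batch of session recording data sent by a browser."""
        self.referenced.update(referenced_ids)
        for rid, content in (captured or {}).items():
            # Keep the first capture; later duplicates are ignored.
            self.captured.setdefault(rid, content)

    def capture_list(self):
        """Resources referenced in sessions but still unresolved."""
        return sorted(self.referenced.difference(self.captured))

    def skip_list(self):
        """Resources already available; browsers should not capture them again."""
        return sorted(self.captured)
```

A browser would receive both lists with its next server response and consult them before uploading resource content.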
Abstract:
A system and method for estimating the cardinality of large sets of transaction trace data are disclosed. The estimation is based on HyperLogLog data sketches, which are capable of storing cardinality-relevant data of large sets with low, fixed memory requirements. The disclosure contains improvements to the known analysis methods for HyperLogLog data sketches that improve relative error behavior by eliminating a bias of the relative error that depends on the cardinality range. A new analysis method for HyperLogLog data structures is shown that applies maximum likelihood analysis to a Poisson-based approximated probability model. In addition, a variant of the new analysis method is disclosed that uses multiple HyperLogLog data structures to provide estimation results for set operations, such as intersections or relative complements, directly from the HyperLogLog input data.
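For orientation, a minimal HyperLogLog sketch is shown below. It uses the classical raw estimator with a linear-counting correction for small cardinalities, not the maximum likelihood method of the disclosure. The class name, the SHA-256 based hashing, and the default precision `p=12` are assumptions made for this illustration.

```python
import hashlib
import math


class HyperLogLog:
    """Classical HyperLogLog with 2**p registers (illustrative only)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # Derive a 64-bit hash value from the item.
        digest = hashlib.sha256(str(item).encode()).digest()
        x = int.from_bytes(digest[:8], "big")
        idx = x >> (64 - self.p)                # first p bits select the register
        rest = x & ((1 << (64 - self.p)) - 1)   # remaining 64 - p bits
        # Number of leading zeros in the remaining bits, plus one.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other):
        """Union of two sketches: register-wise maximum."""
        assert self.p == other.p
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def estimate(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)        # bias constant, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return m * math.log(m / zeros)      # linear counting for small sets
        return raw
```

The switch between linear counting and the raw estimator is the kind of range-dependent behavior that the classical analysis relies on; memory use stays fixed at `2**p` registers regardless of the set size.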
Abstract:
Methods and technologies are disclosed for the efficient, sketch-based estimation of large-scale multi-sets in distributed, stream-oriented environments. Sketch updates are idempotent and commutative, to support duplicate set elements and varying element sequences. Sketches are also mergeable, to support distributed sketch recording. The recording process uses stepwise approximated geometric distributions, efficiently generated from the number-of-leading-zeros (NLZ) values of received sketch updates using only multiplications by powers of two and integer additions. Sketch registers are subdivided into a portion storing the maximum observed update value for the register and a portion storing a set of flag bits indicating which of the next smaller update values have been observed for the register. A maximum-likelihood-based evaluation of the sketch data, resting on the assumption of statistically independent sketch registers, is proposed. The limited number of different probabilities created by the stepwise approximated geometric distributions leads to a maximum likelihood function with coefficients that can be calculated solely with integer arithmetic.
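The register layout described above (a maximum value plus flag bits for the next smaller values) can be illustrated with a small sketch. It covers a single register only and does not implement the stepwise approximated geometric distributions or the estimator. The number of flag bits `D`, the tuple representation, and all function names are assumptions chosen for this example.

```python
D = 2                  # flag bits per register (assumed value for illustration)
MASK = (1 << D) - 1


def update(reg, k):
    """Apply update value k >= 1 to a register reg = (max_value, flags).

    Flag bit j - 1 is set when the value max_value - j has been observed.
    """
    u, flags = reg
    if k > u:
        delta = k - u
        flags <<= delta                 # old flags move away from the new maximum
        if u > 0:
            flags |= 1 << (delta - 1)   # the old maximum becomes a flagged value
        return (k, flags & MASK)
    if k < u and u - k <= D:
        flags |= 1 << (u - k - 1)
    return (u, flags)


def record(values, reg=(0, 0)):
    """Fold a sequence of update values into a register."""
    for k in values:
        reg = update(reg, k)
    return reg


def observed_values(reg):
    """Values that the register still remembers."""
    u, flags = reg
    if u == 0:
        return []
    return [u] + [u - j for j in range(1, D + 1) if flags >> (j - 1) & 1]


def merge(a, b):
    """Merge two registers by replaying the values remembered by b into a."""
    return record(observed_values(b), a)
```

Because a register only ever depends on the set of observed values within `D` of the maximum, repeated or reordered updates produce the same state, and registers recorded on different nodes can be merged afterwards.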
Abstract:
A system and method for the creation of locality-sensitive hash signatures using weighted feature sets are disclosed. The disclosed methodology takes advantage of discretization mechanisms commonly used in computer systems to model the influence of the feature weights on the calculated hash signature. Pseudo-random numbers required for the signature calculation are created in ascending order, which enables the signature generation mechanism to identify and skip the creation of unnecessary pseudo-random numbers, improving the performance of the signature calculation process. Further, hierarchical, tree-search-like algorithms are used during the processing of signature weights to further decrease the number of required random numbers. Properties of the Poisson process model, such as its ability to provide random numbers in ascending order and the ability to split and merge Poisson processes, are used to further improve the performance of the signature calculation process.
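The benefit of generating random numbers in ascending order can be illustrated with a short sketch. Each feature emits the points of a Poisson process whose rate equals its weight; since the points only grow, generation for a feature can stop as soon as a point can no longer replace any signature slot. This is a simplified illustration under assumed names (`weighted_signature`, `similarity`) and does not include the discretization or tree-search steps of the disclosure.

```python
import math
import random


def weighted_signature(weights, m=64):
    """weights: dict mapping feature -> positive weight. Returns m slot values."""
    q = [math.inf] * m   # smallest point seen so far per slot
    z = [None] * m       # feature that produced that point
    q_max = math.inf     # largest slot minimum; points >= q_max change nothing
    for feature, w in weights.items():
        if w <= 0:
            continue
        rng = random.Random(f"lsh:{feature}")   # deterministic stream per feature
        h = rng.expovariate(w)                  # first point of a rate-w process
        while h < q_max:
            k = rng.randrange(m)                # slot this point is assigned to
            if h < q[k]:
                q[k], z[k] = h, feature
                q_max = max(q)
            h += rng.expovariate(w)             # next point is always larger
    return z


def similarity(sig_a, sig_b):
    """Fraction of signature slots that agree."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

Features with larger weights produce denser points and therefore win more slots, which is how the weights influence the signature in this sketch.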
Abstract:
A system and method are disclosed for the distributed analysis of high-frequency transaction trace data that continuously categorize incoming transaction data, identify relevant transaction categories, create per-category statistical reference and current data, and perform statistical tests to identify transaction categories showing statistically relevant overall performance anomalies. The detection of relevant transaction categories considers both the frequency of a category's transactions relative to the overall transaction frequency and the temporal stability of the category over an observation duration. In addition to data describing the overall performance of the transactions of a category, the statistical data generated for the anomaly tests contains data describing the transaction execution context, such as the number of concurrently executed transactions or the transaction load during an observation period. Anomaly tests consider current and reference execution context data in addition to statistical performance data to determine whether detected statistical performance anomalies should be reported.
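One possible reading of the relevance criterion (relative frequency combined with temporal stability) is sketched below. The function name, the window representation, and both thresholds are hypothetical; the disclosure does not specify them.

```python
from collections import Counter


def relevant_categories(windows, min_share=0.05, min_stability=0.8):
    """windows: list of lists of category labels, one list per observation window.

    A category counts as relevant if its share of transactions reaches
    min_share in at least a min_stability fraction of all windows.
    """
    hits = Counter()
    for window in windows:
        total = len(window)
        for category, n in Counter(window).items():
            if total and n / total >= min_share:
                hits[category] += 1
    return {c for c, h in hits.items() if h / len(windows) >= min_stability}
```

A category that is frequent in only one short burst is filtered out here, while a category with a steady share across most windows is kept for anomaly testing.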
Abstract:
A method is disclosed that estimates causal relationships between events based on heterogeneous monitoring data. The monitoring data consists of transaction tracing data describing the execution performance of individual transactions, resource utilization measurements of infrastructure entities such as processes or operating systems, and network utilization measurement data. A topology model of the monitored environment, describing its entities and the communication activities of these entities, is incrementally created. The location of each observed event in the topology model is determined. The topology model is used in conjunction with a domain-specific causality propagation knowledge base to calculate the possibility of causal relationships between events. Different causality determination mechanisms, selected according to the type of the involved events, are used to create graphs of causally related events. For each identified event graph, a set of root cause events is calculated, representing those events with the greatest global impact on all other events in the graph.
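As a simplified illustration of selecting root cause events from an event graph, the sketch below ranks events by how many other events they can reach along causal edges. Using reachability as the measure of "global impact" is an assumption made for this example, as are the function name and the edge representation.

```python
def rank_root_causes(edges, top=1):
    """edges: iterable of (cause, effect) pairs between event identifiers.

    Returns the `top` events that causally reach the most other events.
    """
    graph = {}
    for cause, effect in edges:
        graph.setdefault(cause, set()).add(effect)
        graph.setdefault(effect, set())

    def reach(start):
        # Depth-first traversal counting distinct events reachable from start.
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in graph[node]:
                if nxt != start and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return len(seen)

    # Sort by descending reach; ties are broken by event identifier.
    return sorted(graph, key=lambda e: (-reach(e), e))[:top]
```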
Abstract:
A system and method for the analysis of log data are presented. The system uses SuperMinHash-based locality-sensitive hash signatures to describe the similarity between log lines. Signatures are created for incoming log lines and stored in signature indexes. Later similarity queries use those indexes to improve query performance. The SuperMinHash algorithm uses a two-stage approach to determine signature values: one stage uses a first random number to calculate the index of the signature value that is to be updated. The two-stage approach improves the accuracy of the produced similarity estimation data for small signatures. The two-stage approach may further be used to produce random numbers that are related, e.g. each created random number may be larger than its predecessors. This relation is used to optimize the algorithm by detecting when further random numbers can no longer influence the signature and terminating at that point.
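A simplified variant of this signature calculation is sketched below. For each token, one random number selects the slot to update (through an incremental shuffle) and a second one contributes the candidate value `r + j`, which grows with the step index `j`; the loop stops once `j` reaches the largest current signature value. Whitespace tokenization, the function names, and the per-step `max` computation are assumptions for illustration, and a production implementation would track the termination bound incrementally.

```python
import math
import random


def superminhash(tokens, m=64):
    """Return an m-value signature for a collection of string tokens."""
    h = [math.inf] * m
    for token in set(tokens):
        rng = random.Random(f"smh:{token}")   # deterministic stream per token
        p = list(range(m))                    # slot permutation, built lazily
        j = 0
        # Candidate values r + j are >= j, so once j >= max(h) nothing can change.
        while j < m and j < max(h):
            r = rng.random()                  # value part, in [0, 1)
            k = rng.randrange(j, m)           # index part: one Fisher-Yates step
            p[j], p[k] = p[k], p[j]
            if r + j < h[p[j]]:
                h[p[j]] = r + j
            j += 1
    return h


def log_line_signature(line, m=64):
    """Signature of a log line, using whitespace-separated tokens as features."""
    return superminhash(line.split(), m)


def jaccard_estimate(sig_a, sig_b):
    """Fraction of equal signature components."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

Signatures computed this way can be stored in an index and compared component by component at query time.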
Abstract:
A system and method are disclosed for the automated identification of causal relationships between a selected set of trigger events and observed abnormal conditions in a monitored computer system. On detection of a trigger event, a focused, recursive search for recorded abnormalities in reported measurement data, topological changes, or transaction load is started to identify operating conditions that explain the trigger event. The system also receives topology data from deployed agents, which is used to create and maintain a topological model of the monitored system. The topological model is used to restrict the search for causal explanations of the trigger event to elements that are connected to or interact with the element on which the trigger event occurred. This ensures that only monitoring data of elements that are potentially involved in the causal chain of events leading to the trigger event is considered.
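The topology-restricted recursive search can be sketched as follows. The dictionary-based topology model, the anomaly records, and the function name `explain_trigger` are hypothetical simplifications; the rule that the search continues only through elements showing abnormalities is an assumption for this example.

```python
def explain_trigger(trigger_entity, topology, anomalies):
    """Collect abnormalities on elements connected to the trigger element.

    topology:  dict mapping entity -> iterable of connected entities
    anomalies: dict mapping entity -> list of recorded abnormalities
    """
    found = {}
    visited = {trigger_entity}

    def search(entity):
        for neighbor in topology.get(entity, ()):
            if neighbor in visited:
                continue
            visited.add(neighbor)
            if neighbor in anomalies:
                found[neighbor] = anomalies[neighbor]
                search(neighbor)   # follow the chain only along abnormal elements

    search(trigger_entity)
    return found
```

Elements without a connection to the trigger element are never inspected, even if they report abnormalities at the same time.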