Methods and systems to determine baseline event-type distributions of event sources and detect changes in behavior of event sources

    Publication No.: US11182267B2

    Publication date: 2021-11-23

    Application No.: US16655883

    Application date: 2019-10-17

    Applicant: VMware, Inc.

    Abstract: Automated methods and systems to determine a baseline event-type distribution of an event source and use the baseline event-type distribution to detect changes in the behavior of the event source are described. In one implementation, blocks of event messages generated by the event source are collected and an event-type distribution is computed for each block of event messages. Candidate baseline event-type distributions are determined from the event-type distributions. The candidate baseline event-type distribution has the largest entropy of the event-type distributions. A normal discrepancy radius of the event-type distributions is computed from the baseline event-type distribution and the event-type distributions. A block of run-time event messages generated by the event source is collected. A run-time event-type distribution is computed from the block of run-time event messages. When the run-time event-type distribution is outside the normal discrepancy radius, an alert is generated indicating abnormal behavior of the event source.
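
    The steps in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how the discrepancy between two distributions is measured or how the radius is derived, so the sketch assumes total variation distance and takes the radius as the largest distance from the baseline to any historical distribution. All function names are hypothetical.

```python
import math
from collections import Counter

def distribution(block, event_types):
    """Relative frequency of each event type in one block of event messages."""
    counts = Counter(block)
    return [counts[t] / len(block) for t in event_types]

def entropy(p):
    """Shannon entropy (natural log); zero-probability types contribute nothing."""
    return -sum(x * math.log(x) for x in p if x > 0)

def distance(p, q):
    """Total variation distance. ASSUMPTION: the abstract names no specific measure."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def baseline_and_radius(dists):
    """Pick the largest-entropy distribution as baseline.
    ASSUMPTION: radius = largest distance from the baseline to any distribution."""
    baseline = max(dists, key=entropy)
    radius = max(distance(baseline, d) for d in dists)
    return baseline, radius

def is_abnormal(runtime_dist, baseline, radius):
    """True when the run-time distribution falls outside the normal radius."""
    return distance(runtime_dist, baseline) > radius

event_types = ["info", "warn", "error"]
blocks = [
    ["info"] * 8 + ["warn"] * 2,
    ["info"] * 6 + ["warn"] * 3 + ["error"],
    ["info"] * 7 + ["warn"] * 2 + ["error"],
]
dists = [distribution(b, event_types) for b in blocks]
baseline, radius = baseline_and_radius(dists)

runtime = distribution(["error"] * 7 + ["info"] * 3, event_types)
if is_abnormal(runtime, baseline, radius):
    print("alert: abnormal behavior of the event source")
```

    In this toy data the second block has the highest entropy and becomes the baseline; the error-heavy run-time block lies well outside the radius and triggers the alert.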

    METHODS AND SYSTEMS FOR TROUBLESHOOTING ANOMALOUS BEHAVIOR IN A DATA CENTER

    Publication No.: US20210218619A1

    Publication date: 2021-07-15

    Application No.: US16742239

    Application date: 2020-01-14

    Applicant: VMware, Inc.

    Abstract: Methods and systems described herein are directed to troubleshooting anomalous behavior in a data center. Anomalous behavior in an object of a data center, such as a computational resource, an application, or a virtual machine (“VM”), may be related to the behavior of other objects at different hierarchies of the data center. Methods and systems provide a graphical user interface that enables a user to select a metric associated with an object of the data center experiencing a performance problem. Unexpected metrics of an object topology of the data center that correspond to the performance problem are identified. A recommendation for executing remedial measures to correct the performance problem is generated based on the unexpected metrics.

    METHODS AND SYSTEMS THAT EFFICIENTLY STORE METRIC DATA

    Publication No.: US20210124665A1

    Publication date: 2021-04-29

    Application No.: US17140065

    Application date: 2021-01-02

    Applicant: VMware, Inc.

    Abstract: The current document is directed to methods and systems that collect metric data within computing facilities, including large data centers and cloud-computing facilities. In a described implementation, lower and higher metric-data-value thresholds are used to partition collected metric data into outlying metric data and inlying metric data. The inlying metric data is quantized to compress the inlying metric data and adjacent data points having the same quantized metric-data values are eliminated, to further compress the inlying metric data. The resulting compressed data includes original metric-data representations for outlier data points and compressed metric-data representations for inlier data points, providing accurate restored metric-data values for significant data points when compressed metric data is decompressed.
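
    A rough Python sketch of the threshold-and-quantize scheme described in this abstract follows. The function name, the fixed quantization step, and the choice to restart duplicate elimination after each outlier are assumptions made for illustration; the abstract does not specify them.

```python
def compress_metrics(points, low, high, step):
    """Compress (timestamp, value) points.

    Values outside [low, high] are outliers and are kept unchanged.
    Inliers are rounded to the nearest multiple of `step`, and an inlier is
    dropped when it repeats the quantized value of the point just before it.
    ASSUMPTION: an outlier breaks adjacency, so the next inlier is always kept.
    """
    out = []
    prev_q = None
    for t, v in points:
        if v < low or v > high:
            out.append((t, v, "raw"))
            prev_q = None
        else:
            q = round(v / step) * step
            if q != prev_q:
                out.append((t, q, "quantized"))
            prev_q = q
    return out

points = [(0, 41), (1, 43), (2, 44), (3, 97), (4, 42), (5, 58)]
compressed = compress_metrics(points, low=10, high=90, step=10)
print(compressed)
```

    Here the outlier at timestamp 3 survives with its original value, while the three inliers at timestamps 0 to 2 collapse into a single quantized point.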

    Methods and systems to quantize and compress time series data

    Publication No.: US10713265B2

    Publication date: 2020-07-14

    Application No.: US15627925

    Application date: 2017-06-20

    Applicant: VMware, Inc.

    Abstract: Methods and systems quantize and compress time series data generated by a resource of a distributed computing system. The time series data is partitioned according to a set of quantiles. Quantized time series data is generated from the time series data and the quantiles. The quantized time series data is compressed by deleting sequential duplicate quantized data points from the quantized time series data to obtain compressed time series data. Quantization and compression are performed for different combinations of quantiles. The user may choose to minimize the loss of information due to quantization while selecting a lower bound for the compression rate. Alternatively, the user may choose to maximize the compression rate while placing an upper limit on the loss of information due to quantization. The compressed time series data that satisfies the user-selected optimization conditions may be used to replace the original time series data in the data-storage device.
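
    The quantize-then-deduplicate idea can be sketched in a few lines of Python. This is only an illustration: the quantile values are supplied by hand, each point is mapped to its nearest quantile, and the compression rate is defined as the fraction of points removed. All three choices, and the function names, are assumptions not taken from the abstract.

```python
def quantize(series, quantiles):
    """Replace each value with the nearest quantile value (ASSUMED mapping rule)."""
    return [(t, min(quantiles, key=lambda q: abs(q - v))) for t, v in series]

def compress(quantized):
    """Delete sequential duplicates: keep a point only when its value changes."""
    out = []
    for t, v in quantized:
        if not out or out[-1][1] != v:
            out.append((t, v))
    return out

def compression_rate(original, compressed):
    """Fraction of data points removed (ASSUMED definition)."""
    return 1 - len(compressed) / len(original)

series = [(0, 1.0), (1, 1.2), (2, 0.9), (3, 5.1), (4, 4.8), (5, 1.1)]
quantiles = [1.0, 5.0]

compressed = compress(quantize(series, quantiles))
rate = compression_rate(series, compressed)
print(compressed, rate)
```

    Repeating this for several candidate quantile sets and comparing the resulting rates against a quantization-error measure would mirror the trade-off the abstract describes between information loss and compression rate.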

    METHODS AND SYSTEMS THAT DETECT AND CLASSIFY INCIDENTS AND ANOMOLOUS BEHAVIOR USING METRIC-DATA OBSERVATIONS

    Publication No.: US20200183769A1

    Publication date: 2020-06-11

    Application No.: US16214272

    Application date: 2018-12-10

    Applicant: VMware, Inc.

    Abstract: The current document is directed to methods and systems for detecting the occurrences of abnormal events and operational behaviors within a distributed computing system. The currently described methods and systems continuously collect metric data from various metric-data sources, generate a sequence of metric-data observations, each metric-data observation comprising a set of temporally aligned metric data, and employ principal-component analysis to transform the metric-data observations to facilitate reduction of the dimensionality of the metric-data observations. The currently described methods and systems then employ clustering methods to identify outlying transformed metric-data observations, accordingly label the transformed metric-data observations to generate a training dataset, and then apply one or more of various types of machine-learning techniques to the training dataset in order to generate an abnormal-observation detector that can be used to detect, in real time, abnormal metric-data observations as they are generated within the distributed computing system.

    PROCESSES AND SYSTEMS FOR FORECASTING METRIC DATA AND ANOMALY DETECTION IN A DISTRIBUTED COMPUTING SYSTEM

    Publication No.: US20200065213A1

    Publication date: 2020-02-27

    Application No.: US16250831

    Application date: 2019-01-17

    Applicant: VMware, Inc.

    Abstract: Computational processes and systems are directed to forecasting time series data and detecting anomalously behaving resources of a distributed computing system. Processes and systems comprise off-line and on-line modes that accelerate the forecasting process and the identification of anomalously behaving resources. In the off-line mode, a recurrent neural network (“RNN”) is continuously trained using time series data associated with various resources of the distributed computing system. In the on-line mode, the latest RNN is used to forecast time series data for resources in a forecast time window and confidence bounds are computed over the forecast time window. The forecast time series data characterizes expected resource usage over the forecast time window so that usage of the resource may be adjusted. The confidence bounds may be used to detect anomalously behaving resources. Remedial measures may then be executed to correct problems indicated by the anomalous behavior.

    Methods and systems to identify anomalous behaving components of a distributed computing system

    Publication No.: US10572329B2

    Publication date: 2020-02-25

    Application No.: US15375386

    Application date: 2016-12-12

    Applicant: VMware, Inc.

    Abstract: Methods and systems described herein are directed to identifying anomalously behaving components of a distributed computing system. Methods and systems collect log messages generated by a set of event log sources running in the distributed computing system within an observation time window. Frequencies of various types of event messages generated within the observation time window are determined for each of the log sources. A similarity value is calculated for each pair of event sources. The similarity values are used to identify clusters of similar event sources of the distributed computing system for various management purposes. Components of the distributed computing system that are used to host the event source outliers may be identified as potentially having problems or may be an indication of future problems.
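
    As a minimal illustration of pairwise similarity between event sources, the Python sketch below compares event-type frequency vectors. It assumes cosine similarity as the similarity value and flags a source as an outlier when its average similarity to all other sources falls below a threshold. Neither choice comes from the abstract, and the host names and counts are made up.

```python
import math
from itertools import combinations

def cosine(p, q):
    """Cosine similarity of two event-type frequency vectors (ASSUMED measure)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

def find_outliers(freqs, threshold):
    """Return sources whose mean similarity to every other source is below
    `threshold`. ASSUMPTION: a simple stand-in for cluster-based outlier detection."""
    names = list(freqs)
    sims = {
        frozenset(pair): cosine(freqs[pair[0]], freqs[pair[1]])
        for pair in combinations(names, 2)
    }
    result = []
    for n in names:
        others = [sims[frozenset((n, m))] for m in names if m != n]
        if others and sum(others) / len(others) < threshold:
            result.append(n)
    return result

# Counts of (info, warn, error) messages per source in one observation window.
freqs = {
    "host-a": [90, 8, 2],
    "host-b": [85, 12, 3],
    "host-c": [88, 10, 2],
    "host-d": [5, 15, 80],
}
outliers = find_outliers(freqs, threshold=0.5)
print(outliers)
```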

    Methods and systems to detect and classify changes in a distributed computing system

    Publication No.: US10402253B2

    Publication date: 2019-09-03

    Application No.: US15607944

    Application date: 2017-05-30

    Applicant: VMware, Inc.

    Abstract: Methods and systems are directed to detecting and classifying changes in a distributed computing system. Divergence values are computed from distributions of different types of event messages generated in time intervals of a sliding time window. Each divergence value is a measure of change in types of events generated in each time interval. When a divergence value, or a rate of change in divergence values, exceeds a threshold, the time interval associated with the threshold violation is used to determine a change point in the operation of the distributed computing system. Based on the change point, a start time of the change is determined. The change is classified based on various previously classified change points in the distributed computing system. A recommendation may be generated to address the change based on the classification of the change.
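
    The threshold test on divergence values might look like the Python sketch below. It assumes Jensen-Shannon divergence between the event-type distributions of consecutive time intervals; the abstract does not name the divergence measure, and the function names and sample distributions are hypothetical.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (ASSUMED measure), between 0 and 1."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x == 0 contribute nothing; y > 0 whenever x > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def change_points(dists, threshold):
    """Indices of intervals whose distribution diverges from the previous
    interval by more than `threshold`."""
    return [
        i for i in range(1, len(dists))
        if js_divergence(dists[i - 1], dists[i]) > threshold
    ]

# Event-type distributions for four consecutive time intervals.
dists = [
    [0.70, 0.20, 0.10],
    [0.68, 0.22, 0.10],
    [0.10, 0.20, 0.70],
    [0.12, 0.18, 0.70],
]
changes = change_points(dists, threshold=0.1)
print(changes)
```

    Only the jump between the second and third intervals exceeds the threshold, so that interval would be the candidate change point from which a start time is then determined.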
