Methods and systems for abnormality analysis of streamed log data
    Type: Granted invention patent
    Status: In force

    Publication number: US09298538B2

    Publication date: 2016-03-29

    Application number: US13960611

    Application date: 2013-08-06

    Applicant: VMware, Inc.

    CPC classification number: G06F11/079 G06F11/0706 G06F11/0754 G06F2201/86

    Abstract: This disclosure presents systems and methods for run-time analysis of streams of log data for abnormalities using a statistical structure of meta-data associated with the log data. The systems and methods convert a log data stream into meta-data and perform statistical analysis in order to reveal a dominant statistical pattern within the meta-data. The meta-data is represented as a graph with nodes that represent each of the different event types, which are detected in the stream along with event sources associated with the events. The systems and methods use real-time analysis to compare a portion of a current log data stream collected in an operational window with historically collected meta-data represented by a graph in order to determine the degree of abnormality of the current log data stream collected in the operational window.
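
The comparison described in this abstract, between an operational window and historically collected meta-data, can be illustrated with a minimal sketch. It assumes each log event has already been reduced to an (event source, event type) pair, and it scores abnormality with the Jensen-Shannon divergence between the two frequency distributions. The divergence measure and the names `to_distribution` and `abnormality_degree` are illustrative assumptions, not the patent's method, and the graph structure is simplified here to node frequencies.

```python
import math
from collections import Counter


def to_distribution(events):
    """Turn a list of (source, event_type) pairs into relative frequencies."""
    counts = Counter(events)
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


def abnormality_degree(historical_events, window_events):
    """Jensen-Shannon divergence (base 2) between history and window.

    Assumption: this divergence stands in for the patent's statistical
    comparison. 0.0 means identical distributions, 1.0 means no overlap.
    """
    p = to_distribution(historical_events)
    q = to_distribution(window_events)
    keys = set(p) | set(q)
    # Mixture distribution used by the Jensen-Shannon definition.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        return sum(a[k] * math.log2(a[k] / m[k]) for k in keys if a.get(k, 0.0) > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


history = [("host1", "login")] * 8 + [("host2", "disk_error")] * 2
normal_window = [("host1", "login")] * 4 + [("host2", "disk_error")]
odd_window = [("host3", "kernel_panic")] * 5

print(abnormality_degree(history, normal_window))
print(abnormality_degree(history, odd_window))
```

A window whose mix of event types matches history scores near 0, while a window made up entirely of unseen event types scores 1.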


    METHODS AND SYSTEMS FOR RESOLVING ROOT CAUSES OF PERFORMANCE PROBLEMS WITH APPLICATIONS EXECUTING IN A DATA CENTER

    Publication number: US20240020191A1

    Publication date: 2024-01-18

    Application number: US17864220

    Application date: 2022-07-13

    Applicant: VMware, Inc.

    CPC classification number: G06F11/079 G06F11/3495 G06F11/0721

    Abstract: Automated methods and systems for resolving potential root causes of performance problems with applications executing in a data center are described. The automated methods use machine learning to train an inference model that relates event types recorded in metrics, log messages, and traces of an application to values of a key performance indicator (“KPI”) of the application. The methods use the trained inference model to determine which of the event types are important event types that relate to performance of the application. In response to detecting a run-time performance problem in the KPI, the methods determine which of the important event types has a higher probability of being the potential root cause of the performance problem. A graphical user interface displays an alert that identifies the application as having the run-time performance problem, the identity of the important event types, and at least one recommendation for remedying the performance problem.

    METHODS AND SYSTEMS FOR REDUCING THE STORAGE VOLUME OF LOG MESSAGES

    Publication number: US20230222100A1

    Publication date: 2023-07-13

    Application number: US17573539

    Application date: 2022-01-11

    Applicant: VMware, Inc.

    Abstract: Automated methods and systems for compressing log messages stored in a log message database are described herein. The automated methods and systems perform lossy compression of an original set of log messages by identifying log messages that represent each of the various types of events recorded in the original set. The log messages in the original set are overwritten by corresponding representative log messages. Source coding is used to construct a source coding scheme and variable length binary codewords for each of the representative log messages. The representative log messages are replaced by the codewords, which occupy significantly less storage space than the original set. The lossy compressed set of log messages can be decompressed to obtain the representative log messages using the source coding scheme.
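
The two steps in this abstract, overwriting each message with a representative of its event type and then assigning variable-length codewords, can be sketched as follows. Huffman coding is used as one possible source coding scheme, and event types are approximated by masking digit runs. Both choices, along with the names `event_type`, `build_codebook`, `compress`, and `decompress`, are assumptions made for illustration and are not taken from the patent.

```python
import heapq
import re
from collections import Counter
from itertools import count


def event_type(message):
    """Hypothetical event-type key: the message with digit runs masked."""
    return re.sub(r"\d+", "<num>", message)


def build_codebook(symbols):
    """Huffman codebook mapping each distinct symbol to a bit string."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(n, next(tie), {s: ""}) for s, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, codes1 = heapq.heappop(heap)
        n2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (n1 + n2, next(tie), merged))
    return heap[0][2]


def compress(messages):
    """Lossy step: keep the first message seen per event type, then encode."""
    representatives = {}
    for message in messages:
        representatives.setdefault(event_type(message), message)
    replaced = [representatives[event_type(m)] for m in messages]
    codebook = build_codebook(replaced)
    return "".join(codebook[r] for r in replaced), codebook


def decompress(bits, codebook):
    """Recover the representative messages from the bit string."""
    decode = {code: symbol for symbol, code in codebook.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in decode:  # prefix-free codes make this unambiguous
            out.append(decode[current])
            current = ""
    return out


logs = ["user 17 logged in", "user 42 logged in", "disk 3 failed", "user 9 logged in"]
bits, codebook = compress(logs)
print(bits, decompress(bits, codebook))
```

Decompression returns the representative messages rather than the originals, which is the lossy part: the user IDs 42 and 9 are not recoverable in this sketch.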

    Methods and systems for troubleshooting anomalous behavior in a data center

    Publication number: US11184219B2

    Publication date: 2021-11-23

    Application number: US16742239

    Application date: 2020-01-14

    Applicant: VMware, Inc.

    Abstract: Methods and systems described herein are directed to troubleshooting anomalous behavior in a data center. Anomalous behavior in an object of a data center, such as a computational resource, an application, or a virtual machine (“VM”), may be related to the behavior of other objects at different hierarchies of the data center. Methods and systems provide a graphical user interface that enables a user to select a metric associated with an object of the data center experiencing a performance problem. Unexpected metrics of an object topology of the data center that correspond to the performance problem are identified. A recommendation for executing remedial measures to correct the performance problem is generated based on the unexpected metrics.

    Method and subsystem that collects, stores, and monitors population metric data within a computer system

    Publication number: US11050624B2

    Publication date: 2021-06-29

    Application number: US15195728

    Application date: 2016-06-28

    Applicant: VMware, Inc.

    Abstract: The current document is directed to methods and subsystems within computing systems, including distributed computing systems, that collect, store, process, and analyze population metrics for types and classes of system components, including components of distributed applications executing within containers, virtual machines, and other execution environments. In a described implementation, a graph-like representation of the configuration and state of a computer system includes aggregation nodes that collect metric data for a set of multiple object nodes and that collect metric data that represents the members of the set over a monitoring time interval. Population metrics are monitored, in certain implementations, to detect outlier members of an aggregation.
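
Detecting outlier members of an aggregation, as mentioned at the end of this abstract, can be illustrated with a short sketch. It assumes the aggregation node holds one metric value per member for the monitoring interval and flags members by a modified z-score based on the median absolute deviation. The scoring rule, the 3.5 threshold, and the name `find_outliers` are illustrative assumptions. The abstract does not say which outlier test is used.

```python
from statistics import median


def find_outliers(member_metrics, threshold=3.5):
    """Return the names of members whose metric deviates from the population.

    member_metrics maps a member name to its metric value for one interval.
    Assumption: a modified z-score (0.6745 * |x - median| / MAD) above
    `threshold` marks an outlier.
    """
    if not member_metrics:
        return []
    values = list(member_metrics.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half the members share the median value, so any
        # member that differs from it has an unbounded score.
        return [name for name, v in member_metrics.items() if v != med]
    return [
        name
        for name, v in member_metrics.items()
        if 0.6745 * abs(v - med) / mad > threshold
    ]


cpu_usage = {"vm-1": 41.0, "vm-2": 39.5, "vm-3": 40.2, "vm-4": 42.1, "vm-5": 97.3}
print(find_outliers(cpu_usage))
```

Median-based statistics are used here because a single extreme member barely shifts them, whereas it can inflate a mean and standard deviation enough to hide itself.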

    Processes and systems for forecasting metric data and anomaly detection in a distributed computing system

    Publication number: US11023353B2

    Publication date: 2021-06-01

    Application number: US16250831

    Application date: 2019-01-17

    Applicant: VMware, Inc.

    Abstract: Computational processes and systems are directed to forecasting time series data and detecting anomalously behaving resources of a distributed computing system. Processes and systems comprise off-line and on-line modes that accelerate the forecasting process and the identification of anomalously behaving resources. In the off-line mode, a recurrent neural network (“RNN”) is continuously trained using time series data associated with various resources of the distributed computing system. In the on-line mode, the latest RNN is used to forecast time series data for resources in a forecast time window and confidence bounds are computed over the forecast time window. The forecast time series data characterizes expected resource usage over the forecast time window so that usage of the resource may be adjusted. The confidence bounds may be used to detect anomalously behaving resources. Remedial measures may then be executed to correct problems indicated by the anomalous behavior.

    Methods and systems that detect and classify incidents and anomalous behavior using metric-data observations

    Publication number: US10997009B2

    Publication date: 2021-05-04

    Application number: US16214272

    Application date: 2018-12-10

    Applicant: VMware, Inc.

    Abstract: The current document is directed to methods and systems for detecting the occurrences of abnormal events and operational behaviors within a distributed computing system. The currently described methods and systems continuously collect metric data from various metric-data sources, generate a sequence of metric-data observations, each metric-data observation comprising a set of temporally aligned metric data, and employ principal-component analysis to transform the metric-data observations to facilitate reduction of the dimensionality of the metric-data observations. The currently described methods and systems then employ clustering methods to identify outlying transformed metric-data observations, accordingly label the transformed metric-data observations to generate a training dataset, and then apply one or more of various types of machine-learning techniques to the training dataset in order to generate an abnormal-observation detector that can be used to detect, in real time, abnormal metric-data observations as they are generated within the distributed computing system.
