METHODS AND SYSTEMS FOR APPLICATION DISCOVERY FROM LOG MESSAGES

    Publication No.: US20250130871A1

    Publication Date: 2025-04-24

    Application No.: US18381520

    Application Date: 2023-10-18

    Applicant: VMware, Inc.

    Abstract: This disclosure is directed to automated computer-implemented methods for application discovery from log messages generated by event sources of applications executing in a cloud infrastructure. The methods are executed by an operations manager that constructs a data frame of probability distributions of event types of the log messages generated by the event sources in a time period. The operations manager executes clustering techniques that are used to form clusters of the probability distributions in the data frame, where each of the clusters corresponds to one of the applications. The operations manager displays the clusters of the probability distributions in a two-dimensional map of applications in a graphical user interface that enables a user to select one of the clusters in the map of applications that corresponds to one of the applications and launch clustering of probability distributions of the user-selected cluster to discover two or more instances of the application.
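
The abstract does not say which clustering technique is used, so the following is only a minimal sketch of the idea. It assumes a hypothetical greedy grouping by total-variation distance; the names `event_type_distribution`, `greedy_clusters`, and `max_distance`, and the sample log data, are illustrative and do not come from the patent.

```python
from collections import Counter


def event_type_distribution(event_types, vocabulary):
    """Relative frequency of each event type for one event source."""
    counts = Counter(event_types)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("event source produced no log messages")
    return [counts[et] / total for et in vocabulary]


def total_variation(p, q):
    """Total-variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def greedy_clusters(distributions, max_distance=0.2):
    """Assumed stand-in for the clustering step: a source joins the first
    cluster whose representative (first member) is close enough."""
    clusters = []
    for name, dist in distributions.items():
        for cluster in clusters:
            representative = distributions[cluster[0]]
            if total_variation(dist, representative) <= max_distance:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Hypothetical event types observed per event source in one time period.
logs = {
    "web-1": ["GET", "GET", "POST", "ERROR"],
    "web-2": ["GET", "GET", "POST", "GET"],
    "db-1": ["QUERY", "QUERY", "COMMIT", "QUERY"],
}
vocabulary = sorted({et for events in logs.values() for et in events})

# The "data frame": one probability distribution per event source.
frame = {src: event_type_distribution(ev, vocabulary) for src, ev in logs.items()}
clusters = greedy_clusters(frame, max_distance=0.3)
print(clusters)
```

In this toy data the two web sources land in one cluster and the database source in another. Re-running the same routine on the members of a single cluster with a smaller `max_distance` would mirror the second step in the abstract, which separates instances of one application.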

    METHODS AND SYSTEMS FOR INCORPORATING USER FEEDBACK IN DISCOVERING AND CORRECTING INCIDENTS IN A DATA CENTER

    Publication No.: US20240419530A1

    Publication Date: 2024-12-19

    Application No.: US18336799

    Application Date: 2023-06-16

    Applicant: VMware, Inc.

    Abstract: Automated computer-implemented methods and systems for discovering incidents occurring with objects running in a data center and executing remedial measures that correct the incidents are described herein. The methods and systems discover clusters of alerts in a stream of alerts triggered by a stream of events occurring with objects in the data center. User feedback is used to identify alerts with related event types in each cluster of alerts that correspond to separate incidents occurring in the data center. The methods and systems compare a set of runtime alerts to each incident to determine one or more similar incidents to the set of runtime alerts. The one or more similar incidents and corresponding remedial measures are displayed in a GUI with each remedial measure selectable to launch an operation that corrects one of the problems represented by the one or more similar incidents.
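
The abstract does not name the similarity measure used to match runtime alerts against known incidents. As a rough sketch only, the comparison step could look like the following, assuming Jaccard similarity over sets of event types; the incident names, event types, remedies, and the `min_similarity` cutoff are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_incidents(runtime_alerts, incidents, min_similarity=0.5):
    """Return (incident name, score) pairs at or above the cutoff, best first."""
    scored = [
        (name, jaccard(runtime_alerts, incident["event_types"]))
        for name, incident in incidents.items()
    ]
    matches = [pair for pair in scored if pair[1] >= min_similarity]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


# Hypothetical incidents, as they might look after user feedback has
# grouped related event types and attached a remedial measure to each.
incidents = {
    "disk-full": {
        "event_types": {"disk_usage_high", "write_failed", "vm_paused"},
        "remedy": "expand datastore",
    },
    "net-partition": {
        "event_types": {"link_down", "heartbeat_lost"},
        "remedy": "fail over uplink",
    },
}

runtime_alerts = {"disk_usage_high", "write_failed"}
matches = similar_incidents(runtime_alerts, incidents)
for name, score in matches:
    print(name, round(score, 2), incidents[name]["remedy"])
```

Here only the `disk-full` incident clears the cutoff, so its remedy is the one a GUI would offer for selection.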

    AUTOMATED METHODS AND SYSTEMS FOR TROUBLESHOOTING AND OPTIMIZING PERFORMANCE OF APPLICATIONS RUNNING IN A DISTRIBUTED COMPUTING SYSTEM

    Publication No.: US20230099001A1

    Publication Date: 2023-03-30

    Application No.: US17490340

    Application Date: 2021-09-30

    Applicant: VMware, Inc.

    Abstract: Automated processes and systems troubleshoot and optimize performance of applications running in distributed computing systems. Automated computer-implemented processes train an inference model for an application based on metrics associated with the application and a key performance indicator (“KPI”) of the application. When a run-time performance problem is detected in run-time KPI values of the KPI, the trained inference model is applied to run-time metrics and run-time KPI values to identify relevant run-time metrics that can be used to identify the root cause of the performance problem. The root cause of the performance problem can be used to generate a recommendation for correcting the performance problem. An alert identifying the root cause of the performance problem and the recommendation for correcting the performance problem are displayed on an interface of a display, thereby enabling correction of the performance problem and optimization of the application.
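
The abstract does not describe the inference model itself. The sketch below therefore swaps in a much simpler stand-in, ranking metrics by the absolute Pearson correlation with the KPI, purely to illustrate what "identifying relevant run-time metrics" might mean. The metric names and values are invented for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_metrics(metrics, kpi):
    """Order metrics by how strongly they move with the KPI (assumed proxy
    for relevance; not the patent's inference model)."""
    scores = {name: abs(pearson(series, kpi)) for name, series in metrics.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical run-time data: response-time KPI and three metrics.
kpi = [100, 110, 120, 200, 210]
metrics = {
    "cpu": [30, 28, 31, 29, 30],
    "db_connections": [10, 11, 12, 20, 21],
    "cache_hits": [50, 50, 50, 50, 50],
}

ranked = rank_metrics(metrics, kpi)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

The top-ranked metric would be the first candidate to inspect when looking for a root cause. Correlation alone does not establish causation, which is one reason a trained model is used in the described method.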

    METHODS AND SYSTEMS FOR INTELLIGENT SAMPLING OF APPLICATION TRACES

    Publication No.: US20220283924A1

    Publication Date: 2022-09-08

    Application No.: US17367490

    Application Date: 2021-07-05

    Applicant: VMware, Inc.

    Abstract: Computer-implemented methods and systems described herein perform intelligent sampling of application traces generated by an application. Computer-implemented methods and systems determine different sampling rates based on frequency of occurrence of trace types and/or frequency of occurrence of durations of the traces. Each sampling rate corresponds to a different trace type and/or different duration. The sampling rates for low frequency trace types and durations are larger than the sampling rates for high frequency trace types and durations. The relatively larger sampling rates for low frequency trace types and low frequency durations ensure that low frequency trace types and low frequency durations are sampled in sufficient numbers and are not passed over during sampling of the application traces. The set of sampled traces is stored in a data storage device.
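
One simple way to make rare trace types receive larger sampling rates is to aim for roughly the same number of samples per type. The sketch below assumes that rule (`rate = min(1, target / count)`), which is an illustration and not necessarily the formula used in the patent; the trace types and the `target_per_type` parameter are hypothetical.

```python
import random
from collections import Counter


def sampling_rates(trace_types, target_per_type=2):
    """Rate per trace type, inversely proportional to its frequency and
    capped at 1.0 (assumed rule)."""
    counts = Counter(trace_types)
    return {t: min(1.0, target_per_type / n) for t, n in counts.items()}


def sample_traces(traces, rates, seed=0):
    """Keep each trace with the probability assigned to its type."""
    rng = random.Random(seed)
    return [t for t in traces if rng.random() < rates[t["type"]]]


# Hypothetical workload: one very common trace type and one rare one.
traces = [{"type": "checkout", "id": i} for i in range(100)]
traces += [{"type": "refund", "id": 100 + i} for i in range(2)]

rates = sampling_rates([t["type"] for t in traces])
sampled = sample_traces(traces, rates)
print(rates)
print(Counter(t["type"] for t in sampled))
```

With these numbers the rare `refund` type is sampled at a rate of 1.0, so none of its traces are passed over, while the common `checkout` type is sampled at 0.02. Durations could be handled the same way by bucketing them and treating each bucket as a type.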

    Methods and systems that estimate a degree of abnormality of a complex system

    Publication No.: US11055382B2

    Publication Date: 2021-07-06

    Application No.: US14701217

    Application Date: 2015-04-30

    Applicant: VMware, Inc.

    Abstract: Methods and systems estimate a degree of abnormality of a complex system based on historical time-series data representative of the complex system's past behavior and use the historical degree of abnormality to determine whether or not a degree of abnormality determined from current time-series data representative of the same complex system's current behavior is worthy of attention. The time-series data may be metric data that represents behavior of a complex system as a result of successive measurements of the complex system made over time or in a time interval. A degree of abnormality represents the amount by which the time-series data violates a threshold. The more the degree of abnormality of the current time-series data exceeds the historical degree of abnormality, the larger the violation of the thresholds and the greater the probability the violation in the current time-series data is worthy of attention.
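
The abstract defines the degree of abnormality only as the amount by which the time-series data violates a threshold. A minimal sketch of that idea follows, assuming a single upper threshold, a degree computed as the summed excess over the threshold, and a baseline taken as the largest historical degree. All three choices, and the sample values, are assumptions made for illustration.

```python
def degree_of_abnormality(values, upper_threshold):
    """Total amount by which the values exceed the threshold (assumed form)."""
    return sum(max(0.0, v - upper_threshold) for v in values)


def is_noteworthy(current_window, historical_windows, upper_threshold):
    """Compare the current degree against the largest historical degree.

    Returns (flag, current_degree, baseline_degree).
    """
    historical = [
        degree_of_abnormality(window, upper_threshold)
        for window in historical_windows
    ]
    baseline = max(historical, default=0.0)
    current = degree_of_abnormality(current_window, upper_threshold)
    return current > baseline, current, baseline


# Hypothetical CPU-usage windows (percent) with an 80% threshold.
history = [[70, 82, 75], [81, 79, 85]]
current = [90, 95, 70]

flag, current_degree, baseline_degree = is_noteworthy(current, history, 80)
print(flag, current_degree, baseline_degree)
```

The historical windows violate the threshold by 2 and 6 units, while the current window violates it by 25, so the current violation is flagged. The gap between the two numbers could also serve as a rough score of how much attention the violation deserves.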

    PROCESSES THAT DETERMINE STATES OF SYSTEMS OF A DISTRIBUTED COMPUTING SYSTEM

    Publication No.: US20200341832A1

    Publication Date: 2020-10-29

    Application No.: US16391702

    Application Date: 2019-04-23

    Applicant: VMware, Inc.

    Abstract: Automated processes and systems that determine a state of a complex computational system of a distributed computing system are described. The processes and systems determine outlier and normal metric values of metrics associated with a complex computational system. A total outlier metric is constructed based on the outlier and normal metric values of the metrics. Time stamps of outlier and normal total outlier metric values of the total outlier metric are labeled. Each time-stamp label identifies a normal or abnormal state of the complex computational system. One or more rules for classifying normal and abnormal states of the complex computational system are computed based on the time-stamp labels. The rules are applied to run-time metric values to determine a state of the complex computational system and generate an alert when the state is abnormal. The type of alert and corresponding abnormal state may be used to execute remedial measures.
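
The first three steps of this pipeline (flag outliers per metric, combine them into a total outlier metric, label each time stamp) can be sketched as follows. The abstract does not say how outliers are detected or combined, so the sketch assumes a mean plus or minus `k` standard deviations test and a simple count of outlying metrics per time stamp; the rule-learning step is not shown, and the metric values are invented.

```python
from statistics import mean, pstdev


def outlier_flags(values, k=2.0):
    """1 where a value lies more than k standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0] * len(values)
    return [1 if abs(v - mu) > k * sigma else 0 for v in values]


def total_outlier_metric(metrics, k=2.0):
    """Number of metrics that are outliers at each time stamp (assumed form)."""
    flags = [outlier_flags(series, k) for series in metrics.values()]
    return [sum(column) for column in zip(*flags)]


def label_states(total, min_outliers=2):
    """Label a time stamp abnormal when enough metrics are outliers at once."""
    return ["abnormal" if t >= min_outliers else "normal" for t in total]


# Hypothetical metrics sampled at eight time stamps.
metrics = {
    "cpu": [10, 11, 10, 12, 11, 10, 11, 60],
    "mem": [40, 41, 40, 42, 41, 40, 41, 95],
    "net": [5, 5, 6, 5, 5, 6, 5, 5],
}

total = total_outlier_metric(metrics)
labels = label_states(total)
print(total)
print(labels)
```

The resulting labels could then be used as training targets for whatever classification rules are derived in the later steps, for example a decision tree over the raw metric values.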
